CN116385465A

CN116385465A - Image segmentation model construction and image segmentation method, system, equipment and medium

Info

Publication number: CN116385465A
Application number: CN202310391080.7A
Authority: CN
Inventors: 黄梦珂; 杨家荣; 刘文奇; 毛晴; 徐胤; 叶松霖; 陈怡然
Original assignee: Shanghai Electric Group Corp
Current assignee: Shanghai Electric Group Corp
Priority date: 2023-04-11
Filing date: 2023-04-11
Publication date: 2023-07-04

Abstract

The invention discloses a method, a system, equipment and a medium for constructing an image segmentation model and for segmenting images. The construction method comprises: acquiring a plurality of initial infrared images; for any initial infrared image, acquiring position association information corresponding to the operation position of a first selection operation on the initial infrared image, and the selected initial segmentation image; generating a position information image corresponding to the initial infrared image based on the position association information; acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image; and taking each sample input image as input and the initial segmentation image corresponding to each initial infrared image as output, constructing the image segmentation model. With the image segmentation model, a preset object in an infrared image can be segmented simply by selecting any point on that object in the target scene through the first selection operation, which greatly improves labeling efficiency and labeling accuracy.

Description

Image segmentation model construction and image segmentation method, system, equipment and medium

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to a method, a system, an apparatus, and a medium for constructing an image segmentation model and segmenting an image.

Background

With the continuous development of solar technology and the strategic deployment of energy resources, more and more photovoltaic power stations are being put into use. During installation and operation, solar panels (also called photovoltaic panels) are exposed for long periods to harsh outdoor climates and are inevitably affected by various factors, so that they may work abnormally or fail seriously. Daily inspection of the photovoltaic power station, with timely reporting of any condition that may affect the normal operation of the photovoltaic panels, has therefore become an important part of the daily operation and maintenance of large photovoltaic power stations.

Capturing images of the photovoltaic panels in a photovoltaic power station, segmenting those images using deep-learning-based machine vision, and monitoring the running state of the photovoltaic panels from the segmented images has gradually become a promising direction in the field of image processing.

However, current deep-learning-based machine vision techniques generally require a large amount of manually labeled data to train a model, and labeling a large number of images, especially labeling them manually at the pixel level, consumes a great deal of time and labor.

Disclosure of Invention

The invention aims to overcome the defect in the prior art that labeling a large number of infrared images consumes a great deal of time and labor, and provides a method, a system, equipment and a medium for constructing an image segmentation model and for segmenting images.

The invention solves the above technical problem through the following technical solutions:

In a first aspect, a method for constructing an image segmentation model is provided, where the method includes:

acquiring a plurality of initial infrared images;

for any initial infrared image, acquiring position association information corresponding to an operation position of a first selection operation on the initial infrared image and a selected initial segmentation image; wherein the initial segmentation image comprises a preset object in a target scene;

generating a position information image corresponding to the initial infrared image based on the position association information; wherein the relative position of the operation position is the same in the initial infrared image and in the position information image;

acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image;

and taking each sample input image as input and the initial segmentation image corresponding to each initial infrared image as output, constructing the image segmentation model.

Preferably, the step of generating the position information image corresponding to the initial infrared image based on the position-related information includes:

generating, by means of a Gaussian function and based on the position association information and the initial infrared image, a Gaussian image with the same resolution as the initial infrared image.

Preferably, after the step of generating the gaussian image having the same resolution corresponding to the initial infrared image, before the step of acquiring the sample input image corresponding to each initial infrared image, the method further comprises:

scaling the initial infrared image;

performing position matching processing on the Gaussian image corresponding to the initial infrared image, so that the resolution of the Gaussian image is the same as that of the initial infrared image after scaling processing, and the relative positions of the operation positions in the initial infrared image and the Gaussian image are the same;

and/or, the first selection operation includes a user external operation, and the step of acquiring, for any initial infrared image, position association information corresponding to an operation position of the first selection operation on the initial infrared image and the selected initial segmentation image includes:

for any initial infrared image, acquiring position association information corresponding to an operation position of the user external operation on the initial infrared image;

determining, based on the position association information, an object area where the preset object corresponding to the user external operation is located;

and taking, as the corresponding initial segmentation image, an image in which the object area in the initial infrared image serves as the foreground area.

Preferably, the step of acquiring a plurality of initial infrared images includes:

acquiring an initial infrared video corresponding to the target scene;

performing frame extraction operation on the initial infrared video to obtain a plurality of corresponding initial infrared images;

and/or the target scene is a photovoltaic power station, and the preset object is a photovoltaic panel;

and/or, the step of acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image includes:

performing a channel concatenation operation on each initial infrared image and the corresponding position information image to obtain the sample input image corresponding to each initial infrared image;

and/or, the step of constructing the image segmentation model by taking each sample input image as input and taking an initial segmentation image corresponding to each initial infrared image as output comprises the following steps of:

and taking each sample input image as input and the initial segmentation image corresponding to each initial infrared image as output, constructing the image segmentation model based on a preset HRNet (High-Resolution Network) model and an atrous spatial pyramid pooling (ASPP) module.

Preferably, the step of constructing the image segmentation model based on the preset HRNet model and the atrous spatial pyramid pooling module, taking each sample input image as input and the initial segmentation image corresponding to each initial infrared image as output, includes:

inputting the sample input images into the preset HRNet model, and extracting multi-scale feature data corresponding to each sample input image, wherein the multi-scale feature data represent feature data with different resolutions and multiple dimensions;

unifying the feature data with different resolutions in the multi-scale feature data to the same preset resolution, and inputting them into the atrous spatial pyramid pooling module for feature extraction and feature fusion to obtain image feature data, so as to construct the image segmentation model;

wherein the scale of the image feature data is larger than that of the multi-scale feature data, and the image segmentation model outputs the initial segmentation image corresponding to each sample input image.

In a second aspect, there is also provided an image segmentation method, the image segmentation method comprising:

acquiring a current infrared image;

acquiring target position associated information corresponding to a target operation position of target selection operation on the current infrared image;

generating a target position information image corresponding to the current infrared image based on the target position association information; the relative positions of the target operation position in the current infrared image and the target position information image are the same;

acquiring a target input image corresponding to the current infrared image based on the current infrared image and the corresponding target position information image;

and inputting the target input image into the image segmentation model, and outputting a target segmentation image corresponding to the target input image.

In a third aspect, there is also provided a construction system of an image segmentation model, the construction system comprising:

the first image acquisition module is used for acquiring a plurality of initial infrared images;

the second image acquisition module is used for acquiring position association information corresponding to the operation position of the first selection operation on any initial infrared image and the selected initial segmentation image; wherein the initial segmentation image comprises a preset object in a target scene;

the third image acquisition module is used for generating a position information image corresponding to the initial infrared image based on the position association information; wherein the relative position of the operation position is the same in the initial infrared image and in the position information image;

the first input acquisition module is used for acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image;

the model construction module is used for taking each sample input image as input and taking an initial segmentation image corresponding to each initial infrared image as output to construct and obtain the image segmentation model.

Preferably, the third image acquisition module is specifically configured to generate, by means of a Gaussian function and based on the position association information and the initial infrared image, a Gaussian image with the same resolution as the initial infrared image.

Preferably, the building system further comprises:

the scaling module is used for scaling the initial infrared image;

the matching module is used for performing position matching processing on the Gaussian image corresponding to the initial infrared image, so that the resolution of the Gaussian image is the same as that of the initial infrared image after scaling processing and the relative positions of the operation positions in the initial infrared image and the Gaussian image are the same;

and/or, the first selection operation includes a user external operation, and the second image acquisition module includes:

the position information acquisition unit is used for acquiring position association information corresponding to an operation position of the user external operation on any initial infrared image;

an object region determining unit, configured to determine an object region in which a preset object corresponding to the external operation of the user is located, based on the location association information;

and the segmentation unit is used for taking an image taking the object area in the initial infrared image as a foreground area as the corresponding initial segmentation image.

Preferably, the first image acquisition module includes:

the video acquisition unit is used for acquiring an initial infrared video corresponding to the target scene;

and the image acquisition unit is used for performing a frame extraction operation on the initial infrared video to obtain a plurality of corresponding initial infrared images;

and/or the first input acquisition module is specifically configured to perform a channel concatenation operation on each initial infrared image and the corresponding position information image, so as to obtain a sample input image corresponding to each initial infrared image;

and/or the model construction module is specifically configured to take each sample input image as input, take the initial segmentation image corresponding to each initial infrared image as output, and construct the image segmentation model based on a preset HRNet model and an atrous spatial pyramid pooling module.

In an alternative embodiment, the model construction module includes:

the feature extraction unit is used for inputting the sample input images into the preset HRNet model and extracting multi-scale feature data corresponding to each sample input image, wherein the multi-scale feature data represent feature data with different resolutions and multiple dimensions;

the feature fusion unit is used for unifying the feature data with different resolutions in the multi-scale feature data to the same preset resolution, and inputting them into the atrous spatial pyramid pooling module for feature extraction and feature fusion to obtain image feature data, so as to construct the image segmentation model.

In a fourth aspect, there is also provided an image segmentation system, comprising:

the fourth image acquisition module is used for acquiring a current infrared image;

the position acquisition module is used for acquiring target position associated information corresponding to a target operation position of target selection operation on the current infrared image;

a fifth image acquisition module, configured to generate a target position information image corresponding to the current infrared image based on the target position association information; the relative positions of the target operation position in the current infrared image and the target position information image are the same;

the second input acquisition module is used for acquiring a target input image corresponding to the current infrared image based on the current infrared image and the corresponding target position information image;

and the segmented image output module is used for inputting the target input image into the image segmentation model and outputting a target segmented image corresponding to the target input image.

In a fifth aspect, there is also provided an electronic device, including a memory, a processor, and a computer program stored on the memory and configured to run on the processor, wherein the processor, when executing the computer program, implements the above-described method for constructing an image segmentation model or the above-described image segmentation method.

In a sixth aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described image segmentation model construction method, or the above-described image segmentation method.

On the basis of conforming to common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain preferred embodiments of the invention.

The beneficial effects of the invention are as follows:

According to the method, system, equipment and medium for constructing an image segmentation model and for segmenting images, the constructed image segmentation model assists data annotators in annotating the initial infrared images. A position information image corresponding to the initial infrared image is generated from the position association information corresponding to the operation position of the first selection operation on the initial infrared image, and the relative position of the operation position is the same in the initial infrared image and in the position information image. With the image segmentation model, the segmentation image selected by the first selection operation is obtained merely by selecting any point on the preset object in the target scene through the first selection operation. Because that segmentation image contains the preset object in the target scene, the preset object in the infrared image can be segmented. This greatly improves labeling efficiency and labeling accuracy, which in turn improves the training precision and training efficiency of the image segmentation model and thereby the precision of image segmentation processing.

Drawings

Fig. 1 is a first flowchart of the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 2 is a second flowchart of the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 3 is a third flowchart of the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 4 is a fourth flowchart of the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 5 is a schematic diagram of the processing procedure of the image segmentation model in the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 6 is a schematic diagram of the meaning of the icons used in the processing procedure of the image segmentation model in the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 7 is a schematic diagram of an initial infrared image and the Gaussian image corresponding to the initial infrared image in the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 8 is a schematic diagram of a sample input image corresponding to an initial infrared image in the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 9 is an initial segmentation image corresponding to an initial infrared image in the method for constructing an image segmentation model according to embodiment 1 of the present invention;

Fig. 10 is a flowchart of the image segmentation method according to embodiment 2 of the present invention;

Fig. 11 is a schematic structural diagram of the system for constructing an image segmentation model according to embodiment 3 of the present invention;

Fig. 12 is a schematic structural diagram of the image segmentation system according to embodiment 4 of the present invention;

Fig. 13 is a schematic structural diagram of the electronic device according to embodiment 5 of the present invention.

Detailed Description

The invention is further illustrated by means of the following examples, which are not intended to limit the scope of the invention.

Example 1

The embodiment provides a method for constructing an image segmentation model, as shown in fig. 1, the method includes:

S101, acquiring a plurality of initial infrared images.

S102, for any initial infrared image, acquiring position association information corresponding to an operation position of a first selection operation on the initial infrared image and the selected initial segmentation image.

The initial segmentation image comprises a preset object in a target scene.

Through the first selection operation, an operation position is selected on the initial infrared image; the position information of that operation position on the initial infrared image is the position association information. The position association information may also be referred to as interactive position information.

The initial infrared image corresponds to a target scene in which a plurality of objects may exist; the preset object is the object selected by the first selection operation, which is also the object to be segmented.

For example, the target scene is a photovoltaic power plant and the preset object is a photovoltaic panel in the photovoltaic power plant.

S103, generating a position information image corresponding to the initial infrared image based on the position association information.

Wherein the relative positions of the operation positions in the initial infrared image and the position information image are the same.

S104, acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image.

S105, taking each sample input image as input, taking an initial segmentation image corresponding to each initial infrared image as output, and constructing to obtain an image segmentation model.

In the existing image segmentation process, a data labeling person needs to manually perform multiple clicking operations on an image to be segmented to finish the labeling process, so that a certain object in the image to be segmented is segmented, and a segmented image corresponding to the image to be segmented is obtained.

The image segmentation model of the present invention may also be referred to as an interactive segmentation model.

According to the method for constructing the image segmentation model, the constructed image segmentation model assists a data annotator in annotating the initial infrared image. A position information image corresponding to the initial infrared image is generated from the position association information corresponding to the operation position of the first selection operation on the initial infrared image, and the relative position of the operation position is the same in the initial infrared image and in the position information image. With the image segmentation model, the segmentation image selected by the first selection operation is obtained merely by selecting any point on the preset object in the target scene through the first selection operation. Because that segmentation image contains the preset object in the target scene, the preset object in the infrared image can be segmented. This greatly improves labeling efficiency and labeling accuracy, which in turn improves the training precision and training efficiency of the image segmentation model and thereby the precision of image segmentation processing.
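Step S104 can be pictured with a short sketch. It assumes the channel concatenation option named in the disclosure, NumPy arrays in height x width x channel layout, and a three-channel infrared image; the function name `build_sample_input`, the array layout and the channel count are illustrative assumptions, not details given by the invention.

```python
import numpy as np

def build_sample_input(infrared, position_map):
    """Stack the position information image onto the infrared image as an extra channel.

    infrared:     (H, W) or (H, W, C) array (layout is an assumption).
    position_map: (H, W) array with the same resolution as the infrared image.
    """
    if infrared.ndim == 2:
        infrared = infrared[..., None]  # treat a single-channel image as (H, W, 1)
    if infrared.shape[:2] != position_map.shape:
        raise ValueError("position information image must match the infrared image resolution")
    return np.concatenate(
        [infrared.astype(np.float32), position_map[..., None].astype(np.float32)],
        axis=-1,
    )

# Hypothetical data: a blank 3-channel image and a map marking one operation position.
infrared = np.zeros((512, 512, 3), dtype=np.uint8)
position_map = np.zeros((512, 512), dtype=np.float32)
position_map[120, 300] = 1.0
sample_input = build_sample_input(infrared, position_map)
```

Because both arrays share one pixel grid, the operation position keeps the same relative position in every channel of the sample input image.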

In an alternative embodiment, as shown in fig. 2, the step S103 includes:

S1031, generating, by means of a Gaussian function and based on the position association information and the initial infrared image, a Gaussian image with the same resolution as the initial infrared image.

The operation position of the first selection operation on the initial infrared image is recorded, and the two-dimensional position information corresponding to the operation position, namely the position association information, is acquired. The position association information is blurred with a two-dimensional Gaussian function to generate a Gaussian image with the same resolution as the initial infrared image, and the relative position of the operation position is the same in the initial infrared image and in the Gaussian image.

According to the method for constructing the image segmentation model, a Gaussian function is used to obtain a Gaussian image with the same resolution as the initial infrared image, and this Gaussian image serves as the position information image corresponding to the initial infrared image. This guarantees that the relative position of the operation position is the same in the initial infrared image and in the Gaussian image, so that the operation position is recorded and labeled accurately and the accuracy of the sample input image is improved.
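As a minimal sketch of step S1031, the Gaussian image can be produced by evaluating an isotropic two-dimensional Gaussian centered on the operation position over the pixel grid of the initial infrared image. The function name `gaussian_position_image`, the `(x, y)` click convention and the standard deviation `sigma=10.0` are assumptions made for illustration; the disclosure does not specify these values.

```python
import numpy as np

def gaussian_position_image(height, width, click_xy, sigma=10.0):
    """Return a (height, width) float32 map that peaks at the clicked pixel.

    click_xy is (x, y) in pixel coordinates; sigma is an assumed spread in pixels.
    """
    cx, cy = click_xy
    ys, xs = np.mgrid[0:height, 0:width]           # row and column index grids
    sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2      # squared distance to the click
    return np.exp(-sq_dist / (2.0 * sigma ** 2)).astype(np.float32)

heatmap = gaussian_position_image(512, 512, click_xy=(300, 120))
peak_row, peak_col = np.unravel_index(heatmap.argmax(), heatmap.shape)
```

The map has the same resolution as the image it was built for, and its maximum sits at row 120, column 300, i.e. at the operation position.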

In an alternative embodiment, as shown in fig. 3, after step S1031, before step S104, the method further includes:

S1032, scaling the initial infrared image.

S1033, performing position matching processing on the Gaussian image corresponding to the initial infrared image, so that the Gaussian image and the zoomed initial infrared image have the same resolution and the relative positions of the operation positions in the initial infrared image and the Gaussian image are the same.

The resolution of the initial infrared image is generally high and can reach tens of millions of pixels. Such a large pixel count raises the training cost of the model, slows training, and may reduce the accuracy of the model, so the initial infrared image needs to be scaled to reduce its resolution to the order of hundreds of pixels per side. For example, an initial infrared image of 3024 x 4032 pixels is reduced to 512 x 512 pixels; at the same time, the Gaussian image corresponding to the 3024 x 4032 initial infrared image undergoes position matching processing, so that it is also reduced to 512 x 512 pixels and the relative position of the operation position is the same in the reduced initial infrared image and in the Gaussian image.

Similarly, if the resolution of the initial infrared image is low, the initial infrared image can be enlarged, while ensuring that the resolution of the enlarged initial infrared image is the same as that of the corresponding Gaussian image and that the relative positions of the operation positions in the initial infrared image and the Gaussian image are the same.

When the initial infrared image is scaled, the scaling range may be [0.5, 2.0]. Scaling is one form of data enhancement; other data enhancement methods can also be applied to the initial infrared image, but the Gaussian image corresponding to the initial infrared image must then undergo the corresponding position matching processing, so that the resolution of the Gaussian image is the same as that of the processed initial infrared image and the relative positions of the operation positions in the initial infrared image and the Gaussian image are the same, thereby improving the robustness of the image segmentation model.

In addition, a normalization operation can be performed on the initial infrared image to help the image segmentation model converge.
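One way to read the position matching processing is that the Gaussian image passes through the same geometric transform as the initial infrared image. The sketch below assumes nearest-neighbor resampling and min-max normalization, neither of which is specified in the disclosure, and uses a small 378 x 504 stand-in with the same aspect ratio as the 3024 x 4032 example to keep it light. The helper names are hypothetical.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of an (H, W) or (H, W, C) array."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return image[rows[:, None], cols]

def match_click(click_xy, in_hw, out_hw):
    """Map an (x, y) operation position from the original grid to the resized grid."""
    (x, y), (in_h, in_w), (out_h, out_w) = click_xy, in_hw, out_hw
    return (x * out_w / in_w, y * out_h / in_h)

def normalize(image):
    """Min-max normalize to [0, 1] (an assumed normalization scheme)."""
    image = image.astype(np.float32)
    span = image.max() - image.min()
    return (image - image.min()) / span if span > 0 else np.zeros_like(image)

# Stand-in data: random "infrared" image and a Gaussian map peaking at x=252, y=189.
infrared = np.random.default_rng(0).integers(0, 256, size=(378, 504)).astype(np.uint8)
ys, xs = np.mgrid[0:378, 0:504]
position_map = np.exp(-((xs - 252) ** 2 + (ys - 189) ** 2) / (2.0 * 10.0 ** 2))

small_image = normalize(resize_nearest(infrared, 64, 64))
small_map = resize_nearest(position_map, 64, 64)   # same transform as the image
small_click = match_click((252, 189), (378, 504), (64, 64))
```

After the transform both arrays are 64 x 64, and the peak of the resized map coincides with the rescaled operation position, so the relative position is preserved.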

In an alternative embodiment, the first selection operation includes an external operation of the user, as shown in fig. 4, and step S102 includes:

S1021, for any initial infrared image, acquiring position association information corresponding to an operation position of the user external operation on the initial infrared image.

S1022, determining an object area where a preset object corresponding to the external operation of the user is located based on the position association information.

S1023, taking, as the corresponding initial segmentation image, an image in which the object area in the initial infrared image serves as the foreground area.

The user external operation can be a click operation performed by the user on the initial infrared image. The click operation corresponds to an operation position on the initial infrared image, the operation position points to a preset object, and the preset object occupies an object area in the initial infrared image. The object area is taken as the foreground area, and the remaining objects in the initial infrared image are taken as the background area. The initial infrared image is manually labeled, and the labeling content includes the pixel-level position of each preset object, so that the preset object is segmented from the initial infrared image and the initial segmentation image corresponding to the initial infrared image is obtained.

According to the method for constructing the image segmentation model, position association information corresponding to the operation position of the user external operation on the initial infrared image is acquired, and, according to the position association information, an image in which the object area in the initial infrared image serves as the foreground area is taken as the corresponding initial segmentation image. With the image segmentation model, merely by selecting any point on the preset object in the target scene through the user external operation, the object area corresponding to the preset object can be taken as the foreground area and the preset object segmented from the infrared image. This greatly improves labeling efficiency and labeling accuracy, which in turn improves the training precision and training efficiency of the image segmentation model and thereby the precision of image segmentation processing.
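Steps S1021 to S1023 can be sketched under the assumption that the manual pixel-level labels are stored as an integer map in which 0 is background and each preset object carries its own positive identifier. That storage format, the function name `initial_segmentation` and the `(x, y)` click convention are illustrative assumptions only.

```python
import numpy as np

def initial_segmentation(instance_labels, click_xy):
    """Return a binary mask whose foreground is the object area under the click.

    instance_labels: (H, W) integer map, 0 = background (assumed label format).
    click_xy:        (x, y) operation position in pixel coordinates.
    """
    x, y = click_xy
    target_id = instance_labels[y, x]          # object the operation position points to
    if target_id == 0:
        raise ValueError("operation position does not fall on a preset object")
    return (instance_labels == target_id).astype(np.uint8)

# Hypothetical 6 x 6 label map with two panels.
labels = np.zeros((6, 6), dtype=np.int64)
labels[1:3, 1:3] = 1                           # panel 1: 2 x 2 pixels
labels[3:5, 2:6] = 2                           # panel 2: 2 x 4 pixels
mask = initial_segmentation(labels, click_xy=(4, 3))
```

Only the clicked panel becomes foreground; the other panel and everything else in the image fall into the background area.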

In an alternative embodiment, step S101 includes:

S1011, acquiring an initial infrared video corresponding to the target scene.

S1012, performing frame extraction operation on the initial infrared video to obtain a plurality of corresponding initial infrared images.

In an alternative embodiment, the target scene is a photovoltaic power station and the preset object is a photovoltaic panel. The corresponding region of the photovoltaic panel is the object region, and the image with the object region of the initial infrared image in which the photovoltaic panel is positioned as the foreground region is used as the corresponding initial segmentation image.

The rapid development of unmanned aerial vehicle technology has promoted the inspection mode of photovoltaic power plant at present to change from the manual inspection that consumes time and energy to automatic technical direction that examines. In the automatic inspection process, due to the study of the machine vision technology based on deep learning and the photovoltaic hot spot fault principle in the artificial intelligence field, the unmanned aerial vehicle with the infrared camera is utilized to obtain the infrared image of the photovoltaic cell panel in an aerial way, and the infrared image is automatically detected, so that the efficiency of the photovoltaic power station is improved, the cost is reduced, and the degree of influence of external environment and human factors on the inspection process is reduced.

Compared with the natural light image data inspected by an unmanned plane, the detection technology based on the thermal infrared image is not required to use a light source in an external environment, can adapt to the complex environment where a photovoltaic power station is usually located, can work for 24 hours in all weather, and is greatly helpful for improving the detection precision of automatic inspection.

The UAV carries an infrared camera and, during aerial photography, shoots the photovoltaic panels at a vertically downward angle through a stabilized gimbal. In an actual operation and maintenance scenario, flight route information is loaded as required, the UAV flies parallel to the row direction of the photovoltaic array, and the optical axis of the onboard infrared thermal imager is kept perpendicular to the ground, looking straight down, so as to collect the initial infrared video of the photovoltaic panels. Route planning should ensure that all photovoltaic panels in the power station can be completely captured in a single flight.

During shooting, the flight speed should be kept below 1 m/s as far as possible to prevent smearing in the initial infrared images, and the vertical distance between the UAV and the equipment is about 5-10 m. After zooming, the photovoltaic panels should fill as much of the frame as possible, and at least one complete photovoltaic panel should be captured. The initial infrared video is transmitted in real time to a ground workstation using the 5G (5th-generation mobile communication technology) transmission capability of the UAV product, where it is available for retrieval. The workstation also extracts and stores video frames from the incoming initial infrared video in real time to obtain a plurality of initial infrared images. The usual frame extraction rate is 1 frame per second, and other frame extraction intervals can be set.
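The frame extraction step can be illustrated with a minimal sketch that only selects which frame indices to keep. The function name `frame_indices` and the example frame rate of 30 frames per second are assumptions for illustration; the text above specifies only the default extraction rate of 1 frame per second.

```python
from typing import List


def frame_indices(total_frames: int, fps: float, interval_s: float = 1.0) -> List[int]:
    """Return the indices of the frames kept when one frame is taken every `interval_s` seconds."""
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    # Number of video frames between two consecutive kept frames (at least 1).
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


# Hypothetical 3-second clip recorded at 30 frames per second.
print(frame_indices(total_frames=90, fps=30.0))  # [0, 30, 60]
```

In a real pipeline the selected indices would be passed to a video decoder; decoding is left out here so that the sketch stays self-contained.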

For example, 1502 high-resolution images (tens of millions of pixels each) of photovoltaic panels under infrared light are collected, and 1497 of them are annotated in COCO (Common Objects in Context, a data set for image recognition) format. Of these, 1047 form the training set and 450 form the validation set.

The initial infrared image obtained by frame extraction is read and displayed, and an annotator clicks, with a mouse, a point inside the photovoltaic panel on the initial infrared image; this point is usually the position inside the panel that is farthest from the panel boundary. The two-dimensional position of the point is recorded and blurred with a two-dimensional Gaussian function, generating a Gaussian image with the same resolution as the corresponding initial infrared image for use in subsequent processing.
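A minimal sketch of turning a recorded click position into a Gaussian image is given below. The function name `gaussian_image`, the isotropic form of the Gaussian and the default `sigma` of 10 pixels are assumptions; the text states only that a two-dimensional Gaussian function is used and that the result has the same resolution as the infrared image.

```python
import math
from typing import List


def gaussian_image(width: int, height: int, cx: int, cy: int,
                   sigma: float = 10.0) -> List[List[float]]:
    """Build a height x width map whose value is 1.0 at the clicked pixel (cx, cy)
    and decays with distance following an isotropic 2D Gaussian."""
    if not (0 <= cx < width and 0 <= cy < height):
        raise ValueError("the click must lie inside the image")
    two_sigma_sq = 2.0 * sigma * sigma  # sigma is an assumed blur radius
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / two_sigma_sq) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 64 x 48 image with a click at column 20, row 10.
heat = gaussian_image(width=64, height=48, cx=20, cy=10)
print(len(heat), len(heat[0]), heat[10][20])  # 48 64 1.0
```

A larger `sigma` spreads the positional hint over more pixels; a suitable value would depend on the typical panel size in the image.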

The target scene is not limited to a photovoltaic power station, and the preset object is not limited to a photovoltaic panel; an effective infrared video can be captured of any preset object that generates heat itself.

In an alternative embodiment, step S104 includes:

S1041, performing a channel concatenation (series connection) operation on each initial infrared image and the corresponding position information image to obtain a sample input image corresponding to each initial infrared image.

An RGB (red, green, blue color mode) image has three channels: each pixel consists of three channel values, each representing one color.

A channel concatenation operation is performed on the 3-channel initial infrared image and the 1-channel Gaussian image to obtain a 4-channel sample input image corresponding to each initial infrared image.
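The channel concatenation can be sketched on plain nested lists as follows. The function name `concat_channels` and the representation of images as rows of per-pixel tuples are assumptions made to keep the example dependency-free; an array library would normally perform the same concatenation along the channel axis.

```python
from typing import List, Sequence, Tuple


def concat_channels(rgb: Sequence[Sequence[Tuple[float, float, float]]],
                    gauss: Sequence[Sequence[float]]) -> List[List[Tuple[float, ...]]]:
    """Append the 1-channel Gaussian value to every 3-channel pixel, giving 4 channels."""
    if len(rgb) != len(gauss) or any(len(r) != len(g) for r, g in zip(rgb, gauss)):
        raise ValueError("both images must have the same resolution")
    return [
        [tuple(pixel) + (g,) for pixel, g in zip(rgb_row, gauss_row)]
        for rgb_row, gauss_row in zip(rgb, gauss)
    ]


# Hypothetical 1 x 2 image: two RGB pixels and their Gaussian values.
sample = concat_channels([[(10, 20, 30), (40, 50, 60)]], [[1.0, 0.5]])
print(sample)  # [[(10, 20, 30, 1.0), (40, 50, 60, 0.5)]]
```

The resolution check reflects the requirement that the Gaussian image and the infrared image have the same resolution before they are combined.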

In an alternative embodiment, step S105 includes:

S1051, taking each sample input image as input, taking an initial segmentation image corresponding to each initial infrared image as output, and constructing and obtaining an image segmentation model based on a preset HRNet model and a hole space pyramid pooling (atrous spatial pyramid pooling, ASPP) module.

The HRNet model is a relatively standard (baseline) convolutional-neural-network algorithm for semantic segmentation. Because the photovoltaic panels occupy a large proportion of the initial infrared image and individual panels do not need to be separately segmented and classified, this baseline semantic segmentation algorithm can meet the accuracy requirement, while the model is small and uses little memory. In general, using the model requires pixel-level semantic labeling of photovoltaic panel infrared images captured in advance by the UAV, that is, a pixel-level classification of whether each pixel belongs to a photovoltaic panel. The labeled images are then fed to the HRNet convolutional neural network for training, through which the network learns to separate the photovoltaic panel part of the image. Infrared images transmitted by the UAV to the device running the algorithm are processed by the model, which outputs a mask separating the region where the photovoltaic panels are located from the other regions of the image.

The HRNet model is structured in 4 stages. Each stage adds one branch to the previous stage, and a downsampling operation is applied to the features of the new branch, so that the spatial resolution of the feature map is halved while its number of channels is doubled. The output features of the last stage are fused across scales through a feature concatenation operation and passed to the subsequent network structure.
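The branch layout described above can be made concrete with a short shape calculation. The function name `branch_shapes` and the example input of a 128 x 128 feature map with 18 channels are assumptions for illustration; only the rules "one new branch per stage, resolution halved, channels doubled" come from the text.

```python
from typing import Dict, List, Tuple


def branch_shapes(height: int, width: int, channels: int,
                  num_stages: int = 4) -> Dict[int, List[Tuple[int, int, int]]]:
    """For each stage, list the (height, width, channels) of its parallel branches.
    Branch j has been downsampled j times: resolution / 2**j, channels * 2**j."""
    shapes = {}
    for stage in range(1, num_stages + 1):
        shapes[stage] = [
            (height // 2 ** j, width // 2 ** j, channels * 2 ** j)
            for j in range(stage)  # stage k keeps k branches
        ]
    return shapes


# Hypothetical highest-resolution feature map: 128 x 128 with 18 channels.
shapes = branch_shapes(height=128, width=128, channels=18)
print(shapes[4])  # [(128, 128, 18), (64, 64, 36), (32, 32, 72), (16, 16, 144)]
```

The four branches of the last stage are the multi-scale features that are later brought to a common resolution and concatenated.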

In the invention, in addition to the original RGB initial infrared image, the corresponding Gaussian image containing the position association information is also input into the HRNet model for training and prediction, so that features are extracted jointly from the position association information provided by the data annotator and from the initial infrared image.

The hole space pyramid pooling module relies on hole (atrous, or dilated) convolution, which enlarges the receptive field and captures multi-scale context information. Conventional downsampling also enlarges the receptive field but reduces spatial resolution, whereas hole convolution enlarges the receptive field while preserving resolution. This makes it well suited to detection and segmentation tasks: a larger receptive field allows large targets to be detected and segmented, and high resolution allows targets to be located accurately.
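How hole (dilated) convolution enlarges the receptive field without adding parameters can be checked with the standard effective-kernel-size relation k_eff = k + (k - 1)(d - 1), where k is the kernel size and d the dilation rate. The function name and the example dilation rates 1, 6, 12 and 18 are assumptions; the text does not state which rates are used.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel along one axis."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# A 3 x 3 kernel always has 9 weights per channel pair, whatever the dilation rate.
for rate in (1, 6, 12, 18):  # assumed example rates
    print(rate, effective_kernel_size(3, rate))
# 1 3
# 6 13
# 12 25
# 18 37
```

The weight count stays at 3 x 3 in every case, while the covered extent grows from 3 to 37 pixels, which is the trade-off the paragraph above describes.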

In an alternative embodiment, step S1051 includes:

S10511, inputting the sample input images into a preset HRNet model, and extracting multi-scale feature data corresponding to each sample input image, wherein the multi-scale feature data characterize feature data with different resolutions and multiple dimensions.

S10512, unifying the feature data with different resolutions in the multi-scale feature data to the same preset resolution, inputting the feature data into a cavity space pyramid pooling module for feature extraction and feature fusion to obtain image feature data so as to construct and obtain an image segmentation model;

the image segmentation model outputs an initial segmentation image corresponding to each sample input image.

In this embodiment, an HRNet model including 18 convolution layers is used to extract multi-scale features from an image. Specifically, as shown in fig. 5, the HRNet model extracts image features at different scales in 4 stages, namely stage 1, stage 2, stage 3 and stage 4. Each stage adds a new convolution branch whose input is the feature map of the previous stage with its spatial resolution halved by a strided convolution with a stride of 2, so that image features at different scales are extracted. To fuse small-scale features with large-scale features, an upsampling operation enlarges the spatial resolution of the small-scale features. Features of different scales are unified to the same spatial resolution and then integrated by a channel concatenation operation.
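The fusion of features at different scales can be sketched as "upsample, then stack as channels". Nearest-neighbor upsampling, the function names and the single-channel list representation are assumptions chosen for brevity; the text does not say which interpolation method is used.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, stored as rows of values


def upsample_nearest(feature: FeatureMap, factor: int) -> FeatureMap:
    """Repeat every value `factor` times along both axes (nearest-neighbor upsampling)."""
    return [
        [value for value in row for _ in range(factor)]
        for row in feature
        for _ in range(factor)
    ]


def fuse(features: List[FeatureMap]) -> List[FeatureMap]:
    """Bring every map to the resolution of the first (largest) map and stack them as channels."""
    target_height = len(features[0])
    fused = []
    for feature in features:
        factor, remainder = divmod(target_height, len(feature))
        if remainder or factor < 1:
            raise ValueError("each map must divide the target resolution evenly")
        fused.append(upsample_nearest(feature, factor))
    return fused


# Hypothetical 2 x 2 high-resolution map and 1 x 1 low-resolution map.
print(fuse([[[1.0, 2.0], [3.0, 4.0]], [[9.0]]]))
# [[[1.0, 2.0], [3.0, 4.0]], [[9.0, 9.0], [9.0, 9.0]]]
```

After fusion every channel has the same spatial size, so the stacked result can be passed on as one multi-channel feature map.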

The multi-scale feature data extracted by the HRNet model are further fused by the hole space pyramid pooling module. This module further extracts features using dilated convolutions with different dilation rates, enlarging the convolution receptive field to extract richer local and global image features without increasing the number of convolution parameters. The resulting image feature data are used to output the corresponding initial segmentation image, which is stored on the ground workstation and is the segmentation result labeled on the basis of the position association information. As shown in fig. 6, within the HRNet model, convolution, strided convolution, upsampling and feature pooling operations process the input feature map to extract the multi-scale feature data. The hole space pyramid pooling module then applies dilated convolutions with different dilation rates, for example 3×3 dilated convolution layers, to further extract and fuse features from the multi-scale feature data and obtain the image feature data. The image segmentation model is constructed in this way, and the final model outputs an initial segmentation image corresponding to each sample input image.

Taking a photovoltaic panel in a photovoltaic power plant as an example, fig. 7 is a schematic diagram of an initial infrared image and the Gaussian image corresponding to it, fig. 8 is a schematic diagram of the sample input image corresponding to the initial infrared image, and fig. 9 is the initial segmentation image corresponding to the initial infrared image. Specifically, as shown in fig. 7, the operation position has the same relative position in the initial infrared image and in the corresponding Gaussian image. In fig. 9, the operation position is a point inside the photovoltaic panel; the photovoltaic panel is the preset object, the region where it is located is the foreground region, and everything else is the background region. The foreground and background regions can be rendered in different colors so that the photovoltaic panel is easy to distinguish, thereby achieving segmentation of the photovoltaic panel in the initial infrared image.

In the training process of the image segmentation model, the experimental parameters can be set as follows: the sample input image size is 512×512 pixels; 90% of the data set formed by the sample input images is used for training and 10% for testing; the initial weight is 0; the initial learning rate is 0.01; the number of samples per training step (batch size) is 4; and training runs for 20000 iterations.
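The relationship between these settings can be worked through with a small calculation. The function name `training_plan` and the data set size of 1000 samples are assumptions for illustration; the split ratio, batch size and iteration count are the ones listed above.

```python
from typing import Tuple


def training_plan(num_samples: int, train_fraction: float = 0.9,
                  batch_size: int = 4, iterations: int = 20000) -> Tuple[int, int, float]:
    """Return (training samples, test samples, approximate passes over the training set)."""
    num_train = round(num_samples * train_fraction)
    if num_train <= 0:
        raise ValueError("the training split must contain at least one sample")
    num_test = num_samples - num_train
    # Each iteration consumes one batch, so iterations * batch_size samples are seen in total.
    epochs = iterations * batch_size / num_train
    return num_train, num_test, epochs


# Hypothetical data set of 1000 sample input images.
num_train, num_test, epochs = training_plan(1000)
print(num_train, num_test, round(epochs, 1))  # 900 100 88.9
```

With these settings the model sees 80000 samples in total, so the number of passes over the training data depends directly on how many sample input images are available.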

The final image segmentation model was constructed successfully and output the photovoltaic panel region segmentation image, i.e., the initial segmentation image, shown in fig. 9.

The image segmentation model may be implemented based on the MMSegmentation (a semantic segmentation toolbox) framework. The model as a whole is trained end to end, using cross-entropy loss as the loss function, for example the following loss function:

where S is the loss function, N is the number of pixels of the sample input image, i is the pixel index, p_i ∈ (0, 1) is the probability predicted by the model that the i-th pixel is foreground, and y_i ∈ {0, 1} is the label of the i-th pixel, indicating whether it belongs to the foreground region, which is obtained without manual pixel-level labeling. Unlike a common image segmentation network, image segmentation based on position association information only distinguishes between the foreground region and the background region of an object.
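The loss formula itself is not reproduced in this text. A standard binary cross-entropy consistent with the symbols defined above would be S = -(1/N) Σ_{i=1..N} [ y_i log p_i + (1 - y_i) log(1 - p_i) ]; the averaging over N and the clipping constant `eps` in the sketch below are assumptions, not taken from the original.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float], labels: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over N pixels (assumed form of the loss S)."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite at p = 0 or p = 1
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(probs)


# Two pixels predicted at 0.5 give a loss of ln(2), regardless of their labels.
print(round(binary_cross_entropy([0.5, 0.5], [1, 0]), 4))  # 0.6931
```

Confident correct predictions drive the loss toward 0, while confident wrong predictions are penalized heavily, which suits the two-class foreground and background setting described here.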

The above-mentioned settings of the respective parameters are only exemplary, and those skilled in the art can perform the setting of the model parameters according to actual use requirements.

Example 2

The present embodiment provides an image segmentation method, as shown in fig. 10, including:

S201, acquiring a current infrared image.

S202, acquiring target position related information corresponding to a target operation position of a target selection operation on a current infrared image.

S203, generating a target position information image corresponding to the current infrared image based on the target position related information.

The relative positions of the target operation position in the current infrared image and the target position information image are the same.

S204, acquiring a target input image corresponding to the current infrared image based on the current infrared image and the corresponding target position information image.

S205, the target input image is input to the image segmentation model in embodiment 1, and the target segmentation image corresponding to the target input image is output.

In the image segmentation method in this embodiment, the process of obtaining the target input image in steps S201 to S204 is the same as the implementation logic of the process of obtaining the sample input image in embodiment 1, and will not be described here again.

According to the image segmentation method in the embodiment, based on the image segmentation model in the embodiment 1, when a user needs to segment an object to be segmented in a current infrared image, target position association information corresponding to a target operation position of target selection operation on the current infrared image is obtained, a target position information image corresponding to the current infrared image is generated based on the target position association information, a target input image for inputting the model is further obtained, and finally a target segmentation image corresponding to the target input image is output through the model.

An image in which the region of the object to be segmented corresponding to the target operation position is the foreground region is taken as the target segmentation image, thereby segmenting the object to be segmented.
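One common way to turn per-pixel foreground probabilities into such a foreground and background image is thresholding, sketched below. The function name `to_mask` and the threshold of 0.5 are assumptions; the text does not specify how the model output is converted into the target segmentation image.

```python
from typing import List, Sequence


def to_mask(prob_map: Sequence[Sequence[float]], threshold: float = 0.5) -> List[List[int]]:
    """Mark a pixel as foreground (1) when its probability reaches the threshold, else background (0)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Hypothetical 2 x 2 probability map produced by a segmentation model.
print(to_mask([[0.9, 0.2], [0.6, 0.4]]))  # [[1, 0], [1, 0]]
```

The two values of the mask can then be rendered in different colors to display the segmented object.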

If the object to be segmented in the current infrared image is a photovoltaic panel, an annotator clicks, with a mouse, a point inside the photovoltaic panel to be segmented on the current infrared image; this point is usually the position inside the panel that is farthest from the panel boundary. The two-dimensional position of the point is recorded and blurred with a two-dimensional Gaussian function to generate a Gaussian image with the same resolution as the current infrared image. A channel concatenation operation on the Gaussian image and the current infrared image yields the target input image, which is input into the image segmentation model. The model outputs an image in which the photovoltaic panel to be segmented is the foreground region, thereby segmenting that panel.

The image segmentation method, based on the image segmentation model of embodiment 1, assists data annotators in annotating the current infrared image. Using the image segmentation model, the annotator only needs to determine the target operation position through the target selection operation, that is, click any point on the object to be segmented in the current infrared image, and the object is segmented. This greatly improves labeling efficiency and accuracy.

Example 3

The embodiment provides a construction system of an image segmentation model. As shown in fig. 11, the construction system comprises: a first image acquisition module 1, used for acquiring a plurality of initial infrared images;

the second image acquisition module 2 is used for acquiring position association information corresponding to the operation position of the first selection operation on any initial infrared image and the selected initial segmentation image; the initial segmentation image comprises a preset object in a target scene;

a third image acquisition module 3 for generating a position information image corresponding to the initial infrared image based on the position-related information; the relative positions of the operation positions in the initial infrared image and the position information image are the same;

the first input acquisition module 4 is used for acquiring a sample input image corresponding to each initial infrared image based on the initial infrared image and the corresponding position information image;

the model building module 5 is configured to build an image segmentation model by taking each sample input image as input and taking an initial segmentation image corresponding to each initial infrared image as output.
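The way the five modules hand data to one another can be sketched as a small pipeline. The class name `ConstructionSystem`, the field names and the toy stand-in functions are all hypothetical; the sketch shows only the data flow between modules 1 to 5, not their actual implementations.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class ConstructionSystem:
    acquire_images: Callable[[], List[Any]]                # module 1
    acquire_annotation: Callable[[Any], Tuple[Any, Any]]   # module 2: image -> (position info, initial segmentation)
    make_position_image: Callable[[Any, Any], Any]         # module 3: (position info, image) -> position image
    make_sample_input: Callable[[Any, Any], Any]           # module 4: (image, position image) -> sample input
    build_model: Callable[[List[Any], List[Any]], Any]     # module 5: (inputs, outputs) -> model

    def run(self) -> Any:
        inputs, targets = [], []
        for image in self.acquire_images():
            position, segmentation = self.acquire_annotation(image)
            position_image = self.make_position_image(position, image)
            inputs.append(self.make_sample_input(image, position_image))
            targets.append(segmentation)
        return self.build_model(inputs, targets)


# Toy stand-ins (strings instead of images) that only demonstrate the wiring.
system = ConstructionSystem(
    acquire_images=lambda: ["img0", "img1"],
    acquire_annotation=lambda image: ((0, 0), image + "_mask"),
    make_position_image=lambda position, image: image + "_pos",
    make_sample_input=lambda image, position_image: (image, position_image),
    build_model=lambda inputs, targets: {"inputs": inputs, "targets": targets},
)
print(system.run()["targets"])  # ['img0_mask', 'img1_mask']
```

Keeping each module behind a plain callable mirrors the statement that the division into modules is a matter of function rather than of a fixed implementation.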

In an alternative embodiment, the third image acquisition module 3 is specifically configured to generate, based on the position-related information and the initial infrared image, a gaussian image with the same resolution corresponding to the initial infrared image using a gaussian function.

In an alternative embodiment, the build system further comprises:

the scaling module is used for scaling the initial infrared image;

the matching module is used for performing position matching processing on the Gaussian image corresponding to the initial infrared image, so that the Gaussian image has the same resolution as the scaled initial infrared image and the operation position has the same relative position in the initial infrared image and in the Gaussian image.
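Keeping the relative operation position unchanged under scaling amounts to rescaling the click coordinates by the same factors as the image. The sketch below assumes pixel coordinates with the origin at the top-left corner; the function name `scale_position` and the example sizes are hypothetical.

```python
from typing import Tuple


def scale_position(x: int, y: int, old_size: Tuple[int, int],
                   new_size: Tuple[int, int]) -> Tuple[int, int]:
    """Map a click at (x, y) in an image of old_size = (width, height)
    to the pixel with the same relative position in an image of new_size."""
    old_w, old_h = old_size
    new_w, new_h = new_size
    if not (0 <= x < old_w and 0 <= y < old_h):
        raise ValueError("the click must lie inside the original image")
    # Clamp so that rounding never produces an index outside the scaled image.
    new_x = min(new_w - 1, int(x * new_w / old_w))
    new_y = min(new_h - 1, int(y * new_h / old_h))
    return new_x, new_y


# Hypothetical click at the centre of a 640 x 480 image, scaled to 512 x 512.
print(scale_position(320, 240, old_size=(640, 480), new_size=(512, 512)))  # (256, 256)
```

The Gaussian image would then be generated around the mapped position at the scaled resolution, so that both images stay aligned.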

In an alternative embodiment, the first selection operation includes a user external operation, and the second image acquisition module 2 includes:

a position information acquiring unit 21, configured to acquire, for any one of the initial infrared images, position-related information corresponding to an operation position of an external operation of a user on the initial infrared image;

an object region determining unit 22, configured to determine, based on the position association information, an object region in which a preset object corresponding to an external operation of the user is located;

the segmentation unit 23 is configured to take an image in which the object region of the initial infrared image is the foreground region as the corresponding initial segmentation image.

In an alternative embodiment, the first image acquisition module 1 includes a video acquisition unit 11, configured to acquire an initial infrared video corresponding to a target scene; the image acquisition unit 12 performs frame extraction operation on the initial infrared video to obtain a plurality of corresponding initial infrared images.

In an alternative embodiment, the target scene is a photovoltaic power station and the preset object is a photovoltaic panel.

In an alternative embodiment, the first input obtaining module 4 is specifically configured to perform a channel serial operation on each initial infrared image and the corresponding position information image, so as to obtain a sample input image corresponding to each initial infrared image.

In an alternative embodiment, the model building module 5 is specifically configured to build an image segmentation model based on a preset HRNet model and a hole space pyramid pooling module by taking each sample input image as input and taking an initial segmentation image corresponding to each initial infrared image as output.

In an alternative embodiment, the model building module 5 includes a feature extraction unit 51, configured to input a sample input image to a preset HRNet model, and extract multi-scale feature data corresponding to each sample input image, where the multi-scale feature data is used to characterize feature data with different resolutions and multiple dimensions; the feature fusion unit 52 is configured to unify feature data with different resolutions in the multi-scale feature data to the same preset resolution, and input the feature data to the hole space pyramid pooling module for feature extraction and feature fusion to obtain image feature data, so as to construct and obtain an image segmentation model; the image segmentation model outputs an initial segmentation image corresponding to each sample input image.

The working principle of the image segmentation model construction system of the present embodiment is the same as that of the image segmentation model construction method of embodiment 1, and will not be described here again.

According to the image segmentation model construction system, the image segmentation model is used, any point on the preset object in the target scene is selected through the first selection operation, so that the segmented image selected by the first selection operation can be obtained, and the preset object in the infrared image can be segmented because the segmented image contains the preset object in the target scene, so that the labeling efficiency and the labeling accuracy are greatly improved, the training precision and the training efficiency of the image segmentation model are further improved, and the effect of improving the precision of image segmentation processing is achieved.

Example 4

The present embodiment provides an image segmentation system, as shown in fig. 12, which includes a fourth image acquisition module 6 for acquiring a current infrared image;

a position acquisition module 7, configured to acquire target position related information corresponding to a target operation position of a target selection operation on a current infrared image;

a fifth image acquisition module 8 for generating a target position information image corresponding to the current infrared image based on the target position association information; the relative positions of the target operation position in the current infrared image and the target position information image are the same;

a second input acquiring module 9, configured to acquire a target input image corresponding to the current infrared image based on the current infrared image and the corresponding target position information image;

the segmented image output module 10 is configured to input the target input image to the image segmentation model in embodiment 1, and output a target segmentation image corresponding to the target input image.

The working principle of the image segmentation system of the present embodiment is the same as that of the image segmentation method of embodiment 2, and will not be described here again.

The image segmentation system, based on the image segmentation model of embodiment 1, assists data annotators in annotating the current infrared image. Using the image segmentation model, the annotator only needs to determine the target operation position through the target selection operation, that is, click any point on the object to be segmented in the current infrared image, and the object is segmented. This greatly improves labeling efficiency and accuracy, which in turn improves the training precision and efficiency of the image segmentation model and the precision of image segmentation processing.

Example 5

Fig. 13 is a schematic structural diagram of an electronic device according to the present embodiment. The electronic device includes a memory, a processor, and a computer program stored on the memory and for execution on the processor, the processor implementing the construction method of the image segmentation model as in embodiment 1 described above, or the image segmentation method in embodiment 2, when executing the program. The electronic device 80 shown in fig. 13 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 13, the electronic device 80 may be in the form of a general purpose computing device, which may be a server device, for example. Components of the electronic device 80 may include, but are not limited to: the at least one processor 81, the at least one memory 82, a bus 83 connecting the various system components, including the memory 82 and the processor 81.

The bus 83 includes a data bus, an address bus, and a control bus.

The memory 82 may include volatile memory such as Random Access Memory (RAM) 821 and/or cache memory 822, and may further include Read Only Memory (ROM) 823.

Memory 82 may also include a program/utility 825 having a set (at least one) of program modules 824, such program modules 824 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

The processor 81 executes a computer program stored in the memory 82 to thereby execute various functional applications and data processing, such as the construction method of the image segmentation model in the above-described embodiment 1 of the present invention or the image segmentation method in embodiment 2.

The electronic device 80 may also communicate with one or more external devices 84 (e.g., keyboard, pointing device, etc.). Such communication may occur through an input/output (I/O) interface 85. The electronic device 80 may also communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet, through a network adapter 86. As shown in fig. 13, the network adapter 86 communicates with other modules of the electronic device 80 via the bus 83. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 80, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.

It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present invention. Conversely, the features and functions of one unit/module described above may be further divided into ones that are embodied by a plurality of units/modules.

Example 6

The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the image segmentation model construction method as in embodiment 1 described above, or the steps in the image segmentation method in embodiment 2.

More specifically, among others, readable storage media may be employed including, but not limited to: portable disk, hard disk, random access memory, read only memory, erasable programmable read only memory, optical storage device, magnetic storage device, or any suitable combination of the foregoing.

In a possible embodiment, the invention may also be implemented in the form of a program product comprising program code which, when the program product is executed on a terminal device, causes the terminal device to carry out the steps of the construction method of the image segmentation model in embodiment 1 above or the steps of the image segmentation method in embodiment 2.

The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the invention, but such changes and modifications fall within the scope of the invention.

Claims

1. A method of constructing an image segmentation model, the method comprising:

acquiring a plurality of initial infrared images;

2. The construction method according to claim 1, wherein the step of generating a position information image corresponding to the initial infrared image based on the position-related information includes:

3. The method of constructing of claim 2, wherein the step of generating gaussian images having the same resolution corresponding to the initial infrared images further comprises, after the step of acquiring the sample input image corresponding to each initial infrared image:

scaling the initial infrared image;

for any initial infrared image, acquiring position related information corresponding to an operation position of the user external operation on the initial infrared image;

4. The method of claim 1, wherein the step of acquiring a plurality of initial infrared images comprises:

acquiring an initial infrared video corresponding to the target scene;

and taking each sample input image as input, taking an initial segmentation image corresponding to each initial infrared image as output, and constructing and obtaining the image segmentation model based on a preset HRNet model and a cavity space pyramid pooling module.

5. The method according to claim 4, wherein the step of constructing the image segmentation model based on a preset HRNet model and a hole space pyramid pooling module with each sample input image as input and each initial segmentation image corresponding to each initial infrared image as output comprises:

6. An image segmentation method, characterized in that the image segmentation method comprises:

acquiring a current infrared image;

inputting the target input image into the image segmentation model according to any one of claims 1-5, and outputting a target segmentation image corresponding to the target input image.

7. A construction system of an image segmentation model, the construction system comprising:

8. An image segmentation system, the image segmentation system comprising:

a segmented image output module for inputting the target input image into the image segmentation model according to any one of claims 1-5, and outputting a target segmented image corresponding to the target input image.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory for execution on the processor, characterized in that the processor implements the method of constructing an image segmentation model according to any one of claims 1-5 or the method of image segmentation according to claim 6 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of constructing an image segmentation model according to any one of claims 1-5, or the image segmentation method according to claim 6.