CN116168272A - Training method of image processing model, image processing method, device and medium

Info

Publication number
CN116168272A
CN116168272A
Authority
CN
China
Prior art keywords
image
infrared
sample
resolution
infrared image
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN202310230060.1A
Other languages
Chinese (zh)
Inventor
秦臻
孙启永
寸孟杰
俞贵涛
Current Assignee
Ningbo Fotile Kitchen Ware Co Ltd
Original Assignee
Ningbo Fotile Kitchen Ware Co Ltd
Application filed by Ningbo Fotile Kitchen Ware Co Ltd filed Critical Ningbo Fotile Kitchen Ware Co Ltd
Priority to CN202310230060.1A
Publication of CN116168272A

Classifications

    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • Y02T10/40 Engine management systems (climate change mitigation technologies related to road transport)

Abstract

The present disclosure provides a training method of an image processing model, an image processing method, an apparatus, and a medium. The training method includes: acquiring several groups of preset sample images, where each group comprises a first sample infrared image, a second sample infrared image and a sample visible light image at the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image; and training a preset network model based on each group of preset sample images to obtain the image processing model. The method and device can combine a low-resolution infrared image with a visible light image to obtain a high-resolution infrared image, saving cost while expanding the information of the infrared image. Through multiple levels of convolution, pooling and up-sampling, both the detail features and the overall features of the image are taken into account, the accuracy of the output high-resolution infrared image is ensured, and the resolution of the infrared image is improved accurately and efficiently.

Description

Training method of image processing model, image processing method, device and medium
Technical Field
The disclosure belongs to the technical field of image processing, and in particular relates to a training method of an image processing model, an image processing method, a device, and a medium.
Background
An infrared sensor can detect the temperature distribution of a target area and output an infrared image, and is widely applied in fields such as non-contact temperature measurement, gas composition analysis, and non-destructive inspection. When the infrared temperature field to be measured is finer, a high-resolution infrared sensor is required. However, for both thermopile-array and microbolometer sensors, cost is positively correlated with resolution. At present, to reduce sensor cost, image resolution is mainly improved by interpolating the low-resolution infrared image (bilinear interpolation, nearest-neighbor interpolation, and the like); however, this approach cannot expand the information content of the original image, so the refinement of the infrared image information cannot be improved.
Disclosure of Invention
The technical problem to be solved by the present disclosure is to overcome the defect in the prior art that improving the resolution of an infrared image by interpolation cannot expand the information content of the original image, leaving the refinement of the infrared image low, and to provide a training method of an image processing model, an image processing method, a device, and a medium.
The technical problems are solved by the following technical scheme:
the present disclosure provides a training method of an image processing model, the training method including:
acquiring a plurality of groups of preset sample images;
each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image under the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image;
training a preset network model based on each set of preset sample images to obtain the image processing model.
Preferably, the step of training a preset network model based on each set of the preset sample images to obtain the image processing model includes:
inputting the first sample infrared image and the sample visible light image of each group into the preset network model, and outputting a simulated infrared image;
wherein the resolution of the simulated infrared image is higher than the resolution of the first sample infrared image;
calculating a loss value of the simulated infrared image compared with the corresponding second sample infrared image;
And updating model parameters of the preset network model based on the loss value until the loss value meets a preset loss condition to obtain the image processing model.
Preferably, the step of inputting the first sample infrared image and the sample visible light image of each group into the preset network model, and outputting the simulated infrared image includes:
for each group of preset sample images input into the preset network model, acquiring a plurality of infrared feature images under a first resolution according to the first sample infrared images;
acquiring a plurality of optical feature images under a second resolution according to the sample visible light image;
splicing each infrared characteristic image with one optical characteristic image to obtain a plurality of fusion characteristic images under a third resolution;
the resolution difference between the spliced infrared feature images and the optical feature images is smaller than a first preset resolution difference;
and acquiring the simulated infrared image according to a plurality of the fusion feature images.
Preferably, the preset network model comprises an infrared feature extraction module, an optical feature extraction module, a splicing module and an image reconstruction module;
The infrared characteristic extraction module is used for acquiring the infrared characteristic map;
the optical characteristic extraction module is used for acquiring the optical characteristic map;
the splicing module is used for splicing the infrared characteristic image and the optical characteristic image;
the image reconstruction module is used for acquiring the simulated infrared image.
Preferably, the preset network model is a convolutional neural network model;
the step of acquiring a plurality of infrared feature images under the first resolution according to the first sample infrared image comprises the following steps:
alternately performing convolution processing and upsampling processing on the first sample infrared image to obtain a plurality of infrared feature images under the first resolution;
and/or,
the step of obtaining a plurality of optical feature images under the second resolution according to the sample visible light image comprises the following steps:
alternately carrying out convolution processing and pooling processing on the sample visible light image to obtain a plurality of optical feature images under the second resolution;
and/or,
the step of acquiring the simulated infrared image according to the fusion feature maps comprises the following steps:
and alternately carrying out convolution processing and up-sampling processing on the fusion characteristic map so as to acquire the simulated infrared image.
Preferably, the step of alternately performing convolution processing and upsampling processing on the fused feature map to obtain the simulated infrared image includes:
alternately carrying out convolution processing and up-sampling processing on one fusion feature image with the minimum resolution, splicing the other fusion feature image after each up-sampling processing, and traversing the processing sequentially from small to large according to the resolution until all the fusion feature images are traversed to obtain the simulated infrared image;
and the resolution difference between the two fusion feature images spliced at each time is smaller than a second preset resolution difference.
Preferably, before the step of stitching each of the infrared feature maps with one of the optical feature maps, the method further includes:
cutting and/or expanding at least one of the spliced infrared characteristic diagram and the spliced optical characteristic diagram so that the resolution of the infrared characteristic diagram is equal to that of the optical characteristic diagram;
and/or,
the step of splicing another fusion characteristic diagram after each upsampling process further comprises the following steps:
and cutting and/or expanding at least one of the two spliced fusion feature images to make the resolutions of the two fusion feature images equal.
The disclosure also provides an image processing method, which is implemented based on the image processing model obtained by the training method, and comprises the following steps:
acquiring a first target infrared image and a target visible light image under the same target visual angle;
inputting the first target infrared image and the target visible light image into the image processing model to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
The present disclosure also provides a training system of an image processing model, the training system comprising:
the sample image acquisition module is used for acquiring a plurality of groups of preset sample images;
each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image under the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image;
and the model training module is used for training a preset network model based on each group of preset sample images to obtain the image processing model.
Preferably, the model training module comprises:
the analog image output unit is used for inputting the first sample infrared image and the sample visible light image of each group into the preset network model and outputting an analog infrared image;
wherein the resolution of the simulated infrared image is higher than the resolution of the first sample infrared image;
the loss calculation unit is used for calculating a loss value of the simulated infrared image compared with the corresponding second sample infrared image;
and the parameter updating unit is used for updating the model parameters of the preset network model based on the loss value until the loss value meets a preset loss condition so as to obtain the image processing model.
Preferably, the analog image output unit is further configured to:
for each group of preset sample images input into the preset network model, acquiring a plurality of infrared feature images under a first resolution according to the first sample infrared images;
acquiring a plurality of optical feature images under a second resolution according to the sample visible light image;
splicing each infrared characteristic image with one optical characteristic image to obtain a plurality of fusion characteristic images under a third resolution;
The resolution difference between the spliced infrared feature images and the optical feature images is smaller than a first preset resolution difference;
and acquiring the simulated infrared image according to a plurality of the fusion feature images.
Preferably, the preset network model comprises an infrared feature extraction module, an optical feature extraction module, a splicing module and an image reconstruction module;
the infrared characteristic extraction module is used for acquiring the infrared characteristic map;
the optical characteristic extraction module is used for acquiring the optical characteristic map;
the splicing module is used for splicing the infrared characteristic image and the optical characteristic image;
the image reconstruction module is used for acquiring the simulated infrared image.
Preferably, the preset network model is a convolutional neural network model;
the analog image output unit is further configured to:
alternately performing convolution processing and upsampling processing on the first sample infrared image to obtain a plurality of infrared feature images under the first resolution;
and/or,
alternately carrying out convolution processing and pooling processing on the sample visible light image to obtain a plurality of optical feature images under the second resolution;
and/or,
and alternately carrying out convolution processing and up-sampling processing on the fusion characteristic map so as to acquire the simulated infrared image.
Preferably, the analog image output unit is further configured to:
alternately carrying out convolution processing and up-sampling processing on one fusion feature image with the minimum resolution, splicing the other fusion feature image after each up-sampling processing, and traversing the processing sequentially from small to large according to the resolution until all the fusion feature images are traversed to obtain the simulated infrared image;
and the resolution difference between the two fusion feature images spliced at each time is smaller than a second preset resolution difference.
Preferably, the analog image output unit is further configured to:
cutting and/or expanding at least one of the spliced infrared characteristic diagram and the spliced optical characteristic diagram so that the resolution of the infrared characteristic diagram is equal to that of the optical characteristic diagram;
and/or,
and cutting and/or expanding at least one of the two spliced fusion feature images to make the resolutions of the two fusion feature images equal.
The disclosure further provides an image processing system, which is implemented based on the image processing model obtained by the training system, and the image processing system includes:
The target image acquisition module is used for acquiring a first target infrared image and a target visible light image under the same target visual angle;
the image processing module is used for inputting the first target infrared image and the target visible light image into the image processing model so as to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
The present disclosure also provides an electronic device including a memory, a processor, and a computer program stored on the memory and configured to run on the processor, where the processor implements the training method and the image processing method of the image processing model described above when executing the computer program.
The present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the training method and image processing method of the image processing model described above.
On the basis of conforming to the common knowledge in the art, each preferable condition can be arbitrarily combined to obtain each preferable embodiment of the disclosure.
The positive progress effect of the present disclosure is: by training the image processing model, a low-resolution infrared image acquired at low cost can be combined with a visible light image to obtain a high-resolution infrared image, expanding the information of the infrared image while saving cost; and through multiple levels of convolution, pooling and up-sampling, both the detail features and the overall features of the image are taken into account, the accuracy of the output high-resolution infrared image is ensured, and the resolution of the infrared image is improved accurately and efficiently.
Drawings
Fig. 1 is a first flowchart of a training method of an image processing model of embodiment 1 of the present disclosure.
Fig. 2 is a second flowchart of a training method of an image processing model of embodiment 1 of the present disclosure.
Fig. 3 is a third flowchart of a training method of an image processing model of embodiment 1 of the present disclosure.
Fig. 4 is a schematic structural diagram of an image processing model in the present disclosure.
Fig. 5 is a flowchart of an image processing method of embodiment 2 of the present disclosure.
Fig. 6 is a first block diagram of a training system of an image processing model of embodiment 3 of the present disclosure.
Fig. 7 is a second block diagram of a training system of an image processing model according to embodiment 3 of the present disclosure.
Fig. 8 is a block diagram of an image processing system according to embodiment 4 of the present disclosure.
Fig. 9 is a schematic structural diagram of an electronic device according to embodiment 5 of the present disclosure.
Detailed Description
The present disclosure is further illustrated by way of examples below, but is not thereby limited to the scope of the examples.
Example 1
The present embodiment provides a training method of an image processing model, as shown in fig. 1, the training method includes the following steps:
s1, acquiring a plurality of groups of preset sample images;
each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image under the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image;
S2, training a preset network model based on each set of preset sample images to obtain an image processing model.
Specifically, an optical camera is used to collect a visible light image (Image_c) as the sample visible light image, with a resolution of (w_c, h_c); a low-resolution infrared sensor (e.g., an infrared array sensor) is used to collect a low-resolution infrared image (Image_il) of the temperature region to be measured as the first sample infrared image, with a resolution of (w_il, h_il); and a high-resolution infrared sensor is used to collect a high-resolution infrared image (Image_ih) of the region to be measured as the second sample infrared image, with a resolution of (w_ih, h_ih). Because the photosensitive device of an optical camera is inexpensive, a camera with higher resolution can be used; its maximum resolution can reach 10-100 times that of the infrared sensor. Thus, w_c ≥ w_ih > w_il and h_c ≥ h_ih > h_il. The visible light image has high resolution and contains detailed information about the objects in the target field of view, such as contour information and boundary information, so it can be used to further expand the low-resolution infrared image, refine the temperature information within each pixel, and improve the resolution of the infrared image to obtain a high-resolution infrared image.
By acquiring a plurality of sets of preset sample images, a preset network model can be trained to obtain an image processing model by using a first sample infrared image, a second sample infrared image and a sample visible light image in each set of samples. The image processing model is used for outputting a corresponding second target infrared image according to the first target infrared image and the target visible light image of any target area, namely, the resolution of the infrared image is improved.
The sample data set comprises several groups of preset sample images; each group comprises a first sample infrared image collected by a low-resolution infrared sensor, a second sample infrared image collected by a high-resolution infrared sensor, and a sample visible light image collected by a high-resolution optical camera at the same sample visual angle. Further, the groups of preset sample images in the sample data set may be divided into a training set, a validation set and a test set at a ratio of 3:1:1, used for training, validation and testing of the neural network, respectively.
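As an illustration of the data preparation just described, the following is a minimal Python sketch of the 3:1:1 split; the disclosure does not prescribe an implementation, and the function and variable names here are hypothetical.

```python
import random

def split_dataset(sample_groups, seed=0):
    """Split preset sample image groups 3:1:1 into train/val/test sets.

    Each element of `sample_groups` is one group: a first sample infrared
    image, a second sample infrared image, and a sample visible light image
    captured at the same sample visual angle.
    """
    groups = list(sample_groups)
    random.Random(seed).shuffle(groups)   # reproducible shuffle
    n = len(groups)
    n_train = n * 3 // 5                  # 3 parts out of 5
    n_val = n // 5                        # 1 part out of 5
    train = groups[:n_train]
    val = groups[n_train:n_train + n_val]
    test = groups[n_train + n_val:]       # remaining ~1 part out of 5
    return train, val, test
```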
In the scheme, the image processing model is obtained through training, the low-resolution infrared image acquired at low cost can be combined with the visible light image to obtain the high-resolution infrared image, the cost is saved, the information of the infrared image is expanded, and the resolution of the infrared image is accurately and efficiently improved.
In one embodiment, as shown in fig. 2, step S2 includes:
s201, inputting a first sample infrared image and a sample visible light image of each group into a preset network model, and outputting a simulated infrared image;
wherein the resolution of the simulated infrared image is higher than the resolution of the first sample infrared image;
S202, calculating a loss value of the simulated infrared image compared with the corresponding second sample infrared image;
and S203, updating model parameters of a preset network model based on the loss value until the loss value meets a preset loss condition to obtain an image processing model.
Specifically, the first sample infrared image and the sample visible light image in each group of preset sample images are input into the preset network model, and a simulated infrared image is output. A loss value between the simulated infrared image and the corresponding second sample infrared image is calculated according to a preset loss function, and the model parameters of the preset network model are updated based on the loss value to obtain the image processing model. The cross entropy of the RGB (red, green, blue color channel) values of each pixel is used as the loss function; the preset network model is trained until the error is reduced to a minimum, and the model parameters at that point are saved.
In the scheme, the image processing model is obtained through training of each group of preset sample images, and the model parameters are updated by simulating the back propagation of the loss values between the infrared images and the corresponding second sample infrared images, so that the performance of the image processing model can be improved, and the accuracy of the model in outputting the high-resolution infrared images is improved.
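The training procedure of steps S201 to S203 can be sketched as follows. This is a hedged illustration only: the disclosure does not name a framework (PyTorch is assumed here), and a pixel-wise L1 loss stands in for the per-pixel cross-entropy loss described above so the sketch stays self-contained.

```python
import torch
import torch.nn as nn

def train_model(model, loader, epochs=100, lr=1e-3, target_loss=1e-3):
    """Forward pass, loss against the second sample infrared image, backprop.

    `loader` yields (ir_low, visible, ir_high) batches; `target_loss` plays
    the role of the preset loss condition.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.L1Loss()               # stand-in for the per-pixel loss
    for epoch in range(epochs):
        for ir_low, visible, ir_high in loader:
            simulated = model(ir_low, visible)    # simulated infrared image
            loss = criterion(simulated, ir_high)  # vs. second sample IR image
            optimizer.zero_grad()
            loss.backward()               # update parameters by backprop
            optimizer.step()
        if loss.item() < target_loss:     # preset loss condition met
            return model
    return model
```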
In one embodiment, as shown in fig. 3, step S201 includes:
s2011, acquiring a plurality of infrared feature images under a first resolution according to a first sample infrared image for each group of preset sample images input into a preset network model;
s2012, acquiring a plurality of optical feature images under the second resolution according to the sample visible light image;
s2013, splicing each infrared characteristic image with one optical characteristic image to obtain a plurality of fusion characteristic images under the third resolution;
the resolution difference between the spliced infrared feature images and the optical feature images is smaller than a first preset resolution difference;
s2014, acquiring a simulated infrared image according to the fusion feature maps.
Specifically, an infrared feature map and an optical feature map are first obtained through feature extraction; a fusion feature map is then obtained by stitching the two feature maps, where the channel count of the stitched fusion feature map is the sum of the channel counts of the infrared feature map and the optical feature map; finally, reconstruction operations such as convolution, up-sampling and stitching are performed on the fusion feature maps to obtain a high-resolution simulated infrared image.
In the scheme, the infrared characteristic images and the optical characteristic images are spliced and fused, so that the information quantity of the infrared characteristic images can be expanded, and the accuracy of the simulated infrared images output by the model is improved.
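A minimal sketch of the stitching operation (PyTorch assumed; the shapes below are illustrative, not values from the disclosure) shows how the fused map's channel count becomes the sum of the two inputs' channel counts:

```python
import torch

ir_feat = torch.randn(1, 64, 120, 120)    # hypothetical infrared feature map
opt_feat = torch.randn(1, 128, 120, 120)  # hypothetical optical feature map
fused = torch.cat([ir_feat, opt_feat], dim=1)  # stitch along channel axis
assert fused.shape == (1, 64 + 128, 120, 120)  # channel counts add up
```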
In one embodiment, the preset network model includes an infrared feature extraction module, an optical feature extraction module, a stitching module, and an image reconstruction module;
the infrared characteristic extraction module is used for obtaining an infrared characteristic diagram;
the optical characteristic extraction module is used for obtaining an optical characteristic diagram;
the splicing module is used for splicing the infrared characteristic diagram and the optical characteristic diagram;
the image reconstruction module is used for acquiring a simulated infrared image.
In the scheme, the preset network model is constructed through the infrared characteristic extraction module, the optical characteristic extraction module, the splicing module and the image reconstruction module so as to train to obtain the image processing model, and the performance of the model can be improved, so that the accuracy of the model in outputting the high-resolution infrared image is improved.
In one embodiment, the predetermined network model is a convolutional neural network model;
step S2011 includes: alternately performing convolution processing and up-sampling processing on the infrared image of the first sample to obtain a plurality of infrared feature images under the first resolution;
and/or,
step S2012 includes: alternately carrying out convolution treatment and pooling treatment on the sample visible light image to obtain a plurality of optical characteristic images under the second resolution;
and/or,
Step S2014 includes: and alternately carrying out convolution processing and up-sampling processing on the fusion characteristic diagram to obtain a simulated infrared image.
Specifically, since high-resolution feature maps retain the detail information of an image while low-resolution feature maps convey its semantic information, the feature extraction stage alternates convolution with pooling or up-sampling to obtain infrared feature maps and visible light feature maps at different resolutions. As the visible light image passes through convolution, pooling and similar operations, the size (i.e., the resolution) of the feature map keeps decreasing while the number of channels keeps increasing; as the infrared image passes through convolution, up-sampling and similar operations, both the feature map size and the number of channels keep increasing.
In the feature fusion stage, infrared feature maps and optical feature maps of similar or identical size can be stitched together, thereby fusing the features of the two input images. Similar sizes indicate that the two heterogeneous feature maps carry feature information at a similar scale and are therefore suitable for being merged into one.
In the reconstruction stage, one fusion feature map is alternately convolved and up-sampled, and the other fusion feature maps are successively stitched in, to obtain the simulated infrared image.
In the scheme, the detail information of the image and the semantic information of the transferred image can be reserved by acquiring the infrared feature images and the optical feature images with different resolutions, and the information quantity of the infrared feature images can be expanded by splicing and fusing the infrared feature images and the optical feature images with the resolution difference values within a preset range, so that the accuracy of the high-resolution infrared image output by the model is improved.
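The two feature extraction branches described above can be sketched as building blocks like the following (PyTorch assumed; channel counts and kernel sizes are illustrative assumptions, not values from the disclosure):

```python
import torch.nn as nn

def visible_stage(c_in, c_out):
    """One visible-light stage: convolution then pooling, so the feature map
    shrinks while the channel count grows."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3), nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),      # pooling halves the resolution
    )

def infrared_stage(c_in, c_out):
    """One infrared stage: convolution then up-sampling, so both the feature
    map size and the channel count grow."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3), nn.ReLU(),
        nn.Upsample(scale_factor=2),      # up-sampling doubles the resolution
    )
```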
In one embodiment, step S2014 includes:
alternately carrying out convolution processing and up-sampling processing on one fusion feature image with the minimum resolution, splicing the other fusion feature image after each up-sampling processing, and sequentially traversing the processing according to the sequence from small to large resolution until traversing all the fusion feature images to obtain a simulated infrared image;
the resolution difference between the two fusion feature images spliced at each time is smaller than a second preset resolution difference.
Specifically, convolution processing and up-sampling processing are performed alternately on the fusion feature map with the minimum resolution, and the feature map obtained by the (i-1)-th up-sampling is stitched with the fusion feature map of similar resolution (the difference being smaller than the second preset resolution difference) to serve as the input feature map of the i-th convolution.
In the reconstruction stage, the resolution of the feature map is increased by adopting an up-sampling method of a plurality of layers, and the fused feature maps with different resolutions in the feature fusion stage are continuously spliced in the up-sampling process, so that the features have continuity and stability.
In the scheme, the fusion characteristic diagrams with different resolutions are continuously spliced in the convolution and up-sampling processes, so that the continuity and stability of the characteristics of the image are ensured, the detail characteristics and the integral characteristics of the image are considered, and the accuracy of the output high-resolution infrared image is improved.
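A self-contained sketch of this reconstruction loop (PyTorch assumed; it presumes each stitched fusion map is at least as large as the up-sampled map and crops it symmetrically to match):

```python
import torch
import torch.nn.functional as F

def reconstruct(fused_maps, conv_blocks):
    """Decode from the lowest-resolution fused map; after every up-sampling,
    stitch in the next-larger fused map (maps sorted small to large), then
    convolve. `conv_blocks` is a list of nn.Module convolution blocks."""
    x = fused_maps[0]
    for skip, conv in zip(fused_maps[1:], conv_blocks):
        x = F.interpolate(x, scale_factor=2)   # up-sampling step
        dh = skip.shape[-2] - x.shape[-2]      # assumed non-negative
        dw = skip.shape[-1] - x.shape[-1]
        skip = skip[..., dh // 2:skip.shape[-2] - (dh - dh // 2),
                    dw // 2:skip.shape[-1] - (dw - dw // 2)]
        x = torch.cat([x, skip], dim=1)        # stitch: channels add up
        x = conv(x)                            # convolve the stitched map
    return x
```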
In one embodiment, the step of stitching each of the infrared signatures with an optical signature further comprises:
cutting and/or expanding at least one of the spliced infrared characteristic images and optical characteristic images to make the resolutions of the infrared characteristic images and the optical characteristic images equal;
and/or,
the method further comprises the following steps after each upsampling process and before the step of splicing another fusion characteristic diagram:
and cutting and/or expanding at least one of the two spliced fusion feature images to make the resolutions of the two fusion feature images equal.
Specifically, when the feature maps to be stitched differ in size, an edge-filling (padding) or expansion operation can be applied around one or both feature maps. Common padding values include 0, 0.5 and 1 (0 is used in this embodiment). For example, if a feature map of 244×244 pixels needs to be enlarged to 280×280 pixels, (280-244)/2 = 18 pixels of value 0 are added on each of its four sides, so that the feature map size becomes 280×280.
In the scheme, the feature images are cut or expanded so that the resolutions of the feature images to be spliced are equal, and the accuracy of feature splicing fusion is ensured.
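The zero-padding case from the worked example above, as a short sketch (PyTorch assumed):

```python
import torch
import torch.nn.functional as F

feat = torch.randn(1, 3, 244, 244)              # feature map to be expanded
# (280 - 244) / 2 = 18 zero-valued pixels on each of the four sides:
padded = F.pad(feat, (18, 18, 18, 18), mode="constant", value=0)
assert padded.shape[-2:] == (280, 280)
```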
The specific structure of the image processing model trained by the training method of the present embodiment is described below with reference to fig. 4:
the image processing model mainly comprises three parts which are respectively used for feature extraction, feature fusion and high-resolution infrared image generation:
(1) In the feature extraction stage, the input infrared temperature field image, with a resolution of 68×68 and 3 channels, is alternately subjected to convolution processing and up-sampling processing several times to obtain several infrared feature maps of different resolutions and channel counts; the input visible light image, with a resolution of 572×572 and 3 channels, is alternately subjected to convolution processing and pooling processing several times to obtain several optical feature maps of different resolutions and channel counts.
(2) In the feature fusion stage, the infrared feature images and the optical feature images with similar or identical sizes are spliced to obtain a fusion feature image (A, B, C in fig. 4), so that features of two input images are fused. The channel number of the fusion characteristic diagram after the splicing is the sum of the channel numbers of the infrared characteristic diagram and the optical characteristic diagram.
(3) In the high-resolution infrared image generation stage, the resolution of the feature maps is increased by multi-level up-sampling, and the fusion feature maps of different resolutions from the feature fusion stage are successively stitched in during up-sampling, so that the features remain continuous and stable. Specifically, convolution processing and up-sampling processing are performed alternately on the fusion feature map with the minimum resolution (the map with resolution 64×64 and 576 channels in fig. 4), and the feature map obtained by the (i-1)-th up-sampling is stitched with the fusion feature map of similar resolution (the difference being smaller than the second preset resolution difference; stitched as A', B' and C' in fig. 4) to serve as the input feature map of the i-th convolution. Here A corresponds to A': A is a fusion feature map generated in the feature fusion stage that serves as an input to the final image generation stage and is stitched with an up-sampled result. Because the two feature maps to be stitched differ in size (A is 136×136 while the map it is stitched to is 120×120), (136-120)/2 = 8 pixels are trimmed from each of the four sides of A, reducing it to 120×120, denoted A'. B and B', C and C' are defined analogously. Finally, a convolution operation is applied to the feature map with resolution 456×456 and 1152 channels, restoring the channel count to 3 while preserving the high resolution, to output the high-resolution infrared image.
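The symmetric trimming that turns A into A' can be sketched as follows (PyTorch assumed; the channel count of A is a placeholder, since fig. 4 does not fix it here):

```python
import torch

def center_crop(x, size):
    """Trim a feature map's borders symmetrically, e.g. 136x136 -> 120x120
    removes (136 - 120) / 2 = 8 pixels from each of the four sides."""
    h, w = x.shape[-2:]
    top = (h - size) // 2
    left = (w - size) // 2
    return x[..., top:top + size, left:left + size]

a = torch.randn(1, 128, 136, 136)   # fusion feature map A (channels assumed)
a_prime = center_crop(a, 120)       # A' in fig. 4
assert a_prime.shape[-2:] == (120, 120)
```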
According to the training method of the image processing model of this embodiment, the trained image processing model allows a low-resolution infrared image acquired at low cost to be combined with a visible light image to obtain a high-resolution infrared image, saving cost while expanding the information of the infrared image; and through multiple levels of convolution, pooling and up-sampling, both the detail features and the overall features of the image are taken into account, the accuracy of the output high-resolution infrared image is ensured, and the resolution of the infrared image is improved accurately and efficiently.
Example 2
The present embodiment provides an image processing method implemented based on the image processing model obtained by the training method in embodiment 1. As shown in fig. 5, the image processing method includes the following steps:
s3, acquiring a first target infrared image and a target visible light image under the same target visual angle;
s4, inputting the first target infrared image and the target visible light image into an image processing model to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
According to the image processing method, through the image processing model, the first target infrared image acquired at low cost can be combined with the target visible light image to obtain the second target infrared image with high resolution, so that the cost is saved, the information of the infrared image is expanded, the accuracy of the output high-resolution infrared image is ensured, and further the resolution of the infrared image is accurately and efficiently improved.
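As a hedged illustration of steps S3 and S4 (PyTorch assumed; a stub stands in for the trained image processing model, whose parameters a real deployment would load from training):

```python
import torch
import torch.nn.functional as F

class StubModel(torch.nn.Module):
    """Stand-in for the trained image processing model; for illustration it
    merely up-samples the infrared input to the visible image's size."""
    def forward(self, ir_low, visible):
        return F.interpolate(ir_low, size=visible.shape[-2:])

model = StubModel().eval()
ir_low = torch.randn(1, 3, 68, 68)      # first target infrared image (S3)
visible = torch.randn(1, 3, 572, 572)   # target visible light image (S3)
with torch.no_grad():
    ir_high = model(ir_low, visible)    # second target infrared image (S4)
assert ir_high.shape[-2] > ir_low.shape[-2]  # resolution increased
```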
Example 3
The present embodiment provides a training system for an image processing model, as shown in fig. 6, including:
the sample image acquisition module 1 is used for acquiring a plurality of groups of preset sample images;
each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image under the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image;
the model training module 2 is configured to train a preset network model based on each set of preset sample images to obtain an image processing model.
In one embodiment, as shown in fig. 7, the model training module 2 includes:
a simulated image output unit 21 for inputting the first sample infrared image and the sample visible light image of each group into a preset network model and outputting a simulated infrared image;
wherein the resolution of the simulated infrared image is higher than the resolution of the first sample infrared image;
a loss calculation unit 22, configured to calculate a loss value of the simulated infrared image compared to the corresponding second sample infrared image;
the parameter updating unit 23 is configured to update model parameters of a preset network model based on the loss value until the loss value meets a preset loss condition, so as to obtain an image processing model.
In an embodiment, the analog image output unit 21 is further configured to:
for each group of preset sample images input into a preset network model, acquiring a plurality of infrared feature images under a first resolution according to the first sample infrared images;
acquiring a plurality of optical feature images under a second resolution according to the sample visible light image;
splicing each infrared characteristic image with one optical characteristic image to obtain a plurality of fusion characteristic images under a third resolution;
the resolution difference between the spliced infrared feature images and the optical feature images is smaller than a first preset resolution difference;
and acquiring a simulated infrared image according to the fusion feature maps.
In one embodiment, the preset network model includes an infrared feature extraction module, an optical feature extraction module, a stitching module, and an image reconstruction module;
the infrared characteristic extraction module is used for obtaining an infrared characteristic diagram;
the optical characteristic extraction module is used for obtaining an optical characteristic diagram;
the splicing module is used for splicing the infrared characteristic diagram and the optical characteristic diagram;
the image reconstruction module is used for acquiring a simulated infrared image.
In one embodiment, the predetermined network model is a convolutional neural network model;
The analog image output unit 21 is also configured to:
alternately performing convolution processing and up-sampling processing on the infrared image of the first sample to obtain a plurality of infrared feature images under the first resolution;
and/or,
alternately carrying out convolution treatment and pooling treatment on the sample visible light image to obtain a plurality of optical characteristic images under the second resolution;
and/or,
and alternately carrying out convolution processing and up-sampling processing on the fusion characteristic diagram to obtain a simulated infrared image.
In an embodiment, the analog image output unit 21 is further configured to:
alternately carrying out convolution processing and up-sampling processing on one fusion feature image with the minimum resolution, splicing the other fusion feature image after each up-sampling processing, and sequentially traversing the processing according to the sequence from small to large resolution until traversing all the fusion feature images to obtain a simulated infrared image;
the resolution difference between the two fusion feature images spliced at each time is smaller than a second preset resolution difference.
In an embodiment, the analog image output unit 21 is further configured to:
cutting and/or expanding at least one of the spliced infrared characteristic images and optical characteristic images to make the resolutions of the infrared characteristic images and the optical characteristic images equal;
and/or,
and cutting and/or expanding at least one of the two spliced fusion feature images to make the resolutions of the two fusion feature images equal.
Since the training system of the image processing model provided in this embodiment is the same as the training method of the image processing model provided in embodiment 1, the description thereof will not be repeated here.
According to the training system of the image processing model of this embodiment, the trained image processing model allows a low-resolution infrared image acquired at low cost to be combined with a visible light image to obtain a high-resolution infrared image, saving cost while expanding the information of the infrared image; and through multiple levels of convolution, pooling and up-sampling, both the detail features and the overall features of the image are taken into account, the accuracy of the output high-resolution infrared image is ensured, and the resolution of the infrared image is improved accurately and efficiently.
Example 4
The present embodiment provides an image processing system implemented based on the image processing model obtained by the training system in embodiment 3. As shown in fig. 8, the image processing system comprises:
the target image acquisition module 3 is used for acquiring a first target infrared image and a target visible light image under the same target visual angle;
An image processing module 4 for inputting the first target infrared image and the target visible light image into an image processing model to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
Since the image processing system provided in this embodiment is the same as the image processing method provided in embodiment 2, the description thereof will not be repeated here.
According to the image processing system of this embodiment, through the image processing model, the first target infrared image acquired at low cost can be combined with the target visible light image to obtain a high-resolution second target infrared image, which saves cost, expands the information of the infrared image, and ensures the accuracy of the output high-resolution infrared image, thereby improving the resolution of the infrared image accurately and efficiently.
Example 5
The embodiment provides an electronic device, and fig. 9 is a schematic block diagram of the electronic device. The electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the training method of the image processing model of embodiment 1 and the image processing method of embodiment 2 when executing the program. The electronic device 30 shown in fig. 9 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 9, the electronic device 30 may be embodied in the form of a general purpose computing device, which may be a server device, for example. Components of electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, a bus 33 connecting the different system components, including the memory 32 and the processor 31.
The bus 33 includes a data bus, an address bus, and a control bus.
Memory 32 may include volatile memory such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
Memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The processor 31 executes various functional applications and data processing, such as the training method of the image processing model of embodiment 1 and the image processing method of embodiment 2 of the present disclosure, by executing a computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., keyboard, pointing device, etc.). Such communication may be through an input/output (I/O) interface 35. Also, the electronic device 30 may communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the internet, via a network adapter 36. As shown in fig. 9, the network adapter 36 communicates with the other modules of the electronic device 30 via the bus 33. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 30, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present disclosure. Conversely, the features and functions of one unit/module described above may be further divided into ones that are embodied by a plurality of units/modules.
Example 6
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the training method of the image processing model of embodiment 1 and the image processing method of embodiment 2.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible embodiment, the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the training method implementing the image processing model of embodiment 1 and the image processing method of embodiment 2 when the program product is run on the terminal device.
The program code for carrying out the present disclosure may be written in any combination of one or more programming languages, and the program code may execute entirely on the user device, partly on the user device as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the present disclosure have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and the scope of the disclosure is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the disclosure, but such changes and modifications fall within the scope of the disclosure.

Claims (12)

1. A method of training an image processing model, the method comprising:
acquiring a plurality of groups of preset sample images;
each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image under the same sample visual angle, and the resolution of the first sample infrared image is lower than that of the second sample infrared image and that of the sample visible light image;
training a preset network model based on each set of preset sample images to obtain the image processing model.
2. The method of training an image processing model according to claim 1, wherein the step of training a preset network model based on each set of the preset sample images to obtain the image processing model comprises:
Inputting the first sample infrared image and the sample visible light image of each group into the preset network model, and outputting a simulated infrared image;
wherein the resolution of the simulated infrared image is higher than the resolution of the first sample infrared image;
calculating a loss value of the simulated infrared image compared with the corresponding second sample infrared image;
and updating model parameters of the preset network model based on the loss value until the loss value meets a preset loss condition to obtain the image processing model.
3. The method of training an image processing model as claimed in claim 2, wherein the step of inputting the first sample infrared image and the sample visible light image of each group into the preset network model and outputting a simulated infrared image comprises:
for each group of preset sample images input into the preset network model, acquiring a plurality of infrared feature images under a first resolution according to the first sample infrared images;
acquiring a plurality of optical feature images under a second resolution according to the sample visible light image;
splicing each infrared characteristic image with one optical characteristic image to obtain a plurality of fusion characteristic images under a third resolution;
The resolution difference between the spliced infrared feature images and the optical feature images is smaller than a first preset resolution difference;
and acquiring the simulated infrared image according to a plurality of the fusion feature images.
4. The method for training an image processing model according to claim 3, wherein the preset network model comprises an infrared feature extraction module, an optical feature extraction module, a stitching module and an image reconstruction module;
the infrared characteristic extraction module is used for acquiring the infrared characteristic map;
the optical characteristic extraction module is used for acquiring the optical characteristic map;
the splicing module is used for splicing the infrared characteristic image and the optical characteristic image;
the image reconstruction module is used for acquiring the simulated infrared image.
5. The method for training an image processing model according to claim 3, wherein the preset network model is a convolutional neural network model;
the step of acquiring a plurality of infrared feature images under the first resolution according to the first sample infrared image comprises the following steps:
alternately performing convolution processing and upsampling processing on the first sample infrared image to obtain a plurality of infrared feature images under the first resolution;
and/or,
the step of obtaining a plurality of optical feature images under the second resolution according to the sample visible light image comprises the following steps:
alternately carrying out convolution processing and pooling processing on the sample visible light image to obtain a plurality of optical feature images under the second resolution;
and/or,
the step of acquiring the simulated infrared image according to the fusion feature maps comprises the following steps:
and alternately carrying out convolution processing and up-sampling processing on the fusion characteristic map so as to acquire the simulated infrared image.
6. The method of training an image processing model according to claim 5, wherein the step of alternately performing convolution processing and upsampling processing on the fused feature maps to acquire the simulated infrared image comprises:
alternately performing convolution processing and upsampling processing on the fused feature map with the smallest resolution, splicing another fused feature map after each upsampling step, and traversing the fused feature maps in order from small to large resolution until all of them have been traversed, so as to obtain the simulated infrared image;
wherein the resolution difference between the two fused feature maps spliced at each step is smaller than a second preset resolution difference.
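A hedged sketch of this traversal (PyTorch assumed; `convs` is an invented list of convolution layers whose channel counts would need to match the fused maps being passed in):

```python
import torch
import torch.nn.functional as F

def reconstruct(fused_maps, convs):
    # Traverse the fused feature maps from the smallest to the largest resolution,
    # alternating convolution and 2x upsampling, and splicing (concatenating)
    # the next-larger fused map after every upsampling step.
    fused_maps = sorted(fused_maps, key=lambda f: f.shape[-1])  # small -> large
    x = fused_maps[0]
    for skip, conv in zip(fused_maps[1:], convs):
        x = conv(x)
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        x = torch.cat([x, skip], dim=1)  # splice the next fused feature map
    return x  # a final convolution would map this to the simulated IR image
```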
7. The method of training an image processing model according to claim 6, wherein the step of splicing each infrared feature map with one optical feature map further comprises:
cropping and/or expanding at least one of the infrared feature map and the optical feature map to be spliced, so that the resolution of the infrared feature map is equal to that of the optical feature map;
and/or,
the step of splicing another fused feature map after each upsampling step further comprises:
cropping and/or expanding at least one of the two fused feature maps to be spliced, so that the resolutions of the two fused feature maps are equal.
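One way to realize this cropping/expanding step, as a sketch (PyTorch assumed; zero-padding is used for "expanding", which the claim does not mandate):

```python
import torch.nn.functional as F

def match_resolution(feat, target):
    # Crop and/or pad `feat` so its spatial size equals that of `target`
    # before the two feature maps are spliced (concatenated).
    dh = target.shape[-2] - feat.shape[-2]
    dw = target.shape[-1] - feat.shape[-1]
    if dh > 0 or dw > 0:  # expand: zero-pad right/bottom where feat is smaller
        feat = F.pad(feat, (0, max(dw, 0), 0, max(dh, 0)))
    if dh < 0 or dw < 0:  # crop where feat is larger
        feat = feat[..., : target.shape[-2], : target.shape[-1]]
    return feat
```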
8. An image processing method implemented based on the image processing model obtained by the training method according to any one of claims 1 to 7, characterized in that the image processing method comprises:
acquiring a first target infrared image and a target visible light image at the same target viewing angle;
inputting the first target infrared image and the target visible light image into the image processing model to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
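A minimal inference sketch for this method (PyTorch assumed; `model` stands for a trained instance of the image processing model, and image loading and preprocessing into tensors are omitted):

```python
import torch

def enhance_infrared(model, first_target_ir, target_vis):
    # Feed the low-res target IR image and the visible-light image captured at
    # the same viewing angle through the trained model to obtain the
    # higher-resolution second target infrared image.
    model.eval()
    with torch.no_grad():
        second_target_ir = model(first_target_ir, target_vis)
    return second_target_ir
```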
9. A training system for an image processing model, characterized in that the training system comprises:
a sample image acquisition module, used for acquiring a plurality of groups of preset sample images;
wherein each group of preset sample images comprises a first sample infrared image, a second sample infrared image and a sample visible light image at the same sample viewing angle, and the resolution of the first sample infrared image is lower than the resolutions of the second sample infrared image and the sample visible light image;
and a model training module, used for training a preset network model based on each group of the preset sample images to obtain the image processing model.
10. An image processing system implemented based on the image processing model obtained by the training system according to claim 9, characterized in that the image processing system comprises:
a target image acquisition module, used for acquiring a first target infrared image and a target visible light image at the same target viewing angle;
an image processing module, used for inputting the first target infrared image and the target visible light image into the image processing model so as to output a second target infrared image;
wherein the resolution of the second target infrared image is higher than the resolution of the first target infrared image.
11. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the training method of the image processing model according to any one of claims 1 to 7 and the image processing method according to claim 8.
12. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the training method of the image processing model according to any one of claims 1 to 7 and the image processing method according to claim 8.
CN202310230060.1A 2023-03-06 2023-03-06 Training method of image processing model, image processing method, device and medium Pending CN116168272A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310230060.1A CN116168272A (en) 2023-03-06 2023-03-06 Training method of image processing model, image processing method, device and medium

Publications (1)

Publication Number Publication Date
CN116168272A 2023-05-26

Family

ID=86418341


Country Status (1)

Country Link
CN (1) CN116168272A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination