CN113947555A - Infrared and visible light fused visual system and method based on deep neural network


Publication number
CN113947555A
Authority
CN
China
Prior art keywords
image
fusion
visible light
layer
infrared
Legal status
Pending
Application number
CN202111126389.0A
Other languages
Chinese (zh)
Inventor
申强
杨晓望
黎斌
张洁
王晨宇
李波
杨新晖
程鹏
杨阳
孙元超
赵凡墨
陈稷
Current Assignee
Xixian New District Power Supply Company State Grid Shaanxi Electric Power Co
Original Assignee
Xixian New District Power Supply Company State Grid Shaanxi Electric Power Co
Application filed by Xixian New District Power Supply Company State Grid Shaanxi Electric Power Co
Priority application: CN202111126389.0A
Publication: CN113947555A

Classifications

    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06F 17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F 17/12 Simultaneous equations, e.g. systems of linear equations
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/25 Fusion techniques
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q 50/06 Electricity, gas or water supply
    • G06T 5/10 Image enhancement or restoration by non-spatial domain filtering
    • G06T 5/20 Image enhancement or restoration by the use of local operators
    • G06T 5/70
    • G06T 5/94
    • G06T 7/269 Analysis of motion using gradient-based methods
    • G06T 2207/10048 Infrared image
    • G06T 2207/20221 Image fusion; Image merging

Abstract

The invention provides a vision method based on the fusion of infrared and visible light with a deep neural network, relating to the technical field of power system monitoring. The method comprises the following steps: acquiring an infrared camera image and a visible light camera image through an inspection platform; fusing the infrared camera image and the visible light camera image to obtain a fused image; measuring the temperature of salient regions of the fused image; and identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network. The invention also provides a vision system based on the fusion of infrared and visible light with a deep neural network. The invention computes different fusion weights for the infrared and visible light images and fuses them according to those weights, improves temperature measurement accuracy with a combined two-point + neural network + temporal high-pass filtering non-uniformity correction algorithm, and improves target recognition accuracy with a target detection technique based on a deep neural network.

Description

Infrared and visible light fused visual system and method based on deep neural network
Technical Field
The invention relates to the technical field of power system monitoring, in particular to an infrared and visible light fusion visual system and method based on a deep neural network.
Background
As power systems grow in scale, the requirements on the safe operation and power supply reliability of power lines become ever higher. To ensure the normal operation of the power system, power inspection is one of the important daily tasks of power practitioners. In the conventional manual inspection mode, operation and maintenance personnel must complete the inspection of each piece of equipment in sequence according to a planned inspection route, which consumes a large amount of manpower and time.
The workflow of an intelligent unattended inspection image AI processing system generally comprises three parts: refined image acquisition at the front end, data return, and back-end data processing. Ensuring the validity of the acquired image data and the efficiency of the back-end processing is an important measure of the performance of an intelligent unattended power inspection system. In daily inspection, the priority is to quickly find detailed problems of equipment such as lines and towers and to monitor, identify and give early warning of the state of the power system, in particular disaster early warning in severe environments. During field operation, however, external forces such as strong wind, strong light and signal interference often degrade the inspection accuracy of the unmanned aerial vehicle.
In existing outdoor power inspection under severe conditions, inspection personnel must cross mountainous terrain, which ties up a great deal of manpower with low overall efficiency. An important part of the inspection workflow is managing all the data collected on site and refining it into useful information. Data is not equivalent to information: data may be massive, but information must be condensed and generalizable. Under current technical conditions, real-time transmission of large data samples over a wireless link is not yet mature.
Disclosure of Invention
To overcome the problems in the prior art, the invention provides a vision system and method based on the fusion of infrared and visible light with a deep neural network. Based on an unmanned aerial vehicle platform carrying multiple payloads such as infrared, visible light and multispectral sensors, the state of the power system can be monitored, identified and warned about in real time, greatly increasing the success rate of disaster early warning in severe environments.
The technical scheme of the invention is as follows:
the infrared and visible light fusion vision method based on the deep neural network comprises the following steps:
acquiring an infrared camera image and a visible light camera image through an inspection platform;
carrying out image fusion on the infrared camera image and the visible light camera image to obtain a fused image;
carrying out salient region temperature measurement on the fused image;
and identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network.
The further technical scheme of the invention is that the infrared camera image and the visible light camera image are subjected to image fusion to obtain a fused image; the method specifically comprises the following steps:
carrying out image preprocessing on the acquired visible light and infrared camera images;
and calculating an entropy value of the visible light image, taking the visible light image as a fusion image when the entropy value is higher than a threshold value, and finishing the fusion of the visible light image and the infrared image by adopting a VGG-19 deep neural network to obtain the fusion image when the entropy value is lower than the threshold value.
A further technical scheme of the invention is that the fusion of the visible light image and the infrared image is completed with the VGG-19 deep neural network to obtain a fused image. Specifically, the VGG-19 deep neural network is used to compute salient feature maps of the images in the different wave bands, so that the final fused image retains the salient features of each source image, comprising the following steps:
image scale segmentation: carrying out scale decomposition on the visible light image to form a visible light base layer and a visible light detail layer, and carrying out scale decomposition on the infrared image to form an infrared base layer and an infrared detail layer; the method specifically comprises the following steps:
The original image is divided into two layers, a base layer and a detail layer. Assuming the source image is $I_k$, the base layer image is $I_k^b$ and the detail layer image is $I_k^d$. The base layer image is obtained by solving the following optimization problem:

$$I_k^b = \arg\min_{I_k^b} \left\| I_k - I_k^b \right\|_F^2 + \lambda \left( \left\| g_x * I_k^b \right\|_F^2 + \left\| g_y * I_k^b \right\|_F^2 \right)$$

where $g_x = [-1, 1]$ and $g_y = [-1, 1]^T$ are the horizontal and vertical gradient operators and $\lambda$ is a regularization parameter. The detail layer image is equal to the source image minus the base layer image:

$$I_k^d = I_k - I_k^b$$
and (3) base layer image fusion: performing base layer fusion on the visible light detail layer and the infrared base layer; the method specifically comprises the following steps:
The base layer images contain the common features and redundant information of the source images, and base layer image fusion is performed with an average weighting strategy:

$$F_b(x, y) = \alpha_1 I_1^b(x, y) + \alpha_2 I_2^b(x, y)$$

where $(x, y)$ denotes the position of the corresponding pixel in the image and $\alpha_1$ and $\alpha_2$ are the fusion weights.
detail layer image fusion: carrying out detail layer fusion on the visible light detail layer and the infrared detail layer through a deep neural network; the method specifically comprises the following steps:
For the detail layer images $I_1^d$ and $I_2^d$, the fusion weights are computed with the deep learning network VGG-19: deep features are extracted with the VGG-19 network, weight maps are obtained through a multi-layer fusion strategy, and the final detail layer fusion result is obtained from the resulting weights and the detail layer images. The multi-layer fusion strategy is as follows.

The deep features of the detail layer images are written as

$$\phi_k^{i, m} = \Phi_i\!\left(I_k^d\right)$$

where $\phi_k^{i, m}$ denotes the feature map of the k-th detail layer image at layer i, m is the channel index of layer i, and $\Phi_i(\cdot)$ denotes layer i of the VGG network, with $i \in \{1, 2, 3, 4\}$ corresponding to relu_1_1, relu_2_1, relu_3_1 and relu_4_1. At each position $(x, y)$, $\phi_k^{i, 1:M}(x, y)$ is an M-dimensional vector over the channels of layer i. Its L1 norm serves as the activity level measure of the source image detail layer, so the initial activity level map is obtained by:

$$C_k^i(x, y) = \left\| \phi_k^{i, 1:M}(x, y) \right\|_1$$
The final activity level map is computed with a block-based averaging operator; $\hat{C}_k^i$ can be obtained from the following equation:

$$\hat{C}_k^i(x, y) = \frac{\sum_{\beta = -\gamma}^{\gamma} \sum_{\theta = -\gamma}^{\gamma} C_k^i(x + \beta, y + \theta)}{(2\gamma + 1)^2}$$

where $\gamma$ denotes the block size.
Once the activity level maps $\hat{C}_k^i$ are acquired, the initial weight maps are obtained by the soft-max function:

$$W_k^i(x, y) = \frac{\hat{C}_k^i(x, y)}{\sum_{n=1}^{K} \hat{C}_n^i(x, y)}$$

where K denotes the number of activity level maps (K is set to 2) and $W_k^i(x, y)$ lies in the range (0, 1). The pooling operator of the VGG network is a down-sampling operation that reduces the image by a factor of 1/s at each stage; with s = 2 the stride of the pooling layers in the VGG network is 2, so the weight map at layer i is $1/2^{\,i-1}$ times the size of the detail layer image, with $i \in \{1, 2, 3, 4\}$ denoting the four layers relu_1_1, relu_2_1, relu_3_1 and relu_4_1. After the initial weight maps are obtained they are up-sampled, enlarging the small weight maps to the same size as the detail layer images, which yields four pairs of weight maps $\hat{W}_k^i$, $i \in \{1, 2, 3, 4\}$.
For each pair of weight maps, the detail layer fusion result at layer i is:

$$F_d^i(x, y) = \sum_{n=1}^{K} \hat{W}_n^i(x, y) \, I_n^d(x, y)$$

Finally, the fused detail layer image is obtained by:

$$F_d(x, y) = \max\left[ F_d^i(x, y) \mid i \in \{1, 2, 3, 4\} \right]$$

that is, the maximum over the four layers is taken as the final fused detail layer image.
Final image fusion: the fused base layer and the fused detail layer are combined to form the fused image. Specifically, the fused base layer image and the fused detail layer image are added to obtain the final fused image:

$$F(x, y) = F_b(x, y) + F_d(x, y)$$

where $F_b$ denotes the fused base layer image and $F_d$ denotes the fused detail layer image.
Further, the value of γ is 1.
Further, $\alpha_1$ and $\alpha_2$ are each equal to 0.5.
A further technical scheme of the invention is that salient-region temperature measurement is carried out on the fused image. Specifically: salient regions are extracted from the fused image through superpixel segmentation, their temperatures are measured and screened, and an early warning with accompanying information is issued for any salient region whose temperature exceeds a set value; at the same time, smoke detection based on a deep learning model is performed on such regions to determine whether a fire has broken out there.
As a further technical solution of the present invention, detecting smoke within the image field of view with a deep-learning-based model to determine whether the area is on fire specifically comprises:
performing characteristic training on smoke generated in a fire by adopting a deep learning-based model;
adopting a generated countermeasure network to carry out early-stage training of smoke detection, and inputting the trained weighted value as a judgment parameter into an early warning system;
when smoke produced by a fire is detected, an alarm is uploaded to the early-warning terminal over the wireless network, indicating that a fire has broken out in the area.
Extracting salient regions from the fused image through superpixel segmentation, measuring and screening their temperatures, and issuing an early warning with accompanying information for any salient region whose temperature exceeds a set value specifically comprises: measuring the temperature of the salient regions with a combined two-point + neural network + temporal high-pass filtering non-uniformity correction algorithm and a colorimetric temperature measurement with ratio-inversion high-order fitting algorithm;
First, images of a high-temperature and a low-temperature blackbody are acquired and the correction gain and offset coefficients are solved with the two-point method; these serve as the initial values of the correction coefficients for the improved neural network method. The two-point method is computed as:

$$G(i, j) = \frac{\overline{x(T_H)} - \overline{x(T_L)}}{x_{i,j}(T_H) - x_{i,j}(T_L)}$$

$$O(i, j) = \overline{x(T_H)} - G(i, j) \, x_{i,j}(T_H)$$

where the response outputs of the high-temperature and low-temperature blackbody images are $x(T_H)$ and $x(T_L)$, their mean output responses are $\overline{x(T_H)}$ and $\overline{x(T_L)}$, and G and O denote the gain coefficient and offset coefficient, respectively.
On the basis of the two-point method, the neural network method takes the neighborhood-weighted average of the output response as the desired output of an iterative algorithm, and iterates repeatedly until the correction gain and offset coefficients converge. The neural network method is computed as:

$$f_n(i, j) = \frac{1}{4}\left[ y_n(i-1, j) + y_n(i+1, j) + y_n(i, j-1) + y_n(i, j+1) \right]$$

$$G_{n+1}(i, j) = G_n(i, j) + 2\mu\left[ f_n(i, j) - y_n(i, j) \right] x_n(i, j)$$

$$O_{n+1}(i, j) = O_n(i, j) + 2\mu\left[ f_n(i, j) - y_n(i, j) \right]$$

where $f_n(i, j)$ is the desired (expected) output, $y_n(i, j) = G_n(i, j)\, x_n(i, j) + O_n(i, j)$ is the corrected output, $\mu$ is the iteration step size, and $G_{n+1}(i, j)$ and $O_{n+1}(i, j)$ are the iterated gain and offset coefficients, respectively.
On the basis of the neural network method, the offset coefficient is updated with a temporal high-pass filtering algorithm without changing the gain coefficient. The temporal high-pass filtering algorithm is computed as:

$$f(n) = \frac{x(n)}{M} + \left(1 - \frac{1}{M}\right) f(n-1)$$

$$y(n) = x(n) - f(n)$$

where $x(n)$ denotes the n-th frame output by the focal plane, $f(n)$ denotes the low-frequency part of the input image, $M$ is the filtering time constant, and $y(n)$ denotes the corrected output image.
The colorimetric temperature measurement technique determines the true temperature of the measured target by establishing a functional relation between the temperature and the ratio of the target's spectral radiance at two wavelengths. According to Planck's law of blackbody radiation, the spectral radiance of an actual object can be expressed as:

$$M(\lambda, T) = \varepsilon(\lambda, T)\, \frac{c_1}{\lambda^5}\, \frac{1}{e^{c_2 / (\lambda T)} - 1}$$

where M is the spectral radiance, T is the temperature of the object, $\varepsilon(\lambda, T)$ is the emissivity, and $c_1$ and $c_2$ are the first and second radiation constants. At temperature T, the ratio of the radiances corresponding to the central wavelengths $\lambda_1$ and $\lambda_2$ is:

$$R = \frac{M(\lambda_1, T)}{M(\lambda_2, T)} = \frac{\varepsilon(\lambda_1, T)}{\varepsilon(\lambda_2, T)} \left(\frac{\lambda_2}{\lambda_1}\right)^5 \frac{e^{c_2 / (\lambda_2 T)} - 1}{e^{c_2 / (\lambda_1 T)} - 1}$$

where R is the ratio of the radiances.
as can be seen from the above formula, for the colorimetric temperature measurement system, T and R form a single-value relationship, and T can be obtained as long as R is obtained;
To improve the colorimetric temperature measurement accuracy, a high-order fitting algorithm is used to obtain the fitting coefficients of T and R. The relation between T and R is set as:

$$T = \sum_{i=0}^{n} a_i R^i = a_0 + a_1 R + a_2 R^2 + \cdots + a_n R^n$$

Given m calibration pairs $(R_j, T_j)$, the coefficients are computed such that the fitting error

$$\delta = \sum_{j=1}^{m} \left( T_j - \sum_{i=0}^{n} a_i R_j^i \right)^2$$

is minimized. Setting the partial derivatives of $\delta$ with respect to $a_0, a_1, \ldots, a_n$ to zero yields the following system of linear equations:

$$\sum_{j=1}^{m} \left( T_j - \sum_{i=0}^{n} a_i R_j^i \right) R_j^l = 0, \qquad l = 0, 1, \ldots, n$$

For $a_0, a_1, \ldots, a_n$, the polynomial fitting coefficients of T and R are obtained by solving this system of linear equations.
As a further technical solution of the present invention, identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network specifically comprises:

The generative adversarial network consists of two models, a generative model G and a discriminative model D. Random noise z is passed through G to produce a sample G(z) that follows the real data distribution Pdata as closely as possible, while the discriminative model D judges whether an input sample is real data x or generated data G(z); G and D are nonlinear mapping functions.

The specific algorithm of the generative adversarial network is as follows. First, the discriminator is optimized with the generator held fixed; the discriminator is a binary classification model, and training it is a process of minimizing a cross entropy. $E(\cdot)$ denotes the expected value, x is sampled from the real data distribution $P_{data}(x)$, and z is sampled from the prior distribution $P_z(z)$. To learn the distribution of the data x, the generator constructs a mapping $G(z; \theta_z)$ from the prior noise distribution $P_z(z)$; the corresponding discriminator mapping function is $D(x; \theta_d)$, which outputs a scalar representing the probability that x is real data. The objective is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

The loss function of the generator is defined as the cross entropy between the generator's output (after the discriminator) and 1. The loss function of the discriminator consists of two parts: 1) the cross entropy between the discriminator's output on real samples and 1; and 2) the cross entropy between its output on generator samples and 0; the discriminator loss is the sum of the two. Once the loss functions of the discriminator and the generator are obtained, an Adam optimizer is selected to optimize them.
The vision system based on the fusion of infrared and visible light with a deep neural network comprises:
the visible light and infrared image acquisition unit acquires an infrared camera image and a visible light camera image through the inspection platform;
the image fusion unit is used for carrying out image fusion on the infrared camera image and the visible light camera image to obtain a fusion image;
the temperature measuring unit is used for measuring the temperature of the salient region of the fused image;
and the identification and early warning unit, which identifies targets and raises disaster alarms with a target detection technique based on a generative adversarial network.
Further, the visible light and infrared image acquisition unit is an unmanned aerial vehicle inspection platform.
The invention has the beneficial effects that:
1. The invention provides a deep-learning-based method that computes different fusion weights for the infrared and visible light images, fuses the images according to those weights, and combines the important information of both images into a single image;
2. The method adopts a combined two-point + neural network + temporal high-pass filtering non-uniformity correction algorithm and a colorimetric temperature measurement with ratio-inversion high-order fitting algorithm, improving the temperature measurement accuracy and guaranteeing that it reaches 1%;
3. The method uses a target detection technique based on a deep neural network: with a generative adversarial network model, the generator fits the data-generating process to produce model samples, and the optimization target is a Nash equilibrium, so the generator estimates the distribution of the data samples. GAN networks are widely studied and applied in the image and vision field; because a GAN contains both a generator and a discriminator, its ability to recognize targets keeps improving during training, yielding higher target recognition accuracy.
Drawings
FIG. 1 is a flow chart of a visual method based on the fusion of infrared and visible light of a deep neural network according to the present invention;
FIG. 2 is a diagram of an image fusion framework proposed by the present invention;
FIG. 3 is a diagram of a detail layer image fusion architecture according to the present invention;
FIG. 4 is a flow chart of the generative adversarial network proposed by the present invention;
fig. 5 is a structural diagram of a visual system based on the fusion of infrared and visible light of a deep neural network.
Detailed Description
The conception, the specific structure, and the technical effects produced by the present invention will be clearly and completely described below in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the features, and the effects of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and those skilled in the art can obtain other embodiments without inventive effort based on the embodiments of the present invention, and all embodiments are within the protection scope of the present invention.
This project uses laser illumination, infrared and visible light fusion, artificial intelligence and related technologies to overcome interference from extreme weather such as heavy fog, rain, snow and haze, and to learn the degree of damage at a power site at the earliest possible moment.
Referring to fig. 1, it is a flow chart of a visual method based on the fusion of infrared and visible light of a deep neural network proposed by the present invention;
as shown in fig. 1, the infrared and visible light fusion visual method based on the deep neural network includes the following steps:
101, acquiring an infrared camera image and a visible light camera image through an inspection platform;
102, carrying out image fusion on an infrared camera image and a visible light camera image to obtain a fused image;
103, carrying out salient region temperature measurement on the fused image;
and 104, identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network. A minimal sketch of this overall pipeline is given below.
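To make the flow of steps 101 to 104 concrete, the following is a minimal Python sketch of the pipeline. The helper callables, the region dictionary fields and the temperature threshold are illustrative assumptions, not part of the patent.

```python
from typing import Callable, Dict, List
import numpy as np

def inspect_frame(ir_img: np.ndarray,
                  vis_img: np.ndarray,
                  fuse: Callable[[np.ndarray, np.ndarray], np.ndarray],
                  measure_regions: Callable[[np.ndarray], List[Dict]],
                  smoke_detector: Callable[[np.ndarray], bool],
                  temp_threshold: float = 80.0) -> List[Dict]:
    """Run one infrared/visible frame pair through steps 101-104."""
    fused = fuse(ir_img, vis_img)                        # step 102: image fusion
    alarms = []
    for region in measure_regions(fused):                # step 103: salient-region thermometry
        too_hot = region["temperature"] > temp_threshold
        if too_hot and smoke_detector(region["patch"]):  # step 104: smoke check before alarming
            alarms.append({"bbox": region["bbox"],
                           "temperature": region["temperature"],
                           "event": "possible fire"})
    return alarms
```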
In the embodiment of the invention, the infrared camera image and the visible light camera image are subjected to image fusion to obtain a fused image; the method specifically comprises the following steps:
carrying out image preprocessing on the acquired visible light and infrared camera images;
and calculating an entropy value of the visible light image, taking the visible light image as a fusion image when the entropy value is higher than a threshold value, and finishing the fusion of the visible light image and the infrared image by adopting a VGG-19 deep neural network to obtain the fusion image when the entropy value is lower than the threshold value.
First, the visible light and infrared camera images are preprocessed, including image enhancement and denoising. The entropy of the visible light image is then computed. When the entropy is above the threshold, the scene is considered to be in daytime and the visible light image is not fused with the infrared image; when the entropy is below the threshold, the scene is considered to be at night, and because the infrared image alone cannot reflect the texture and detail information of objects well, fusion of the visible light and infrared images is started, so that the fused image always keeps good detail and texture information. The fusion of the visible light and infrared images is then completed with a VGG-19 deep neural network. A minimal sketch of the entropy-based gating is given below.
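As a rough illustration of the entropy-based day/night gating described above, the following Python sketch computes the Shannon entropy of the visible-light histogram and decides whether to fuse; the threshold of 6.5 bits is an assumed example value, not taken from the patent.

```python
import numpy as np

def image_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (bits) of an 8-bit grayscale image."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def select_fusion_input(vis_gray: np.ndarray, ir_gray: np.ndarray,
                        fuse, threshold: float = 6.5) -> np.ndarray:
    if image_entropy(vis_gray) >= threshold:
        return vis_gray               # enough visible detail (daytime): use visible image alone
    return fuse(vis_gray, ir_gray)    # low entropy (e.g. night): fuse with the infrared image
```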
Referring to fig. 2, in the embodiment of the present invention a VGG-19 deep neural network is used to complete the fusion of the visible light and infrared images and obtain a fused image. Specifically, the VGG-19 deep neural network computes salient feature maps of the images in the different wave bands, so that the final fused image retains the salient features of each source image, comprising the following steps:
image scale segmentation: carrying out scale decomposition on the visible light image to form a visible light base layer and a visible light detail layer, and carrying out scale decomposition on the infrared image to form an infrared base layer and an infrared detail layer; the method specifically comprises the following steps:
The original image is divided into two layers, a base layer and a detail layer. Assuming the source image is $I_k$, the base layer image is $I_k^b$ and the detail layer image is $I_k^d$. The base layer image is obtained by solving the following optimization problem:

$$I_k^b = \arg\min_{I_k^b} \left\| I_k - I_k^b \right\|_F^2 + \lambda \left( \left\| g_x * I_k^b \right\|_F^2 + \left\| g_y * I_k^b \right\|_F^2 \right)$$

where $g_x = [-1, 1]$ and $g_y = [-1, 1]^T$ are the horizontal and vertical gradient operators and $\lambda$ is a regularization parameter. The detail layer image is equal to the source image minus the base layer image:

$$I_k^d = I_k - I_k^b$$
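A minimal sketch of the two-scale decomposition is given below. The quadratic smoothing problem above is approximated here with a large averaging filter; the window size of 31 pixels is an assumption for illustration only.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def two_scale_decompose(img: np.ndarray, size: int = 31):
    """Split an image into an approximate base layer I_k^b and a detail layer I_k^d."""
    img = img.astype(np.float64)
    base = uniform_filter(img, size=size)   # smooth approximation of the optimization above
    detail = img - base                     # I_k^d = I_k - I_k^b
    return base, detail
```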
and (3) base layer image fusion: performing base layer fusion on the visible light detail layer and the infrared base layer; the method specifically comprises the following steps:
The base layer images contain the common features and redundant information of the source images, and base layer image fusion is performed with an average weighting strategy:

$$F_b(x, y) = \alpha_1 I_1^b(x, y) + \alpha_2 I_2^b(x, y)$$

where $(x, y)$ denotes the position of the corresponding pixel in the image and $\alpha_1$ and $\alpha_2$ are the fusion weights.
referring to fig. 3, detail layer image fusion: carrying out detail layer fusion on the visible light detail layer and the infrared detail layer through a deep neural network; the method specifically comprises the following steps:
For the detail layer images $I_1^d$ and $I_2^d$, the fusion weights are computed with the deep learning network VGG-19: deep features are extracted with the VGG-19 network, weight maps are obtained through a multi-layer fusion strategy, and the final detail layer fusion result is obtained from the resulting weights and the detail layer images. The multi-layer fusion strategy is as follows.

The deep features of the detail layer images are written as

$$\phi_k^{i, m} = \Phi_i\!\left(I_k^d\right)$$

where $\phi_k^{i, m}$ denotes the feature map of the k-th detail layer image at layer i, m is the channel index of layer i, and $\Phi_i(\cdot)$ denotes layer i of the VGG network, with $i \in \{1, 2, 3, 4\}$ corresponding to relu_1_1, relu_2_1, relu_3_1 and relu_4_1. At each position $(x, y)$, $\phi_k^{i, 1:M}(x, y)$ is an M-dimensional vector over the channels of layer i. Its L1 norm serves as the activity level measure of the source image detail layer, so the initial activity level map is obtained by:

$$C_k^i(x, y) = \left\| \phi_k^{i, 1:M}(x, y) \right\|_1$$
The final activity level map is computed with a block-based averaging operator; $\hat{C}_k^i$ can be obtained from the following equation:

$$\hat{C}_k^i(x, y) = \frac{\sum_{\beta = -\gamma}^{\gamma} \sum_{\theta = -\gamma}^{\gamma} C_k^i(x + \beta, y + \theta)}{(2\gamma + 1)^2}$$

where $\gamma$ denotes the block size.
Once the activity level maps $\hat{C}_k^i$ are acquired, the initial weight maps are obtained by the soft-max function:

$$W_k^i(x, y) = \frac{\hat{C}_k^i(x, y)}{\sum_{n=1}^{K} \hat{C}_n^i(x, y)}$$

where K denotes the number of activity level maps (K is set to 2) and $W_k^i(x, y)$ lies in the range (0, 1). The pooling operator of the VGG network is a down-sampling operation that reduces the image by a factor of 1/s at each stage; with s = 2 the stride of the pooling layers in the VGG network is 2, so the weight map at layer i is $1/2^{\,i-1}$ times the size of the detail layer image, with $i \in \{1, 2, 3, 4\}$ denoting the four layers relu_1_1, relu_2_1, relu_3_1 and relu_4_1. After the initial weight maps are obtained they are up-sampled, enlarging the small weight maps to the same size as the detail layer images, which yields four pairs of weight maps $\hat{W}_k^i$, $i \in \{1, 2, 3, 4\}$.
For each pair of weight maps, the detail layer fusion result at layer i is:

$$F_d^i(x, y) = \sum_{n=1}^{K} \hat{W}_n^i(x, y) \, I_n^d(x, y)$$

Finally, the fused detail layer image is obtained by:

$$F_d(x, y) = \max\left[ F_d^i(x, y) \mid i \in \{1, 2, 3, 4\} \right]$$

that is, the maximum over the four layers is taken as the final fused detail layer image.
Final image fusion: the fused base layer and the fused detail layer are combined to form the fused image. Specifically, the fused base layer image and the fused detail layer image are added to obtain the final fused image:

$$F(x, y) = F_b(x, y) + F_d(x, y)$$

where $F_b$ denotes the fused base layer image and $F_d$ denotes the fused detail layer image.
Further, the value of γ is 1. If γ becomes larger, the fused image will become more dynamic, but image details are easily lost, so γ is selected to be 1.
Further, $\alpha_1$ and $\alpha_2$ are each equal to 0.5. To preserve more of the common information and attenuate redundant information, $\alpha_1$ and $\alpha_2$ are both set to 0.5.
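The following compact PyTorch sketch illustrates the multi-layer detail fusion strategy described above, assuming torchvision's pretrained VGG-19; the feature indices 1, 6, 11 and 20 are assumed to correspond to relu1_1, relu2_1, relu3_1 and relu4_1, and input normalization is omitted for brevity.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

RELU_IDX = [1, 6, 11, 20]   # relu1_1, relu2_1, relu3_1, relu4_1 in vgg19().features

def fuse_detail_layers(d1: torch.Tensor, d2: torch.Tensor, gamma: int = 1) -> torch.Tensor:
    """d1, d2: 1x1xHxW detail layers; returns the fused detail layer F_d."""
    vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
    candidates = []
    with torch.no_grad():
        for idx in RELU_IDX:
            acts = []
            for d in (d1, d2):
                x = d.repeat(1, 3, 1, 1)                 # VGG expects 3 input channels
                feat = vgg[: idx + 1](x)                 # features at relu_i_1
                c = feat.abs().sum(dim=1, keepdim=True)  # L1 norm over channels -> C_k^i
                c = F.avg_pool2d(c, kernel_size=2 * gamma + 1, stride=1, padding=gamma)
                acts.append(c)
            c1, c2 = acts
            w1 = c1 / (c1 + c2 + 1e-12)                  # soft-max weight W_1^i
            w2 = 1.0 - w1
            # upsample weights back to the detail-layer resolution (layer i is 1/2^(i-1) size)
            w1 = F.interpolate(w1, size=d1.shape[-2:], mode="nearest")
            w2 = F.interpolate(w2, size=d2.shape[-2:], mode="nearest")
            candidates.append(w1 * d1 + w2 * d2)         # F_d^i
    return torch.max(torch.stack(candidates), dim=0).values  # F_d = max over the four layers
```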
In the embodiment of the invention, the technology for fusing infrared and visible light multi-source information based on the deep neural network specifically comprises the following steps: firstly, decomposing an image into a base layer image and a detail layer image, and fusing the base layer image by adopting an average weight algorithm; secondly, for a detail layer, extracting multilayer features by using a VGG-19 multilayer neural network, then generating a candidate image of a detail fusion image by using an L1 norm and an average weight, and taking a maximum candidate image as a final detail layer fusion image; and finally, adding the base layer image and the detail layer image to form a final fusion image.
The invention provides a deep learning algorithm that computes different fusion weights for the infrared and visible light images and then fuses the images according to those weights.
Further, salient-region temperature measurement is carried out on the fused image. Specifically: salient regions are extracted from the fused image through superpixel segmentation, their temperatures are measured and screened, an early warning with accompanying information is issued for any salient region whose temperature exceeds a set value, and at the same time smoke detection based on a deep learning model is performed on such regions to determine whether a fire has broken out there.
The fused image is segmented into superpixels to extract salient regions, and the temperature of each salient region is measured. Salient regions usually contain objects prone to fire, such as power equipment and electric wires; when the temperature of one region is found to be far higher than that of the others, the system sends a message to the early warning terminal over the wireless network, indicating a fire risk in that region. A sketch of this screening step is given below.
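A hedged sketch of this superpixel screening step follows, assuming scikit-image's SLIC segmentation and a co-registered per-pixel temperature map; the threshold of 80 degrees Celsius and the output fields are illustrative assumptions.

```python
import numpy as np
from skimage.segmentation import slic

def screen_hot_regions(fused: np.ndarray, temp_map: np.ndarray,
                       n_segments: int = 300, temp_threshold: float = 80.0):
    """Return superpixel regions whose mean temperature exceeds the set value."""
    labels = slic(fused, n_segments=n_segments, compactness=10, channel_axis=None)
    alerts = []
    for lab in np.unique(labels):
        mask = labels == lab
        mean_t = float(temp_map[mask].mean())
        if mean_t > temp_threshold:
            ys, xs = np.where(mask)
            alerts.append({"bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
                           "mean_temperature": mean_t})
    return alerts
```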
While the temperature of the salient regions is measured, smoke detection is also performed on the area within the image field of view. Because smoke comes in many types and low-concentration smoke does not necessarily indicate a fire, a deep-learning-based model, a generative adversarial network, is used for the preliminary training of smoke detection in order to reduce the false-alarm rate. The trained weights are fed into the early warning system as decision parameters, and when the system detects smoke produced by a fire, an alarm is uploaded to the early warning terminal over the wireless network, indicating that a fire has broken out in the area.
Further, detecting smoke within the image field of view with a deep-learning-based model to determine whether the area is on fire specifically comprises:
performing characteristic training on smoke generated in a fire by adopting a deep learning-based model;
adopting a generated countermeasure network to carry out early-stage training of smoke detection, and inputting the trained weighted value as a judgment parameter into an early warning system;
when smoke produced by a fire is detected, an alarm is uploaded to the early-warning terminal over the wireless network, indicating that a fire has broken out in the area.
Further, extracting salient regions from the fused image through superpixel segmentation, measuring and screening their temperatures, and issuing an early warning with accompanying information for any salient region whose temperature exceeds a set value specifically comprises: measuring the temperature of the salient regions with a combined two-point + neural network + temporal high-pass filtering non-uniformity correction algorithm and a colorimetric temperature measurement with ratio-inversion high-order fitting algorithm;
First, images of a high-temperature and a low-temperature blackbody are acquired and the correction gain and offset coefficients are solved with the two-point method; these serve as the initial values of the correction coefficients for the improved neural network method. The two-point method is computed as:

$$G(i, j) = \frac{\overline{x(T_H)} - \overline{x(T_L)}}{x_{i,j}(T_H) - x_{i,j}(T_L)}$$

$$O(i, j) = \overline{x(T_H)} - G(i, j) \, x_{i,j}(T_H)$$

where the response outputs of the high-temperature and low-temperature blackbody images are $x(T_H)$ and $x(T_L)$, their mean output responses are $\overline{x(T_H)}$ and $\overline{x(T_L)}$, and G and O denote the gain coefficient and offset coefficient, respectively.
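The two-point calibration can be sketched as follows in Python; single blackbody frames are used for brevity, whereas in practice several frames would be averaged.

```python
import numpy as np

def two_point_coefficients(x_high: np.ndarray, x_low: np.ndarray, eps: float = 1e-12):
    """Per-pixel gain G(i,j) and offset O(i,j) from hot/cold blackbody responses."""
    gain = (x_high.mean() - x_low.mean()) / (x_high - x_low + eps)
    offset = x_high.mean() - gain * x_high
    return gain, offset

def apply_correction(frame: np.ndarray, gain: np.ndarray, offset: np.ndarray) -> np.ndarray:
    """Corrected output y(i,j) = G(i,j) * x(i,j) + O(i,j)."""
    return gain * frame + offset
```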
On the basis of the two-point method, the neural network method takes the neighborhood-weighted average of the output response as the desired output of an iterative algorithm, and iterates repeatedly until the correction gain and offset coefficients converge. The neural network method is computed as:

$$f_n(i, j) = \frac{1}{4}\left[ y_n(i-1, j) + y_n(i+1, j) + y_n(i, j-1) + y_n(i, j+1) \right]$$

$$G_{n+1}(i, j) = G_n(i, j) + 2\mu\left[ f_n(i, j) - y_n(i, j) \right] x_n(i, j)$$

$$O_{n+1}(i, j) = O_n(i, j) + 2\mu\left[ f_n(i, j) - y_n(i, j) \right]$$

where $f_n(i, j)$ is the desired (expected) output, $y_n(i, j) = G_n(i, j)\, x_n(i, j) + O_n(i, j)$ is the corrected output, $\mu$ is the iteration step size, and $G_{n+1}(i, j)$ and $O_{n+1}(i, j)$ are the iterated gain and offset coefficients, respectively.
On the basis of the neural network method, the offset coefficient is updated with a temporal high-pass filtering algorithm without changing the gain coefficient. The temporal high-pass filtering algorithm is computed as:

$$f(n) = \frac{x(n)}{M} + \left(1 - \frac{1}{M}\right) f(n-1)$$

$$y(n) = x(n) - f(n)$$

where $x(n)$ denotes the n-th frame output by the focal plane, $f(n)$ denotes the low-frequency part of the input image, $M$ is the filtering time constant, and $y(n)$ denotes the corrected output image.
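A small sketch of the temporal high-pass filtering step, with the same recursion as above; the time constant M = 64 frames is an assumed example value.

```python
import numpy as np

class TemporalHighPass:
    """Recursive temporal high-pass filter: y(n) = x(n) - f(n)."""
    def __init__(self, M: float = 64.0):
        self.M = M
        self.f = None          # running low-frequency estimate f(n)

    def update(self, frame: np.ndarray) -> np.ndarray:
        x = frame.astype(np.float64)
        if self.f is None:
            self.f = x.copy()
        self.f = x / self.M + (1.0 - 1.0 / self.M) * self.f   # f(n)
        return x - self.f                                      # y(n)
```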
in the embodiment of the invention, the two-point neural network and time domain high-pass filtering combined non-uniform correction algorithm is adopted, the output image has higher contrast, the edge of the image is clear, the time required by the convergence of the algorithm is shorter, the over-dark area can be removed, and the fixed pattern noise is inhibited.
The colorimetric temperature measurement technique determines the true temperature of the measured target by establishing a functional relation between the temperature and the ratio of the target's spectral radiance at two wavelengths. According to Planck's law of blackbody radiation, the spectral radiance of an actual object can be expressed as:

$$M(\lambda, T) = \varepsilon(\lambda, T)\, \frac{c_1}{\lambda^5}\, \frac{1}{e^{c_2 / (\lambda T)} - 1}$$

where M is the spectral radiance, T is the temperature of the object, $\varepsilon(\lambda, T)$ is the emissivity, and $c_1$ and $c_2$ are the first and second radiation constants. At temperature T, the ratio of the radiances corresponding to the central wavelengths $\lambda_1$ and $\lambda_2$ is:

$$R = \frac{M(\lambda_1, T)}{M(\lambda_2, T)} = \frac{\varepsilon(\lambda_1, T)}{\varepsilon(\lambda_2, T)} \left(\frac{\lambda_2}{\lambda_1}\right)^5 \frac{e^{c_2 / (\lambda_2 T)} - 1}{e^{c_2 / (\lambda_1 T)} - 1}$$

where R is the ratio of the radiances.
as can be seen from the above formula, for the colorimetric temperature measurement system, T and R form a single-value relationship, and T can be obtained as long as R is obtained;
To improve the colorimetric temperature measurement accuracy, a high-order fitting algorithm is used to obtain the fitting coefficients of T and R. The relation between T and R is set as:

$$T = \sum_{i=0}^{n} a_i R^i = a_0 + a_1 R + a_2 R^2 + \cdots + a_n R^n$$

Given m calibration pairs $(R_j, T_j)$, the coefficients are computed such that the fitting error

$$\delta = \sum_{j=1}^{m} \left( T_j - \sum_{i=0}^{n} a_i R_j^i \right)^2$$

is minimized. Setting the partial derivatives of $\delta$ with respect to $a_0, a_1, \ldots, a_n$ to zero yields the following system of linear equations:

$$\sum_{j=1}^{m} \left( T_j - \sum_{i=0}^{n} a_i R_j^i \right) R_j^l = 0, \qquad l = 0, 1, \ldots, n$$

For $a_0, a_1, \ldots, a_n$, the polynomial fitting coefficients of T and R are obtained by solving this system of linear equations.
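The ratio-inversion fit can be sketched with NumPy's polynomial least squares, which solves the same normal equations; the calibration numbers in the usage comment are made up for illustration.

```python
import numpy as np

def fit_ratio_to_temperature(R_cal: np.ndarray, T_cal: np.ndarray, order: int = 3):
    """Fit T as an order-n polynomial in R and return a callable T(R)."""
    coeffs = np.polyfit(R_cal, T_cal, deg=order)   # highest-order coefficient first
    return np.poly1d(coeffs)

# usage sketch with made-up calibration values
# T_of_R = fit_ratio_to_temperature(np.array([0.8, 1.0, 1.3, 1.7]),
#                                   np.array([450.0, 520.0, 610.0, 720.0]))
# print(T_of_R(1.1))
```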
In the embodiment of the invention, a two-point + neural network + time domain high-pass filtering combined non-uniform correction algorithm and a colorimetric temperature measurement and ratio inversion high-order fitting algorithm are adopted, so that the temperature measurement precision is improved, and the temperature measurement precision is ensured to reach 1%.
Referring to fig. 4, in the implementation of the present invention, identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network specifically comprises:

The generative adversarial network consists of two models, a generative model G and a discriminative model D. Random noise z is passed through G to produce a sample G(z) that follows the real data distribution Pdata as closely as possible, while the discriminative model D judges whether an input sample is real data x or generated data G(z); G and D are nonlinear mapping functions, for example multilayer perceptrons.

The specific algorithm of the generative adversarial network is as follows. First, the discriminator is optimized with the generator held fixed; the discriminator is a binary classification model, and training it is a process of minimizing a cross entropy. $E(\cdot)$ denotes the expected value, x is sampled from the real data distribution $P_{data}(x)$, and z is sampled from the prior distribution $P_z(z)$. To learn the distribution of the data x, the generator constructs a mapping $G(z; \theta_z)$ from the prior noise distribution $P_z(z)$; the corresponding discriminator mapping function is $D(x; \theta_d)$, which outputs a scalar representing the probability that x is real data. The objective is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

The generator aims to make its output, after passing through the discriminator, approach 1 (i.e. the generated samples resemble real samples); the discriminator aims to make real samples map to outputs near 1 and generated samples map to outputs near 0.

The loss function of the generator is defined as the cross entropy between the generator's output (after the discriminator) and 1. The loss function of the discriminator consists of two parts: the cross entropy between the discriminator's output on real samples and 1, and the cross entropy between its output on generator samples and 0; the discriminator loss is the sum of the two. Once the loss functions of the discriminator and the generator are obtained, an Adam optimizer is selected to optimize them. In the actual training process the discriminator easily wins the adversarial competition against the generator, causing the generator to suffer from vanishing gradients. Therefore, during training, each update of the discriminator is followed by k (k > 1) updates of the generator, so that the discriminator does not reach (or approach) its optimum too quickly and the adversarial balance between generator and discriminator is maintained. The value of k is chosen according to the scale of the dataset: if k is too small, the discriminator reaches (or approaches) its optimum, the generator suffers vanishing gradients and the loss no longer decreases; if k is too large, the generator's gradients are inaccurate and oscillate back and forth. The trained GAN is then applied to target detection and recognition: the trained discriminator (with its last layer removed) is extracted and its parameters are fine-tuned in a new structure, in which the output dimension of the final fully connected layer is n, unlike the original discriminator; the output is passed through a Softmax classifier, and the cross-entropy loss against the label y is likewise optimized with Adam. In addition, when training this structure, Dropout is set to 0.5 to prevent the fully connected layer from overfitting.
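The adversarial training loop described above can be sketched in PyTorch as follows; the generator and discriminator definitions, the data loader, the learning rate and the value of k are placeholders, and the discriminator is assumed to output a sigmoid probability of shape (batch, 1).

```python
import torch
import torch.nn as nn

def train_gan(G: nn.Module, D: nn.Module, loader, z_dim: int = 100,
              epochs: int = 10, k: int = 2, device: str = "cpu"):
    bce = nn.BCELoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
    for _ in range(epochs):
        for real, _ in loader:                       # loader yields (images, labels)
            real = real.to(device)
            b = real.size(0)
            ones = torch.ones(b, 1, device=device)
            zeros = torch.zeros(b, 1, device=device)
            # discriminator step: real samples -> 1, generated samples -> 0
            z = torch.randn(b, z_dim, device=device)
            fake = G(z).detach()
            loss_d = bce(D(real), ones) + bce(D(fake), zeros)
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # k generator steps: make D(G(z)) approach 1
            for _ in range(k):
                z = torch.randn(b, z_dim, device=device)
                loss_g = bce(D(G(z)), ones)
                opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G, D
```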
A generative adversarial network (GAN) is a type of generative model. Unlike traditional generative models, its network structure includes a discriminative network in addition to the generative network, and the two are in an adversarial relationship. The idea of adversarial training comes from game theory: in a game, each party adapts its strategy to the opponent's strategy in order to win. Extended to a generative adversarial network, the generator and the discriminator are the two players; the generator fits the data-generating process to produce model samples, and the optimization goal is a Nash equilibrium, so that the generator estimates the distribution of the data samples. GANs are now widely studied and applied in the image and vision field; because a GAN contains both a generator and a discriminator, its ability to recognize targets keeps improving during training, so the target recognition accuracy after training is higher.
Compared with the usual supervised learning approach to image classification, all the feature extraction layers of the classification model used in this method can first be pre-trained on the dataset as the GAN discriminator. The advantage is that, in addition to the real samples in the dataset, samples produced by the GAN generator can be fed to the discriminator as data, acting as data augmentation, so that through training the discriminator becomes better at recognizing ground targets.
Referring to fig. 5, it is a diagram of a visual system structure based on the fusion of infrared and visible light of a deep neural network proposed by the present invention;
as shown in fig. 5, the visual system based on the fusion of infrared and visible light of the deep neural network includes:
a visible light and infrared image acquisition unit 201, which acquires an infrared camera image and a visible light camera image through an inspection platform;
the image fusion unit 202 is used for carrying out image fusion on the infrared camera image and the visible light camera image to obtain a fusion image;
the temperature measurement unit 203 is used for measuring the temperature of the salient region of the fused image;
and the identification and early warning unit 204, which identifies targets and raises disaster alarms with a target detection technique based on a generative adversarial network.
The infrared and visible light multi-source information fusion remote vision system based on the deep neural network can be a handheld or unmanned aerial vehicle-mounted detection system.
The invention can be widely applied to the visualization of specific targets in various kinds of severe weather. The system can observe the real-time situation of a specific scene under any extreme conditions and provide uninterrupted 24-hour imagery of power facility inspections, disaster areas and other specific targets. It can solve the problems of real-time monitoring of specific targets in extreme weather, particularly at night, of promptly investigating and handling potential safety hazards, and of fully assessing a disaster at the earliest moment. It saves a large amount of manpower and material resources, brings significant social and economic benefits when applied to the power industry, and can avoid large-area power outages and the economic losses of secondary disasters caused by damage at night.
The method can be popularized and applied to various extreme conditions in the power industry, such as: the remote observation of the power equipment is realized under the weather conditions of rainstorm, heavy fog, dense smoke, haze and the like, and particularly, the remote and high-definition observation of specific targets such as disaster hidden danger points, disaster areas, power facilities and the like can be carried out under the condition of no natural light at night.
The embodiment of the invention saves manpower and material costs: specific power equipment can be monitored remotely at any time and in any weather, no manual inspection by operation and maintenance personnel is required, and once the system detects an anomaly it promptly notifies them for handling. It also supports disaster prevention: hidden danger points can be monitored in real time around the clock, effective measures can be taken when a disaster occurs, and secondary disasters can be avoided. Disaster areas can be monitored to determine the extent of the disaster, and accurate relief supplies can be prepared according to the survey.
The present invention has been described in detail, but the present invention is not limited to the above embodiments, and various changes can be made without departing from the gist of the present invention within the knowledge of those skilled in the art. Many other changes and modifications can be made without departing from the spirit and scope of the invention. It is to be understood that the invention is not to be limited to the specific embodiments, but only by the scope of the appended claims.

Claims (10)

1. The infrared and visible light fusion visual method based on the deep neural network is characterized by comprising the following steps of:
acquiring an infrared camera image and a visible light camera image through an inspection platform;
carrying out image fusion on the infrared camera image and the visible light camera image to obtain a fused image;
carrying out salient region temperature measurement on the fused image;
and identifying targets and raising disaster alarms with a target detection technique based on a generative adversarial network.
2. The method according to claim 1, wherein the image fusion of the infrared camera image and the visible camera image is performed to obtain a fused image; the method specifically comprises the following steps:
carrying out image preprocessing on the acquired visible light and infrared camera images;
and calculating an entropy value of the visible light image, taking the visible light image as a fusion image when the entropy value is higher than a threshold value, and finishing the fusion of the visible light image and the infrared image by adopting a VGG-19 deep neural network to obtain the fusion image when the entropy value is lower than the threshold value.
3. The method according to claim 2, wherein the fusion of the visible light image and the infrared image is completed with the VGG-19 deep neural network to obtain a fused image, specifically comprising: computing salient feature maps of the images in the different wave bands with the VGG-19 deep neural network, so that the final fused image retains the salient features of each source image, comprising the following steps:
image scale segmentation: carrying out scale decomposition on the visible light image to form a visible light base layer and a visible light detail layer, and carrying out scale decomposition on the infrared image to form an infrared base layer and an infrared detail layer; the method specifically comprises the following steps:
dividing the original image into two layers, wherein one layer is a base layer and the other is a detail layer; assuming the source image is $I_k$, the base layer image is $I_k^b$ and the detail layer image is $I_k^d$, the base layer image is obtained by solving the following optimization problem:

$$I_k^b = \arg\min_{I_k^b} \left\| I_k - I_k^b \right\|_F^2 + \lambda \left( \left\| g_x * I_k^b \right\|_F^2 + \left\| g_y * I_k^b \right\|_F^2 \right)$$

wherein $g_x = [-1, 1]$ and $g_y = [-1, 1]^T$ are the horizontal and vertical gradient operators and $\lambda$ is a regularization parameter; the detail layer image is equal to the source image minus the base layer image:

$$I_k^d = I_k - I_k^b$$
and (3) base layer image fusion: performing base layer fusion on the visible light detail layer and the infrared base layer; the method specifically comprises the following steps:
the base layer images contain the common features and redundant information of the source images, and base layer image fusion is performed with an average weighting strategy:

$$F_b(x, y) = \alpha_1 I_1^b(x, y) + \alpha_2 I_2^b(x, y)$$

wherein $(x, y)$ denotes the position of the corresponding pixel in the image and $\alpha_1$ and $\alpha_2$ are the fusion weights;
detail layer image fusion: carrying out detail layer fusion on the visible light detail layer and the infrared detail layer through a deep neural network; the method specifically comprises the following steps:
for the detail layer images I_1^d and I_2^d, the fusion weight values are calculated with the deep learning network VGG-19: deep features are extracted through the VGG-19 network, weight maps are obtained through a multi-layer fusion strategy, and the final detail layer fusion result is obtained from the trained weight values and the detail layer images; the multi-layer fusion strategy is as follows:
denote the detail layer images as I_k^d, k ∈ {1, 2}, and let
φ_i^{k,m} = Φ_i(I_k^d)
denote the feature map of the k-th detail layer image at layer i of the VGG network, where m is the channel number of layer i; Φ_i(·) represents one layer i of the VGG network, i ∈ {1, 2, 3, 4}, corresponding to relu_1_1, relu_2_1, relu_3_1 and relu_4_1;
φ_i^k(x, y) is an M-dimensional vector formed by the feature values at position (x, y) over the M channels of layer i; its L1 norm serves as the activity level measure of the source image detail layer, so the initial activity level map can be obtained by:
C_i^k(x, y) = || φ_i^k(x, y) ||_1;
a final activity level map Ĉ_i^k is computed with a block-based averaging operator and can be obtained from the following equation:
Ĉ_i^k(x, y) = ( Σ_{β=−γ}^{γ} Σ_{θ=−γ}^{γ} C_i^k(x + β, y + θ) ) / (2γ + 1)^2;
wherein γ represents the block size;
after the activity level maps Ĉ_i^k are obtained, the initial weight maps are computed with the soft-max function:
W_i^k(x, y) = Ĉ_i^k(x, y) / Σ_{n=1}^{K} Ĉ_i^n(x, y);
wherein K denotes the number of activity level maps and is set to 2, and W_i^k(x, y) lies in the range (0, 1); the pooling layer of the VGG network is a downsampling operator that reduces the image to 1/s of its size each time; with s = 2, the stride of the pooling layers is set to 2, so the weight map at layer i is 1/2^{i−1} times the size of the detail layer image, i ∈ {1, 2, 3, 4} corresponding to relu_1_1, relu_2_1, relu_3_1 and relu_4_1; after the initial weight maps are obtained, they are upsampled to the same size as the detail layer images, giving four pairs of weight maps W_i^k, i ∈ {1, 2, 3, 4}, k ∈ {1, 2};
for each pair of weight maps, the detail layer fusion result at layer i is:
F_d^i(x, y) = Σ_{k=1}^{K} W_i^k(x, y) × I_k^d(x, y);
finally, the fused detail layer image is obtained by the following formula:
F_d(x, y) = max[ F_d^i(x, y) | i ∈ {1, 2, 3, 4} ];
that is, the maximum value over the four layers is taken as the final fused detail layer image;
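A condensed Python sketch of this multi-layer fusion strategy, assuming a pretrained torchvision VGG-19 (relu_1_1/2_1/3_1/4_1 taken at feature indices 1, 6, 11 and 20), single-channel detail layers replicated to three channels, and γ = 1 as in claim 4; ImageNet normalization is omitted for brevity:

```python
# VGG-19 multi-layer detail fusion: L1-norm activity maps, block averaging,
# soft-max weights, upsampling, and element-wise maximum over the four layers.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

RELU_IDX = [1, 6, 11, 20]  # relu_1_1, relu_2_1, relu_3_1, relu_4_1
vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()

def vgg_features(detail, idx):
    """Deep features of a single-channel detail layer at one VGG layer."""
    x = detail.repeat(1, 3, 1, 1)          # 1x1xHxW -> 1x3xHxW
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i == idx:
                return x                   # 1 x M x Hi x Wi feature map

def fuse_details(d1, d2, gamma=1):
    """d1, d2: 1x1xHxW tensors (visible / infrared detail layers)."""
    fused_per_layer = []
    for idx in RELU_IDX:
        # activity level map: L1 norm over channels
        c1 = vgg_features(d1, idx).abs().sum(dim=1, keepdim=True)
        c2 = vgg_features(d2, idx).abs().sum(dim=1, keepdim=True)
        # block-based averaging with window (2*gamma + 1)
        k = 2 * gamma + 1
        c1 = F.avg_pool2d(c1, k, stride=1, padding=gamma)
        c2 = F.avg_pool2d(c2, k, stride=1, padding=gamma)
        # soft-max weights, then upsample to the detail-layer size
        w1 = c1 / (c1 + c2 + 1e-12)
        w2 = 1.0 - w1
        w1 = F.interpolate(w1, size=d1.shape[-2:], mode="nearest")
        w2 = F.interpolate(w2, size=d1.shape[-2:], mode="nearest")
        fused_per_layer.append(w1 * d1 + w2 * d2)
    # maximum over the four layers gives the fused detail layer
    return torch.stack(fused_per_layer).max(dim=0).values
```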
final image fusion: combining the fused base layer and the fused detail layer to form the fused image; the method specifically comprises the following steps:
when the base layer image and the detail layer image are fused, they are added by the following formula to obtain the final fused image:
F(x, y) = F_b(x, y) + F_d(x, y);
wherein F_b(x, y) denotes the fused base layer image and F_d(x, y) denotes the fused detail layer image.
4. The method of claim 3, wherein γ is 1.
5. The method of claim 3, wherein α_1 and α_2 are each equal to 0.5.
6. The method of claim 1, wherein performing salient region temperature measurement on the fused image specifically comprises: extracting salient regions from the fused image through superpixel segmentation, measuring and screening the temperature of the salient regions, issuing an early warning for any salient region whose temperature is greater than a set value and sending early warning information; meanwhile, performing smoke detection on the salient regions exceeding the set value on the basis of a deep learning model, and judging whether a fire has occurred in those regions.
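A minimal sketch of the salient-region screening step, assuming SLIC superpixels over the fused image and a co-registered per-pixel temperature map; the temperature map, segment count and the alarm threshold `TEMP_ALARM_C` are assumptions for illustration:

```python
# Superpixel segmentation of the fused image and per-region temperature
# screening; regions whose mean temperature exceeds the set value are flagged.
import numpy as np
from skimage.segmentation import slic

TEMP_ALARM_C = 80.0  # hypothetical alarm threshold in degrees Celsius

def hot_regions(fused_rgb, temperature_map, n_segments=200):
    """Return (label, mean_temperature) for superpixels above the threshold."""
    labels = slic(fused_rgb, n_segments=n_segments, compactness=10, start_label=1)
    alarms = []
    for lab in np.unique(labels):
        mean_t = float(temperature_map[labels == lab].mean())
        if mean_t > TEMP_ALARM_C:
            alarms.append((int(lab), mean_t))
    return alarms
```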
7. The method according to claim 6, wherein the step of performing smoke detection on a region in the image range by using a deep learning-based model to determine whether the region is in fire includes:
performing feature training on smoke generated in a fire by adopting a deep-learning-based model;
adopting a generative adversarial network to carry out the early-stage training for smoke detection, and feeding the trained weight values as judgment parameters into the early warning system;
when smoke generated by a fire is detected, an alarm is uploaded to the early warning terminal through the wireless network, indicating that a fire has occurred in the area.
8. The method according to claim 6, wherein the fused image is subjected to superpixel segmentation to extract salient regions, the salient regions are subjected to temperature measurement and screening, and an early warning is issued and early warning information is sent for salient regions whose temperature is higher than a set value; the method specifically comprises: measuring the temperature of the salient regions by adopting a non-uniformity correction algorithm that combines the two-point method, a neural network and temporal high-pass filtering, together with a colorimetric temperature measurement and ratio-inversion high-order fitting algorithm;
firstly, acquiring high and low temperature black body images, solving correction gain and offset coefficients according to a two-point method, and taking the correction gain and offset coefficients as initial values of correction coefficients of the improved neural network method, wherein the calculation formula of the two-point method is as follows:
G = ( x̄(T_H) − x̄(T_L) ) / ( x(T_H) − x(T_L) );
O = x̄(T_H) − G · x(T_H);
wherein the response outputs of the high-temperature and low-temperature black body images are x(T_H) and x(T_L) respectively, the mean values of their output responses are x̄(T_H) and x̄(T_L) respectively, and G and O represent the gain coefficient and the offset coefficient, respectively;
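A short sketch of the two-point correction as reconstructed above, computed per pixel from the high- and low-temperature blackbody frames (dead pixels with identical responses would need separate handling):

```python
# Two-point non-uniformity correction: per-pixel gain and offset from two
# blackbody reference frames, then y = G * x + O for each incoming frame.
import numpy as np

def two_point_coefficients(x_high, x_low):
    """x_high, x_low: raw blackbody response images (2-D arrays)."""
    mean_h, mean_l = x_high.mean(), x_low.mean()
    gain = (mean_h - mean_l) / (x_high - x_low)   # G(i, j)
    offset = mean_h - gain * x_high               # O(i, j)
    return gain, offset

def correct(frame, gain, offset):
    """Non-uniformity-corrected output y = G * x + O."""
    return gain * frame + offset
```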
on the basis of a two-point method, a neural network method is utilized to take a neighborhood weighted average value of an output response as an expected output value of an iterative algorithm, and repeated iteration is carried out until correction gain and an offset coefficient are converged, wherein the calculation formula of the neural network method is as follows:
f_n(i, j) = [ y_n(i−1, j) + y_n(i+1, j) + y_n(i, j−1) + y_n(i, j+1) ] / 4;
G_{n+1}(i, j) = G_n(i, j) − η · [ y_n(i, j) − f_n(i, j) ] · x_n(i, j);
O_{n+1}(i, j) = O_n(i, j) − η · [ y_n(i, j) − f_n(i, j) ];
wherein y_n(i, j) = G_n(i, j) · x_n(i, j) + O_n(i, j) is the corrected output at iteration n, f_n(i, j) is the expected output expression, η is the iteration step size, and G_{n+1}(i, j) and O_{n+1}(i, j) are the iterated gain coefficient and offset coefficient, respectively;
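A sketch of one refinement iteration, assuming the standard LMS-style neural-network update with the 4-neighbour average of the corrected output as the expected value; the learning rate `eta` and the neighbourhood kernel are assumptions:

```python
# Neural-network non-uniformity refinement: iterate until gain/offset converge.
import numpy as np
from scipy.ndimage import convolve

NEIGH = np.array([[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]])

def nn_nuc_step(frame, gain, offset, eta=1e-6):
    y = gain * frame + offset                     # corrected output y_n
    desired = convolve(y, NEIGH, mode="nearest")  # expected output f_n
    err = y - desired
    gain_next = gain - eta * err * frame          # G_{n+1}
    offset_next = offset - eta * err              # O_{n+1}
    return gain_next, offset_next
```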
on the basis of a neural network method, updating of an offset coefficient is completed by adopting a time domain high-pass filtering algorithm under the condition of not changing a gain coefficient, and the time domain high-pass filtering algorithm has the following calculation formula:
f(n) = x(n) / M + (1 − 1/M) · f(n−1);
y(n) = x(n) − f(n);
wherein x(n) represents the n-th frame image output by the focal plane, f(n) represents the low-frequency portion of the input image obtained by recursive filtering with time constant M, and y(n) represents the corrected output image;
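A minimal sketch of this temporal high-pass step, assuming the recursive low-pass form reconstructed above; the time constant `M` is an assumption:

```python
# Temporal high-pass filtering: subtract a running low-pass estimate f(n)
# from each incoming focal-plane frame x(n).
def temporal_highpass(frames, M=32.0):
    f = None
    for x in frames:                 # x(n): raw focal-plane frame
        f = x / M if f is None else x / M + (1.0 - 1.0 / M) * f
        yield x - f                  # y(n) = x(n) - f(n)
```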
the colorimetric temperature measurement technology determines the real temperature of a measured target by establishing a functional relation between the temperature and the ratio of the spectral radiances of the measured target at two wavelengths; according to Planck's law of black body radiation, the spectral radiance of an actual object can be expressed as:
M(λ, T) = ε(λ, T) · c_1 · λ^{−5} / ( e^{c_2 / (λT)} − 1 );
wherein M is the spectral radiance, T is the temperature of the object, λ is the wavelength, ε(λ, T) is the emissivity, and c_1 and c_2 are the first and second radiation constants;
when the temperature is T, the ratio of the radiances corresponding to the central wavelengths λ_1 and λ_2 is:
R = M(λ_1, T) / M(λ_2, T);
wherein R is the ratio of the radiances;
as can be seen from the above formula, for the colorimetric temperature measurement system, T and R form a single-value relationship, and T can be obtained as long as R is obtained;
in order to improve the colorimetric temperature measurement accuracy, a high-order fitting algorithm is adopted to obtain the fitting coefficients between T and R, and the calculation relation between T and R is set as:
T = a_0 + a_1 R + a_2 R^2 + … + a_n R^n;
for N calibration pairs (R_j, T_j), the coefficients are determined by least squares, i.e. by minimizing:
S(a_0, a_1, …, a_n) = Σ_{j=1}^{N} [ T_j − Σ_{i=0}^{n} a_i R_j^i ]^2;
setting the partial derivatives ∂S/∂a_i to zero yields the following system of linear equations:
Σ_{k=0}^{n} ( Σ_{j=1}^{N} R_j^{i+k} ) a_k = Σ_{j=1}^{N} R_j^i T_j,  i = 0, 1, …, n;
solving this linear system for a_0, a_1, …, a_n gives the polynomial fitting coefficients of T with respect to R.
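A short sketch of the high-order fit, solving the least-squares problem above directly over assumed calibration pairs (R_j, T_j); the fit order is an assumption:

```python
# Ratio-to-temperature polynomial fit: T ~ a_0 + a_1*R + ... + a_n*R^n,
# coefficients obtained by least squares over calibration data.
import numpy as np

def fit_ratio_to_temperature(R_cal, T_cal, order=3):
    """Return a_0..a_n from calibration ratios R_cal and temperatures T_cal."""
    A = np.vander(R_cal, N=order + 1, increasing=True)   # columns [1, R, R^2, ...]
    coeffs, *_ = np.linalg.lstsq(A, T_cal, rcond=None)
    return coeffs

def ratio_to_temperature(R, coeffs):
    return sum(a * R ** i for i, a in enumerate(coeffs))
```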
9. The method according to claim 1, wherein identifying targets and giving disaster alarms through a target detection technique based on a generative adversarial network specifically comprises:
the generative adversarial network is composed of two models, a generative model G and a discriminative model D; random noise z is mapped by G to a sample G(z) that conforms to the real data distribution Pdata as closely as possible, and the discriminative model D judges whether an input sample is real data x or generated data G(z); G and D are nonlinear mapping functions;
the specific algorithm of the generative adversarial network is as follows: firstly, the discriminator is optimized with the generator fixed; the discriminator is a binary classification model, and training the discriminator is a process of minimizing the cross entropy; E(·) denotes the expected value, x is sampled from the real data distribution Pdata(x), and z is sampled from the prior distribution Pz(z); to learn the distribution of the data x, the generator constructs a mapping space G(z; θ_g) from the prior noise distribution Pz(z), and the corresponding discriminator mapping function is D(x; θ_d), which outputs a scalar representing the probability that x is real data;
min_G max_D V(D, G) = E_{x∼Pdata(x)}[ log D(x) ] + E_{z∼Pz(z)}[ log( 1 − D(G(z)) ) ];
the loss function of the generator is defined as the cross entropy between the discriminator output on generated samples and '1'; the loss function of the discriminator consists of two parts: the cross entropy between the discriminator output on real samples and '1', and the cross entropy between the discriminator output on generated samples and '0'; the loss function of the discriminator is the sum of these two parts; after the loss functions of the discriminator and the generator are obtained, an Adam optimizer is selected to optimize them.
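A minimal PyTorch sketch of this adversarial training step, assuming user-supplied networks G and D where D ends with a sigmoid so its output is a probability; hyperparameters are assumptions:

```python
# One GAN training step: discriminator trained with cross-entropy against
# labels 1 (real) and 0 (generated); generator trained against label 1;
# both optimized with Adam.
import torch
import torch.nn as nn

def train_step(G, D, real, z, opt_g, opt_d):
    bce = nn.BCELoss()
    ones = torch.ones(real.size(0), 1)
    zeros = torch.zeros(real.size(0), 1)

    # discriminator loss: D(real) vs 1 plus D(G(z)) vs 0
    opt_d.zero_grad()
    d_loss = bce(D(real), ones) + bce(D(G(z).detach()), zeros)
    d_loss.backward()
    opt_d.step()

    # generator loss: D(G(z)) vs 1
    opt_g.zero_grad()
    g_loss = bce(D(G(z)), ones)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# Example optimizer setup (learning rate and betas are assumptions):
# opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
# opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
```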
10. An infrared and visible light fused vision system based on a deep neural network, which utilizes the infrared and visible light fused vision method based on a deep neural network according to any one of claims 1-9, characterized by comprising:
the visible light and infrared image acquisition unit acquires an infrared camera image and a visible light camera image through the inspection platform;
the image fusion unit is used for carrying out image fusion on the infrared camera image and the visible light camera image to obtain a fusion image;
the temperature measuring unit is used for measuring the temperature of the salient region of the fused image;
and the identification and early warning unit is used for identifying targets and giving disaster alarms through a target detection technique based on a generative adversarial network.
CN202111126389.0A 2021-09-26 2021-09-26 Infrared and visible light fused visual system and method based on deep neural network Pending CN113947555A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111126389.0A CN113947555A (en) 2021-09-26 2021-09-26 Infrared and visible light fused visual system and method based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111126389.0A CN113947555A (en) 2021-09-26 2021-09-26 Infrared and visible light fused visual system and method based on deep neural network

Publications (1)

Publication Number Publication Date
CN113947555A true CN113947555A (en) 2022-01-18

Family

ID=79328618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111126389.0A Pending CN113947555A (en) 2021-09-26 2021-09-26 Infrared and visible light fused visual system and method based on deep neural network

Country Status (1)

Country Link
CN (1) CN113947555A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115457255A (en) * 2022-09-06 2022-12-09 中国科学院长春光学精密机械与物理研究所 Foundation telescope image non-uniform correction method based on deep learning
CN115457255B (en) * 2022-09-06 2024-04-02 中国科学院长春光学精密机械与物理研究所 Foundation telescope image non-uniform correction method based on deep learning
CN116087221A (en) * 2023-03-02 2023-05-09 东北电力大学 Air cooler dust accumulation state detection device and method based on computer vision
CN116087221B (en) * 2023-03-02 2023-09-08 东北电力大学 Air cooler dust accumulation state detection device and method based on computer vision
CN116092018A (en) * 2023-04-10 2023-05-09 同方德诚(山东)科技股份公司 Fire-fighting hidden danger monitoring method and system based on intelligent building
CN116092018B (en) * 2023-04-10 2023-08-25 同方德诚(山东)科技股份公司 Fire-fighting hidden danger monitoring method and system based on intelligent building
CN117612093A (en) * 2023-11-27 2024-02-27 北京东青互联科技有限公司 Dynamic environment monitoring method, system, equipment and medium for data center
CN117726979A (en) * 2024-02-18 2024-03-19 合肥中盛水务发展有限公司 Piping lane pipeline management method based on neural network

Similar Documents

Publication Publication Date Title
CN113947555A (en) Infrared and visible light fused visual system and method based on deep neural network
CN106356757B (en) A kind of power circuit unmanned plane method for inspecting based on human-eye visual characteristic
CN112379231B (en) Equipment detection method and device based on multispectral image
WO2020199538A1 (en) Bridge key component disease early-warning system and method based on image monitoring data
Chu et al. Sun-tracking imaging system for intra-hour DNI forecasts
CN112381784A (en) Equipment detecting system based on multispectral image
Zormpas et al. Power transmission lines inspection using properly equipped unmanned aerial vehicle (UAV)
CN108109385A (en) A kind of vehicle identification of power transmission line external force damage prevention and hazardous act judgement system and method
CN115909093A (en) Power equipment fault detection method based on unmanned aerial vehicle inspection and infrared image semantic segmentation
Rong et al. Intelligent detection of vegetation encroachment of power lines with advanced stereovision
CN116824517B (en) Substation operation and maintenance safety control system based on visualization
CN112862150A (en) Forest fire early warning method based on image and video multi-model
Zhang et al. Aerial image analysis based on improved adaptive clustering for photovoltaic module inspection
CN106920231A (en) A kind of remote sensing image clouds appraisal procedure based on full-colour image statistical nature
CN111667655A (en) Infrared image-based high-speed railway safety area intrusion alarm device and method
CN113762161A (en) Intelligent obstacle monitoring method and system
CN109684914A (en) Based on unmanned plane image intelligent identification Method
CN114219763A (en) Infrared picture detection method for abnormal heating point of power distribution equipment based on fast RCNN algorithm
CN104615987B (en) A kind of the wreckage of an plane intelligent identification Method and system based on error-duration model neutral net
CN116188975A (en) Power equipment fault identification method and system based on air-ground visual angle fusion
CN112288019B (en) Cook cap detection method based on key point positioning
CN114612855A (en) Power line hazard source detection system and method fusing residual error and multi-scale network
CN111354028B (en) Binocular vision-based power transmission channel hidden danger identification and tracking method
CN113989209A (en) Power line foreign matter detection method based on fast R-CNN
CN113034598A (en) Unmanned aerial vehicle power line patrol method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination