CN108399612B - Three-light image intelligent fusion method based on bilateral filtering pyramid - Google Patents

Three-light image intelligent fusion method based on bilateral filtering pyramid

Info

Publication number
CN108399612B
CN108399612B CN201810118085.1A CN201810118085A
Authority
CN
China
Prior art keywords
pyramid
image
layer
bilateral filtering
bilateral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810118085.1A
Other languages
Chinese (zh)
Other versions
CN108399612A (en
Inventor
赵毅
张登平
钱晨
刘宁
杨超
马新华
谢小波
宋莽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Unikit Optical Technology Co Ltd
Original Assignee
Jiangsu Unikit Optical Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Unikit Optical Technology Co Ltd filed Critical Jiangsu Unikit Optical Technology Co Ltd
Priority to CN201810118085.1A priority Critical patent/CN108399612B/en
Priority to PCT/CN2018/096023 priority patent/WO2019153651A1/en
Publication of CN108399612A publication Critical patent/CN108399612A/en
Application granted granted Critical
Publication of CN108399612B publication Critical patent/CN108399612B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T3/147
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/60 Rotation of a whole image or part thereof
    • G06T3/608 Skewing or deskewing, e.g. by two-pass or three-pass rotation
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T5/70
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a bilateral-filtering-pyramid-based three-light image intelligent fusion method, which comprises the following steps: first, a three-light camera is erected using a parallel-optical-axis design and made to face the same scene directly for video signal acquisition; displacement deviation and rotational distortion of the three collected video signals are corrected; bilateral-filter pyramid chromatography processing is then carried out, and after the pyramid chromatography is finished, the three sets of pyramid information obtained from the three spectral images are fused. The invention effectively combines the bilateral filter and the pyramid layering algorithm into a new, effective overall algorithm and transforms the original three-spectrum images into one fused image containing all the characteristics of the three spectra, which is output externally, so that the user can obtain all the target characteristics in the scene by observing only this image, greatly improving observation efficiency and convenience of use.

Description

Three-light image intelligent fusion method based on bilateral filtering pyramid
Technical Field
The invention relates to the technical field of intelligent detection imaging and image processing, in particular to a bilateral filtering pyramid-based three-light image intelligent fusion method.
Background
At present, cameras of different wave bands are widely used, both internationally and domestically, to monitor scenes and the targets in them in industrial machine-vision applications such as community security, industrial production, fire safety, forest fire prevention, and security inspection for explosives. Because targets in nature exhibit different spectral characteristics, monitoring a scene and its targets with a single camera cannot meet modern military and civil technical requirements. The existing solution is to observe the scene simultaneously with several cameras of different wave bands: for example, a visible-light camera is used to locate a target in the scene, and a thermal infrared imager is then used to perform thermal-image analysis on that target; or, during operation and maintenance inspection of a power grid, a thermal infrared imager detects the heat distribution at the electrical connection joints of cabinets, switch boxes and the like to judge whether abnormally high temperatures occur, and an ultraviolet camera then judges whether phenomena such as high-voltage arc discharge or electric leakage exist at those abnormally hot points. When tracking a missile target, detecting a missile in long-distance flight with a thermal infrared imager while analyzing its tail-flame characteristics in flight with an ultraviolet camera effectively avoids the problem that decoys released by the missile interfere with infrared thermal-imaging target identification, and at the same time the type of the flying target can be distinguished from the thermal image and the tail-flame characteristics. Therefore, in the field of industrial machine vision, the design concept of multi-camera cooperative work has very wide application scenarios, has begun to be put into use, and is an important direction for the development of future visual imaging.
At present, the main product solution in the industry is to have several cameras installed in the same scene monitor it separately: for example, a visible-light camera and an infrared thermal imager are installed together, their video signals are output to a central control host, the two videos are displayed simultaneously in the host's supporting software, and the user watches both at once. The biggest drawbacks of such systems are as follows: 1. the overall system is dispersed, the cameras and the central control host are separate devices, and the volume and cost of erecting the system on site are too large; 2. when the two video images are displayed on one display at the same time, the focal-plane pixel sizes of imaging devices of different wave bands differ, so even when the same scene is observed there is considerable parallax, i.e. the scenes in the two images coincide only locally, which makes finding a target difficult; 3. because the user must watch two videos at the same time, the eyes have to switch constantly between them, which is visually inconvenient; 4. the system is not easy to carry, can only be installed at a fixed position for observation and cannot be carried around, which greatly limits the expansion of application scenarios. In summary, there is a need for a multispectral fusion product that is small, lightweight, portable and does not require fixed installation, to fill this market gap.
The optimal solution is to perform image fusion, and corresponding image fusion technologies have already been proposed in the industry. Among them, the most widespread in military and civil video solutions is the infrared/visible-light dual-spectrum fusion technology. With technological progress and ever-higher user demands, conventional dual-spectrum fusion can no longer fully cope with the complex target characteristic information appearing in a scene, so the emergence of a high-precision intelligent fusion technology covering the three imaging spectra of visible light, infrared light and ultraviolet light is particularly important.
Disclosure of Invention
The object of the present invention is to solve at least one of the technical drawbacks mentioned.
Therefore, the invention aims to provide a bilateral-filtering-pyramid-based three-light image intelligent fusion method, which can fuse images acquired under visible light, infrared light and ultraviolet light so that a user can obtain all target characteristics in a scene by observing a single image, greatly improving observation efficiency and convenience of use.
In order to achieve the above object, the present invention provides a bilateral filtering pyramid-based three-light image intelligent fusion method, which comprises the following steps:
step S1, firstly, erecting a three-light camera by using a parallel optical axis design, and making the three-light camera directly face the same scene for video signal acquisition;
step S2, displacement deviation and rotation distortion correction are carried out on the collected three paths of video signals;
the correction method of displacement deviation and rotational distortion adopts an image registration correction algorithm under a Cartesian coordinate system and a polar coordinate system to calculate target characteristic points in the images, simultaneously calculates pixel positions where the same characteristic points appear in the three spectral images, and adjusts the angular positions and the sizes of the three spectral images to the same dimensionality by utilizing translation scaling calculation of the Cartesian coordinate system and rotation factor calculation under the polar coordinate system so as to perform subsequent fusion calculation;
step S3, carrying out bilateral-filter pyramid chromatography processing, namely, respectively carrying out pyramid downsampling layering on the image results obtained by the bilateral filter and applying a bilateral filtering operation to each layered result, so that, through repeated layer-by-layer chromatography operations, all the characteristic information in the image can be completely extracted for the fusion calculation;
step S4, when the pyramid chromatography is finished, the three sets of pyramid information obtained from the three spectral images need to be fused; the fusion method is that the corresponding base layers of the pyramids of the three spectral images are weighted-fused with one another, and the corresponding feature (detail) layers are weighted-fused with one another; starting from the topmost pyramid layer, one inverse pyramid operation is performed after each weighted fusion, expanding the dimension of the higher layer to that of the next lower pyramid layer, with which the weighted fusion calculation is then performed, until all pyramid layers have been calculated, whereupon the original three-spectrum images are fused into a fused image containing all the characteristics of the three spectra and output externally.
Further, in step S2, the three spectral images are scaled to the same size and then corrected when the displacement deviation and the rotational distortion are corrected.
Further, in step S3, the main steps of the bilateral filter performing the pyramid chromatography include:
starting from the three original spectral images, the following processing is performed for each spectral image: the bottom-layer (original) image is bilaterally filtered once to obtain a fundamental-frequency-layer image and a detail-layer image; the fundamental-frequency-layer image thus obtained is then pyramid-downsampled to obtain the second pyramid layer; the second-layer base image is in turn bilaterally filtered to obtain the fundamental-frequency-layer image and detail-layer image of the second layer; the second-layer fundamental-frequency image is then processed in the same way, and so on, until every layer of the pyramid has been bilaterally filtered.
Further, in step S4, when the pyramid information of the three spectrum images is fused, the base image is selected as a visible light image, and the characteristics of the infrared light image and the ultraviolet light image need to be superimposed.
Further, the bilateral filter is an effective nonlinear filter for distinguishing features from noise in an image.
Further, the calculation formula of the bilateral filter is as follows:

J(i,j) = (1 / k(i,j)) · Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)) · I(i′,j′),    (1)

where I denotes the input image, J the filtered output, and k(i,j) represents the normalization coefficient:

k(i,j) = Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)),    (2)

Here, the notation (i′,j′) ∈ S_{i,j} indicates that (i′,j′) and (i,j) are adjacent elements in the image; typically g_s is chosen to be a normalized Gaussian kernel, i.e. the sum of all spatial coefficients of the bilateral filter is 1; furthermore, a Gaussian kernel function g_r is also adopted in the intensity domain; the overall weight of the template is obtained by multiplying the results of the two Gaussian templates of the spatial domain and the intensity domain, and k(i,j) should range between 0 and 1.
Further, in steps S3-S4, assuming that the l-th layer image of the pyramid is G_l, the construction formula of the bilateral filtering pyramid is:

G_l(i,j) = w(2i, 2j, σ) * G_{l−1}(2i, 2j),  (1 ≤ l ≤ N),    (3)

wherein N represents the number of pyramid layers, * denotes convolution, (i, j) are the image coordinates, and w(2i, 2j, σ) represents a bilateral filtering kernel with variance σ;

thus, G_0, G_1, …, G_N constitute the base-frequency-layer pyramid, with G_0 identical to the original image; each layer is in turn 1/4 the size of the layer below it, and the total number of pyramid layers is N + 1.
Further, the image G_l is interpolated to obtain an image enlarged four times, whose size is the same as that of G_{l−1}; from equation (3), the interpolated and enlarged image G_l* is:

G_l*(i,j) = 4 Σ_{m=−2}^{2} Σ_{n=−2}^{2} w(m,n) · G_l′((i+m)/2, (j+n)/2),  (0 ≤ l ≤ N),    (4)

where

G_l′((i+m)/2, (j+n)/2) = G_l((i+m)/2, (j+n)/2) when (i+m)/2 and (j+n)/2 are both integers, and 0 otherwise.

Let

LP_l = G_l − G_{l+1}*  (0 ≤ l ≤ N−1),  LP_N = G_N,    (5)

The pyramid formed by LP_0, LP_1, …, LP_N is the bilateral filtering pyramid; each layer of image is the difference between the corresponding layer of the base-frequency-layer pyramid and the interpolated, enlarged image of the layer above it, a process equivalent to band-pass filtering, so the bilateral filtering pyramid is also called a band-pass pyramid decomposition; and finally, the obtained base-frequency layers and detail components of each layer are restored by the pyramid reconstruction method to obtain the fused image.

Further, the formula of the pyramid reconstruction is as follows:

G_N = LP_N  (l = N);  G_l = LP_l + G_{l+1}*  (0 ≤ l ≤ N−1).    (6)
the bilateral filtering pyramid-based three-light image intelligent fusion method has the following beneficial effects:
1. the invention adopts the parallel optical axis design when the three-light camera is installed, so as to ensure that the scene acquired by the camera has no geometric distortion.
2. Because the three-light camera cannot observe the same scene due to the mechanical error of the structural installation, the invention firstly corrects the displacement deviation and the rotational distortion before the fusion process, eliminates the displacement deviation and the rotational distortion and ensures the high precision of the subsequent fusion process.
3. The bilateral filter is a nonlinear filter that can effectively distinguish features from noise in an image; owing to its nonlinear filtering characteristics, it can weigh the image pixel values in the selected filtering window by the two indexes of the spatial domain and the intensity domain so as to separate feature information from noise information, which prevents extra noise from entering the fusion process and greatly improves the final fusion effect.
4. The invention adopts a pyramid analysis algorithm based on bilateral filtering as the fusion strategy for the three-light images, i.e. the bilateral filter and the pyramid layering algorithm are effectively combined into a new, effective overall algorithm; the original three-spectrum images are fused into a single image containing all the characteristics of the three spectra and output externally, so that the user can obtain all the target characteristics in a scene only by observing this image, greatly improving observation efficiency and convenience of use.
5. The invention can well perform the processes of image correction and image fusion on the images from different spectral bands, eliminate scene parallax and image size mismatch of the images and videos from different cameras at the same time, and perform real-time fusion on any two or all information in the three spectra through the bilateral filtering pyramid algorithm designed for the invention.
6. According to the invention, because the bilateral filter can well separate the fundamental frequency information (energy information) and the characteristic information (detail information) in the image, high-precision image superposition can be realized during fusion; the method is combined with a pyramid layering technology, and has good screening characteristics and multiple filtering operation characteristics; the traditional bilateral filter only carries out one-time filtering operation on a common single image.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of the main body of the present invention;
FIG. 2 is a schematic diagram of the present invention for offset and rotational distortion correction;
FIG. 3 is a schematic diagram of the calculation process of the bilateral filter of the present invention;
FIG. 4 is a schematic view of the complete pyramid chromatography process of the present invention;
FIG. 5 is a schematic diagram of detail extraction of pyramid components in the pyramid chromatography process of the present invention;
FIG. 6 is a schematic diagram of energy extraction of pyramid components in the pyramid chromatography process of the present invention;
FIG. 7 is a schematic diagram of the pyramid reconstruction algorithm of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The invention provides a bilateral filtering pyramid-based three-light image intelligent fusion method, which is shown by referring to the attached drawings 1-7 and comprises the following steps:
step S1, first, a three-optical camera is erected by using a parallel optical axis design, so that the three-optical camera is directly facing the same scene for video signal acquisition.
Because of mechanical errors in the structural installation of the three-light camera, the scenes observed by the three cameras cannot be exactly the same. The invention therefore adopts a parallel-optical-axis design when installing the three-light camera, to ensure that the scene acquired by each camera is free of geometric distortion. In addition, because of the industrial mounting positions of the cameras, the three cameras are physically displaced from one another in space, so displacement and rotational deviations of the scene occur when the same scene is observed, and the scene information obtained by the three cameras carries a certain positional offset; before the fusion process, the parts of the acquired scene information that show the same scene must therefore be prepared for the subsequent fusion processing. Moreover, since perfect alignment cannot be achieved when the three cameras are structurally installed, a certain degree of rotational distortion inevitably exists between them, and this rotational distortion must be eliminated first so that the high precision of the subsequent fusion process can be guaranteed.
Step S2, displacement deviation and rotation distortion correction are carried out on the collected three paths of video signals;
due to the fact that the sizes of the camera pixels and the pixel intervals of the cameras with different spectrums are different, the number of corresponding pixels of the same target focused on a focal plane through a lens is different, and the images obtained by the three corrected spectrums need to be scaled to the same size when displacement and rotation deviation correction is carried out.
Therefore, when performing the subsequent fusion process, the displacement and the rotational deviation need to be corrected first.
The correction method of displacement deviation and rotational distortion adopts an image registration correction algorithm under a Cartesian coordinate system and a polar coordinate system to calculate target characteristic points in the images, simultaneously calculates pixel positions where the same characteristic points appear in the three spectral images, and adjusts the angular positions and the sizes of the three spectral images to the same dimension by utilizing translation scaling calculation of the Cartesian coordinate system and rotation factor calculation under the polar coordinate system so as to perform subsequent fusion calculation.
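As an illustration of this correction step, the Python sketch below estimates a scale and translation in Cartesian coordinates and a rotation factor from the points' polar angles, given matched feature points from a reference spectrum and another spectrum. The function names and the simple similarity model are illustrative assumptions and are not the exact registration algorithm of the invention.

```python
import numpy as np

def estimate_similarity(ref_pts: np.ndarray, src_pts: np.ndarray):
    """Estimate scale, rotation and translation mapping src_pts onto ref_pts.

    ref_pts, src_pts: (N, 2) arrays of matched feature points (x, y).
    Scale and translation come from centroid-based Cartesian calculation;
    the rotation factor comes from the polar angles of the matched points.
    """
    ref_c, src_c = ref_pts.mean(axis=0), src_pts.mean(axis=0)
    ref_d, src_d = ref_pts - ref_c, src_pts - src_c

    # Scale: ratio of mean radial distances from the centroids.
    scale = np.linalg.norm(ref_d, axis=1).mean() / np.linalg.norm(src_d, axis=1).mean()

    # Rotation: mean difference of polar angles, wrapped to [-pi, pi].
    ang = np.arctan2(ref_d[:, 1], ref_d[:, 0]) - np.arctan2(src_d[:, 1], src_d[:, 0])
    theta = np.arctan2(np.sin(ang), np.cos(ang)).mean()

    # Translation maps the scaled, rotated source centroid onto the reference centroid.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    t = ref_c - scale * (R @ src_c)
    return scale, theta, t

def apply_similarity(pts: np.ndarray, scale: float, theta: float, t: np.ndarray):
    """Apply the estimated transform to a set of (x, y) points."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return scale * (pts @ R.T) + t
```

The same scale, angle and translation would then be used to warp the corresponding spectral image so that all three spectra share the same angular position and size.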
And step S3, carrying out bilateral-filter pyramid chromatography processing, namely, respectively carrying out pyramid downsampling layering on the image results obtained by the bilateral filter and applying a bilateral filtering operation to each layered result, ensuring that the features and the noise of the three-light images can be separated on every pyramid layer, and that, through repeated layer-by-layer chromatography operations, all the characteristic information in the image can be completely extracted for the fusion calculation. As shown in fig. 3-6.
The invention relates to an image fusion technology based on a bilateral filter pyramid algorithm, wherein a bilateral filter and a pyramid layering algorithm are effectively fused into a new effective overall algorithm. Wherein the bilateral filter is an effective nonlinear filter for distinguishing features from noise in the image.
As shown in fig. 3, the calculation formula of the bilateral filter is:

J(i,j) = (1 / k(i,j)) · Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)) · I(i′,j′),    (1)

where I denotes the input image, J the filtered output, and k(i,j) represents the normalization coefficient:

k(i,j) = Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)),    (2)

Here, the notation (i′,j′) ∈ S_{i,j} indicates that (i′,j′) and (i,j) are adjacent elements in the image; typically g_s is chosen to be a normalized Gaussian kernel, i.e. the sum of all spatial coefficients of the bilateral filter is 1; furthermore, a Gaussian kernel function g_r is also adopted in the intensity domain; the overall weight of the template is obtained by multiplying the results of the two Gaussian templates of the spatial domain and the intensity domain, and k(i,j) should range between 0 and 1.
σ_s and σ_r represent the standard deviation parameters of the two Gaussian kernels and control how far each kernel extends. σ_s determines the size of the neighborhood and must therefore be proportional to the image size; here the invention selects 2.5% of the diagonal size of the image. The choice of σ_r is more critical, because it represents the amplitude of what counts as detail. If the range of a signal fluctuation is smaller than σ_r, the fluctuation is regarded as detail, i.e. it is smoothed by the bilateral filter and separated into the detail layer. Conversely, if the range of the fluctuation is larger than σ_r, the detail is well preserved in the base-frequency layer owing to the nonlinear nature of the bilateral filter. Here the invention selects 20% of the gray levels that the human eye can resolve, i.e. 25, as the value of σ_r. This value takes the eye's resolving power for gray scale into account and adapts well to different scenes.
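The following is a minimal, unoptimized NumPy sketch of equations (1)-(2) with the parameter choices described above (σ_s taken as 2.5% of the image diagonal, σ_r = 25); the neighborhood radius and the explicit base/detail split are illustrative assumptions rather than values fixed by the invention.

```python
import numpy as np

def bilateral_filter(img: np.ndarray, sigma_s: float = None, sigma_r: float = 25.0):
    """Direct (slow) implementation of equations (1)-(2): base = sum(gs*gr*I) / k."""
    img = img.astype(np.float64)
    h, w = img.shape
    if sigma_s is None:
        sigma_s = 0.025 * np.hypot(h, w)          # 2.5% of the image diagonal
    radius = int(np.ceil(2 * sigma_s))            # neighbourhood S_{i,j} (assumed extent)

    # Spatial Gaussian template g_s, normalised so its coefficients sum to 1.
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    gs = np.exp(-(x**2 + y**2) / (2 * sigma_s**2))
    gs /= gs.sum()

    padded = np.pad(img, radius, mode='reflect')
    base = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gr = np.exp(-((window - img[i, j]) ** 2) / (2 * sigma_r**2))  # intensity Gaussian
            k = (gs * gr).sum()                   # normalisation coefficient k(i, j)
            base[i, j] = (gs * gr * window).sum() / k
    detail = img - base                           # detail layer = input minus base-frequency layer
    return base, detail
```

Here `base` corresponds to the fundamental-frequency (energy) layer and `detail` to the detail (feature) layer used by the pyramid chromatography below.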
The bilateral filter can well separate the fundamental-frequency information (energy information) and the characteristic information (detail information) in an image, so high-precision image superposition can be achieved during fusion. A traditional bilateral filter performs only a single filtering operation on an ordinary single image; based on its good screening characteristics, the invention therefore combines the bilateral filter with a pyramid hierarchical structure so that the filtering operation can be applied repeatedly, layer by layer.
Fig. 4 is a schematic diagram of the complete pyramid chromatography process applied to a picture. The bottom of the pyramid is the original picture; each layer above it is obtained by performing both detail extraction and energy extraction on the content of the layer below, with the image dimension reduced to one quarter. Fig. 5 and fig. 6 are schematic diagrams of the pyramid components produced by detail extraction and energy extraction, respectively, in the pyramid chromatography process.
The bilateral filter performs the pyramid chromatography processing mainly through the following steps:
starting from the three original spectral images, the following processing is performed for each spectral image: the bottom-layer (original) image is bilaterally filtered once to obtain a fundamental-frequency-layer image and a detail-layer image; the fundamental-frequency-layer image thus obtained is then pyramid-downsampled to obtain the second pyramid layer; the second-layer base image is in turn bilaterally filtered to obtain the fundamental-frequency-layer image and detail-layer image of the second layer; the second-layer fundamental-frequency image is then processed in the same way, and so on, until every layer of the pyramid has been bilaterally filtered.
Since each pyramid downsampling step is a dimensionality-reduction operation that reduces the number of pixels by a factor of four, the number of pyramid layers is determined by the size of the initial image. For example, if the initial image is 640 × 480 and downsampling reduces the pixel count by a factor of 4 each time, the total number of pixels left by the 7th layer is 75, which can no longer be divided by 4; and since by the fifth layer the total pixel count is already down to the order of 4800, where the image content can no longer be distinguished by the naked eye, the total number of pyramid layers can be taken as 5, with other numbers of layers selectable according to the situation. The calculation process is as follows:
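As a small illustration of this layer-count reasoning, the sketch below lists the per-layer pixel counts for a 640 × 480 image; the stopping criteria (even divisibility and a minimum pixel count) are assumptions chosen only to reproduce the kind of calculation described above, not values fixed by the invention.

```python
def pyramid_layer_sizes(width: int = 640, height: int = 480, min_pixels: int = 1000):
    """List (width, height, pixels) per layer when each layer has 1/4 the pixels of the previous one."""
    sizes = [(width, height)]
    while (sizes[-1][0] % 2 == 0 and sizes[-1][1] % 2 == 0
           and (sizes[-1][0] // 2) * (sizes[-1][1] // 2) >= min_pixels):
        w, h = sizes[-1]
        sizes.append((w // 2, h // 2))
    return [(w, h, w * h) for w, h in sizes]

# For 640 x 480 this yields 307200, 76800, 19200, 4800 and 1200 pixels,
# i.e. a total of 5 usable pyramid layers for an image of this size.
print(pyramid_layer_sizes())
```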
setting the original image as G_0 and taking G_0 as the 0th layer of the pyramid, also called the base layer, bilateral filtering followed by row-and-column decimation (keeping every other row and every other column) is carried out on the original image to obtain the first layer of the base-frequency-layer pyramid; low-pass filtering and downsampling are carried out on the first-layer image to obtain the second layer of the base-frequency-layer pyramid; repeating the above process forms the base-frequency-layer pyramid. Assuming that the l-th layer image of the pyramid is G_l, the construction formula of the bilateral filtering pyramid is:

G_l(i,j) = w(2i, 2j, σ) * G_{l−1}(2i, 2j),  (1 ≤ l ≤ N),    (3)

wherein N represents the number of pyramid layers, * denotes convolution, (i, j) are the image coordinates, and w(2i, 2j, σ) represents a bilateral filtering kernel with variance σ;

thus, G_0, G_1, …, G_N constitute the base-frequency-layer pyramid, with G_0 identical to the original image; each layer is in turn 1/4 the size of the layer below it, and the total number of pyramid layers is N + 1. The base-frequency-layer pyramid decomposition of the image is therefore realized by low-pass (bilateral) filtering the lower-layer image and then downsampling the filtered result by a factor of 2 in both rows and columns.
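A compact sketch of this decomposition is given below; OpenCV's bilateralFilter is used as a stand-in for the bilateral kernel w(·,·,σ), and the filter parameters and number of levels are illustrative assumptions.

```python
import cv2
import numpy as np

def bilateral_pyramid(img: np.ndarray, levels: int = 5,
                      d: int = 9, sigma_color: float = 25.0, sigma_space: float = 15.0):
    """Decompose one spectral image into base-frequency layers G_l and detail layers.

    At each level the bilateral filter splits the current image into a
    base-frequency layer and a detail layer (image - base); the base layer is
    then decimated by keeping every other row and column, as in equation (3).
    """
    current = img.astype(np.float32)
    bases, details = [], []
    for _ in range(levels):
        base = cv2.bilateralFilter(current, d, sigma_color, sigma_space)
        details.append(current - base)     # detail (feature) layer of this level
        bases.append(base)                 # base-frequency layer G_l
        current = base[::2, ::2]           # row/column decimation: 1/4 the pixels
    return bases, details
```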
Secondly, the image G_l is interpolated to obtain an image enlarged four times, whose size is the same as that of G_{l−1}; from equation (3), the interpolated and enlarged image G_l* is:

G_l*(i,j) = 4 Σ_{m=−2}^{2} Σ_{n=−2}^{2} w(m,n) · G_l′((i+m)/2, (j+n)/2),  (0 ≤ l ≤ N),    (4)

where

G_l′((i+m)/2, (j+n)/2) = G_l((i+m)/2, (j+n)/2) when (i+m)/2 and (j+n)/2 are both integers, and 0 otherwise.

Let

LP_l = G_l − G_{l+1}*  (0 ≤ l ≤ N−1),  LP_N = G_N,    (5)

The pyramid formed by LP_0, LP_1, …, LP_N is the bilateral filtering pyramid; each layer of image is the difference between the corresponding layer of the base-frequency-layer pyramid and the interpolated, enlarged image of the layer above it, a process equivalent to band-pass filtering, so the bilateral filtering pyramid is also called a band-pass pyramid decomposition. Finally, the obtained base-frequency layers and detail components of each layer are restored by the pyramid reconstruction method to obtain the fused image, as shown in fig. 7.

The formula for pyramid reconstruction is:

G_N = LP_N  (l = N);  G_l = LP_l + G_{l+1}*  (0 ≤ l ≤ N−1).    (6)
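The interpolation, band-pass and reconstruction steps of equations (4)-(6) can be sketched as follows; cv2.resize with bilinear interpolation is used here as a stand-in for the interpolation of equation (4), which is an implementation assumption.

```python
import cv2
import numpy as np

def expand(img: np.ndarray, target_shape) -> np.ndarray:
    """Interpolate a layer back to the size of the layer below it (cf. equation (4))."""
    h, w = target_shape
    return cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)

def band_pass_layers(bases):
    """LP_l = G_l - expand(G_{l+1}) for l < N, and LP_N = G_N (equation (5))."""
    lp = [bases[l] - expand(bases[l + 1], bases[l].shape[:2])
          for l in range(len(bases) - 1)]
    lp.append(bases[-1])
    return lp

def reconstruct(lp):
    """Top-down reconstruction: G_N = LP_N, then G_l = LP_l + expand(G_{l+1}) (equation (6))."""
    current = lp[-1]
    for l in range(len(lp) - 2, -1, -1):
        current = lp[l] + expand(current, lp[l].shape[:2])
    return current
```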
step S4, when the pyramid tomography is finished, three kinds of pyramid information obtained from the three spectrum images need to be fused. As shown in fig. 7.
The fusion method is that the corresponding base layers of the pyramids of the three spectral images are weighted-fused with one another, and the corresponding feature (detail) layers are weighted-fused with one another. Starting from the topmost pyramid layer, one inverse pyramid operation is performed after each weighted fusion, expanding the dimension of the higher layer to that of the next lower layer, with which the weighted fusion calculation is then performed; this continues until all pyramid layers have been calculated, whereupon the original three-spectrum images are fused into a single image containing all the characteristics of the three spectra and output externally.
It should be noted that the three-light camera may comprise a visible-light camera, an infrared camera, an ultraviolet camera, or cameras of other spectral bands. When the pyramid information of the three spectral images is fused, the visible-light image is selected as the base image, and the characteristics of the infrared image and the ultraviolet image are superimposed onto it.
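Putting steps S3-S4 together, the sketch below performs the layer-by-layer weighted fusion with one inverse pyramid operation per layer, working from the top of the pyramids downwards. The weight values are illustrative assumptions (the visible-light pyramid, serving as the base, is simply given the largest weight), and expand() is the same stand-in interpolation shown above.

```python
import cv2
import numpy as np

def expand(img: np.ndarray, target_shape) -> np.ndarray:
    h, w = target_shape
    return cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)

def fuse_three_spectra(lp_vis, lp_ir, lp_uv, w_vis=0.5, w_ir=0.25, w_uv=0.25):
    """Fuse three band-pass pyramids (visible, infrared, ultraviolet) layer by layer.

    All three pyramids are assumed to come from registered images and to have
    identical layer sizes. Starting at the topmost layer, the three spectra are
    weighted-fused; the running result is then expanded (one inverse pyramid
    operation) and combined with the next, larger fused layer, until the
    full-resolution fused image is obtained.
    """
    fused = w_vis * lp_vis[-1] + w_ir * lp_ir[-1] + w_uv * lp_uv[-1]
    for l in range(len(lp_vis) - 2, -1, -1):
        layer = w_vis * lp_vis[l] + w_ir * lp_ir[l] + w_uv * lp_uv[l]
        fused = layer + expand(fused, layer.shape[:2])   # inverse pyramid step
    return fused
```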
The bilateral filtering pyramid-based three-light image intelligent fusion method of the invention is a technology for real-time, high-precision fusion of the images and videos obtained in the three imaging spectra of visible light, infrared light and ultraviolet light. The technology performs the image correction and image fusion processes well on images from different spectral bands: scene parallax and image-size mismatch between images and videos captured by different cameras at the same moment are eliminated, and any two, or all, of the three spectra are fused in real time by the bilateral filtering pyramid algorithm designed for the invention; the fusion strategy is extremely precise and the fusion result is excellent.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and their full range of equivalents.

Claims (6)

1. A bilateral filtering pyramid-based three-light image intelligent fusion method is characterized by comprising the following steps:
step S1, firstly, erecting a three-light camera by using a parallel optical axis design, and making the three-light camera directly face the same scene for video signal acquisition;
step S2, displacement deviation and rotation distortion correction are carried out on the collected three paths of video signals;
the correction method of displacement deviation and rotational distortion adopts an image registration correction algorithm under a Cartesian coordinate system and a polar coordinate system to calculate target characteristic points in the images, simultaneously calculates pixel positions where the same characteristic points appear in the three spectral images, and adjusts the angular positions and the sizes of the three spectral images to the same dimensionality by utilizing translation scaling calculation of the Cartesian coordinate system and rotation factor calculation under the polar coordinate system so as to perform subsequent fusion calculation;
step S3, carrying out bilateral-filter pyramid chromatography processing, namely, respectively carrying out pyramid downsampling layering on the image results obtained by the bilateral filter and applying a bilateral filtering operation to each layered result, so that, through repeated layer-by-layer chromatography operations, all the characteristic information in the image can be completely extracted for the fusion calculation;
the calculation formula of the bilateral filter is as follows:

J(i,j) = (1 / k(i,j)) · Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)) · I(i′,j′),    (1)

where k(i,j) represents the normalization coefficient:

k(i,j) = Σ_{(i′,j′) ∈ S_{i,j}} g_s(i − i′, j − j′) · g_r(I(i′,j′) − I(i,j)),    (2)

here, the symbol (i′,j′) ∈ S_{i,j} indicates that (i′,j′) and (i,j) are adjacent elements in the image; in general g_s is chosen to be a normalized Gaussian kernel, i.e. the sum of all spatial coefficients in the bilateral filter is 1; furthermore, a Gaussian kernel function is also adopted in the intensity domain; the overall weight of this template is obtained by multiplying the results of the two Gaussian templates of the spatial domain and the intensity domain, and k(i,j) should range between 0 and 1;
step S4, when the pyramid chromatography is finished, the three sets of pyramid information obtained from the three spectral images need to be fused; the fusion method is that the corresponding base layers of the pyramids of the three spectral images are weighted-fused with one another, and the corresponding feature (detail) layers are weighted-fused with one another; starting from the topmost pyramid layer, one inverse pyramid operation is performed after each weighted fusion, expanding the dimension of the higher layer to that of the next lower pyramid layer, with which the weighted fusion calculation is then performed, until all pyramid layers have been calculated; at this point the original three-spectrum images are fused into a fused image containing all the characteristics of the three spectra and output externally, so that a user can obtain all the target characteristics in a scene only by observing this image;
wherein, in steps S3-S4, assuming that the l-th layer image of the pyramid is G_l, the construction formula of the bilateral filtering pyramid is:

G_l(i,j) = w(2i, 2j, σ) * G_{l−1}(2i, 2j),  (1 ≤ l ≤ N),    (3);

wherein N represents the number of pyramid layers, * denotes convolution, (i, j) are the image coordinates, and w(2i, 2j, σ) represents a bilateral filtering kernel with variance σ;

thus, G_0, G_1, …, G_N constitute the base-frequency-layer pyramid, with G_0 identical to the original image; each layer is in turn 1/4 the size of the layer below it, and the total number of pyramid layers is N + 1;

the image G_l is interpolated to obtain an image enlarged four times, whose size is the same as that of G_{l−1}; from equation (3), the interpolated and enlarged image G_l* is:

G_l*(i,j) = 4 Σ_{m=−2}^{2} Σ_{n=−2}^{2} w(m,n) · G_l′((i+m)/2, (j+n)/2),  (0 ≤ l ≤ N),    (4);

where G_l′((i+m)/2, (j+n)/2) = G_l((i+m)/2, (j+n)/2) when (i+m)/2 and (j+n)/2 are both integers, and 0 otherwise;

let

LP_l = G_l − G_{l+1}*  (0 ≤ l ≤ N−1),  LP_N = G_N,    (5);

the pyramid formed by LP_0, LP_1, …, LP_N is the bilateral filtering pyramid; each layer of image is the difference between the corresponding layer of the base-frequency-layer pyramid and the interpolated, enlarged image of the layer above it, a process equivalent to band-pass filtering, so the bilateral filtering pyramid is also called a band-pass pyramid decomposition; and finally, the obtained base-frequency layers and detail components of each layer are restored by the pyramid reconstruction method to obtain the fused image.
2. The bilateral filtering pyramid-based three-light image intelligent fusion method of claim 1, wherein: in step S2, the three spectral images are scaled to the same size and then corrected when the displacement deviation and the rotational distortion are corrected.
3. The bilateral filtering pyramid-based three-light image intelligent fusion method of claim 1, wherein: in step S3, the main steps of the bilateral filter in performing pyramid chromatography are:
starting from the three original spectral images, the following processing is performed for each spectral image: the bottom-layer (original) image is bilaterally filtered once to obtain a fundamental-frequency-layer image and a detail-layer image; the fundamental-frequency-layer image thus obtained is then pyramid-downsampled to obtain the second pyramid layer; the second-layer base image is in turn bilaterally filtered to obtain the fundamental-frequency-layer image and detail-layer image of the second layer; the second-layer fundamental-frequency image is then processed in the same way, and so on, until every layer of the pyramid has been bilaterally filtered.
4. The bilateral filtering pyramid-based three-light image intelligent fusion method of claim 1, wherein: in step S4, when the pyramid information of the three spectrum images is fused, the base image is selected as a visible light image, and the characteristics of the infrared light image and the ultraviolet light image are to be superimposed.
5. The bilateral filtering pyramid-based three-light image intelligent fusion method of claim 1, wherein: the bilateral filter is an effective nonlinear filter that distinguishes features from noise in an image.
6. The bilateral filtering pyramid-based three-light image intelligent fusion method of claim 1, wherein: the formula for pyramid reconstruction is:
G_N = LP_N  (l = N);  G_l = LP_l + G_{l+1}*  (0 ≤ l ≤ N−1).
CN201810118085.1A 2018-02-06 2018-02-06 Three-light image intelligent fusion method based on bilateral filtering pyramid Active CN108399612B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810118085.1A CN108399612B (en) 2018-02-06 2018-02-06 Three-light image intelligent fusion method based on bilateral filtering pyramid
PCT/CN2018/096023 WO2019153651A1 (en) 2018-02-06 2018-07-17 Bilateral filter pyramid based three-light image intelligent fusion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810118085.1A CN108399612B (en) 2018-02-06 2018-02-06 Three-light image intelligent fusion method based on bilateral filtering pyramid

Publications (2)

Publication Number Publication Date
CN108399612A CN108399612A (en) 2018-08-14
CN108399612B true CN108399612B (en) 2022-04-05

Family

ID=63095287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810118085.1A Active CN108399612B (en) 2018-02-06 2018-02-06 Three-light image intelligent fusion method based on bilateral filtering pyramid

Country Status (2)

Country Link
CN (1) CN108399612B (en)
WO (1) WO2019153651A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814511B (en) * 2019-04-10 2021-02-23 青岛大学附属医院 Identification method based on customized data processing
CN110400262B (en) * 2019-04-10 2020-11-06 诸暨良嘉环保科技咨询有限公司 Identification device based on customized data processing
CN110956592B (en) * 2019-11-14 2023-07-04 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and storage medium
CN111292284B (en) * 2020-02-04 2024-03-01 淮阴师范学院 Color image fusion method based on dual-tree-quaternion wavelet transformation
CN111462025B (en) * 2020-02-26 2023-04-07 宁波大学 Infrared and visible light image fusion method based on multi-scale low-rank matrix decomposition
CN111462032B (en) * 2020-03-31 2023-03-31 北方夜视技术股份有限公司 Method for fusing uncooled infrared image and solar blind ultraviolet image and application
CN111652818B (en) * 2020-05-29 2023-09-29 浙江大华技术股份有限公司 Pyramid-based image filtering method, pyramid-based image filtering device and storage medium
CN112116632B (en) * 2020-09-21 2023-12-05 中国科学院长春光学精密机械与物理研究所 Method, device and medium for tracking target along tail smoke of target
CN112135068A (en) * 2020-09-22 2020-12-25 视觉感知(北京)科技有限公司 Method and device for fusion processing of multiple input videos
CN112465705B (en) * 2020-12-08 2022-08-19 福州大学 Visual field expanding system and method based on two-aperture rotating biprism
CN113643219B (en) * 2021-08-03 2023-11-24 武汉三江中电科技有限责任公司 Image imaging method and device based on three-light fusion
CN114815760B (en) * 2022-06-27 2022-09-09 天津德通电气股份有限公司 Tracing disposal method of safety production tracing disposal system
CN116681633B (en) * 2023-06-06 2024-04-12 国网上海市电力公司 Multi-band imaging and fusion method
CN116996675B (en) * 2023-09-27 2023-12-19 河北天英软件科技有限公司 Instant messaging system and information processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376546A (en) * 2014-10-27 2015-02-25 北京环境特性研究所 Method for achieving three-path image pyramid fusion algorithm based on DM642
CN107607202A (en) * 2017-08-31 2018-01-19 江苏宇特光电科技股份有限公司 Three light merge intelligent imager and its method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8928964B1 (en) * 2012-01-30 2015-01-06 Softronics, Ltd. Three-dimensional image display
CN104182955B (en) * 2014-09-05 2016-09-14 西安电子科技大学 Image interfusion method based on steerable pyramid conversion and device thereof
CN105654448B (en) * 2016-03-29 2018-11-27 微梦创科网络科技(中国)有限公司 A kind of image interfusion method and system based on bilateral filtering and weight reconstruction

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376546A (en) * 2014-10-27 2015-02-25 北京环境特性研究所 Method for achieving three-path image pyramid fusion algorithm based on DM642
CN107607202A (en) * 2017-08-31 2018-01-19 江苏宇特光电科技股份有限公司 Three light merge intelligent imager and its method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Laplacian Pyramid Blending of images (图像拉普拉斯金字塔融合); Anonymous (佚名); http://www.360doc.com/content/12/0911/09/10724725_235476158.shtml; 2012-09-11; full text of the web page *
Image fusion method based on mixed bilateral and Gaussian filtering decomposition (基于双边与高斯滤波混合分解的图像融合方法); Liu Zhiqiang (刘志强); Systems Engineering and Electronics (系统工程与电子技术); 2015-09-17; Vol. 38, No. 01; pp. 8-13 *
Research on multi-sensor image fusion methods (多传感器图像融合方法研究); Liu Guixi (刘贵喜); China Excellent Doctoral and Master's Theses Full-text Database (Doctoral), Information Science and Technology (中国优秀博硕士论文全文数据库(博士) 信息科技辑); 2002-06-15; No. 01; pp. 35-61 *

Also Published As

Publication number Publication date
CN108399612A (en) 2018-08-14
WO2019153651A1 (en) 2019-08-15

Similar Documents

Publication Publication Date Title
CN108399612B (en) Three-light image intelligent fusion method based on bilateral filtering pyramid
WO2021120406A1 (en) Infrared and visible light fusion method based on saliency map enhancement
CN109377469B (en) Processing method, system and storage medium for fusing thermal imaging with visible light image
EP2779624B1 (en) Apparatus and method for multispectral imaging with three-dimensional overlaying
CN107976257B (en) A kind of image display method of infrared thermal imager, device and infrared thermal imager
WO2014185064A1 (en) Image processing method and system
US11398053B2 (en) Multispectral camera external parameter self-calibration algorithm based on edge features
CN106657789A (en) Thread panoramic image synthesis method
CN106952225B (en) Panoramic splicing method for forest fire prevention
WO2016205419A1 (en) Contrast-enhanced combined image generation systems and methods
US10726531B2 (en) Resolution enhancement of color images
CN112184604A (en) Color image enhancement method based on image fusion
CN112560619A (en) Multi-focus image fusion-based multi-distance bird accurate identification method
CN110880191B (en) Infrared stereo camera dynamic external parameter calculation method based on histogram equalization
CN114265427A (en) Inspection unmanned aerial vehicle auxiliary navigation system and method based on infrared image matching
CN110910457B (en) Multispectral three-dimensional camera external parameter calculation method based on angular point characteristics
CN110580684A (en) image enhancement method based on black-white-color binocular camera
CN109029380B (en) Stereo visual system and its calibration distance measuring method based on film coated type multispectral camera
CN113066011B (en) Image processing method, device, system, medium and electronic equipment
CN217181103U (en) Monitoring system of high-voltage switch cabinet
Kim et al. An efficient correction method of wide-angle lens distortion for surveillance systems
CN115222785A (en) Infrared and visible light image registration method based on binocular calibration
CN114782502A (en) Multispectral multi-sensor cooperative processing method and device and storage medium
CN114972625A (en) Hyperspectral point cloud generation method based on RGB spectrum super-resolution technology
Cho et al. Improvement on Demosaicking in Plenoptic Cameras by Use of Masking Information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant