WO2023125087A1 - Image processing method and related apparatus - Google Patents

Image processing method and related apparatus

Info

Publication number: WO2023125087A1
Application number: PCT/CN2022/139807
Authority: WO (WIPO PCT)
Prior art keywords: image, infrared, infrared image, neural network, aforementioned
Other languages: French (fr), Chinese (zh)
Inventors: 罗谌持, 方光祥
Original assignee: Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority claimed from: CN202210207925.8A (CN116433729A)
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023125087A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10048: Infrared image
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20212: Image combination
    • G06T 2207/20221: Image fusion; Image merging

Definitions

  • the present application relates to the technical field of image processing, and in particular to an image processing method and related devices.
  • the image collected by a color camera based on visible light imaging is usually blurry, while the image collected by an infrared camera based on infrared imaging is clear but lacks color (the image captured by an infrared camera is usually a grayscale image).
  • the industry usually adopts a technical solution combining visible light and infrared light imaging.
  • the technical scheme combining visible light and infrared light imaging is usually as follows: visible light and infrared light are imaged separately to obtain a visible light image and an infrared image, and the visible light image and the infrared image are then fused to obtain the final output fused image.
  • before fusing multiple images, image registration based on feature matching usually needs to be performed on them first; however, this kind of registration scheme is usually of low accuracy when registering visible light images with infrared images, resulting in poor clarity of the image obtained by fusing the visible light image and the infrared image.
  • the present application discloses an image processing method and a related device, which can obtain a clear color image based on fusion of an obtained infrared image and a visible light image.
  • the present application provides an image processing method, the method comprising:
  • the target infrared image is calculated based on the aforementioned first infrared image and the aforementioned second infrared image, where the aforementioned target infrared image includes a third infrared image and/or a fourth infrared image; the aforementioned third infrared image is an image obtained by converting the aforementioned first infrared image from the aforementioned time T0 to the aforementioned time T1, and the aforementioned fourth infrared image is an image obtained by converting the aforementioned second infrared image from the aforementioned time T2 to the aforementioned time T1;
  • the aforementioned target infrared image and the aforementioned visible light image are fused to obtain a fused image.
  • This application first acquires the first infrared image, the visible light image and the second infrared image (the three images are obtained by the photographing device shooting the same scene at times T0, T1 and T2 respectively), and then, based on a conversion of the two captured infrared images, obtains the infrared image corresponding to the moment at which the visible light image was captured, so as to realize registration of the captured infrared images and the visible light image.
  • This application realizes the registration of the infrared image and the visible light image through conversion processing of infrared images of the same modality (based on the two captured infrared images), which improves the registration accuracy of the infrared image and the visible light image, so that the final fused image is clearer. In particular, clear and natural color images can be obtained in low-light environments. A minimal sketch of the overall pipeline is given below.
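  • For orientation only, here is a minimal runnable sketch of the pipeline; compute_target_infrared and fuse are hypothetical placeholder names standing in for the steps detailed in the remainder of this publication:

```python
def compute_target_infrared(first_ir, second_ir, t0, t1, t2):
    # Placeholder for the conversion step detailed below (optical flow
    # scaling plus reverse mapping); identity here only to keep the sketch runnable.
    return first_ir

def fuse(target_ir, visible):
    # Placeholder for the fusion step detailed below (color from the visible
    # image, texture from the registered infrared image).
    return visible

def process(first_ir, visible, second_ir, t0, t1, t2):
    """Images of the same scene captured at times t0 < t1 < t2."""
    # Convert the captured infrared images to the visible image's moment t1;
    # this registers the infrared content against the visible light image.
    target_ir = compute_target_infrared(first_ir, second_ir, t0, t1, t2)
    # Fuse color (visible image) with texture (registered infrared image).
    return fuse(target_ir, visible)
```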
  • In a possible implementation, the aforementioned calculation of the target infrared image based on the aforementioned first infrared image and the aforementioned second infrared image includes: using the optical flow method, the aforementioned target infrared image is calculated according to the relationship among the aforementioned T0, T1 and T2 as well as the aforementioned first infrared image and the aforementioned second infrared image.
  • In a possible implementation, the aforementioned target infrared image includes the aforementioned third infrared image, and the calculation includes: an optical flow F3 is obtained according to the relationship among the aforementioned T0, T1 and T2 as well as the aforementioned first infrared image and the aforementioned second infrared image, and the aforementioned third infrared image is obtained by performing optical flow reverse mapping based on the aforementioned optical flow F3 and the aforementioned first infrared image.
  • It can be seen that this application uses the optical flow method to convert a captured infrared image into the corresponding infrared image at the moment when the visible light image is captured, so as to realize registration of the infrared image and the visible light image.
  • The registration of the infrared image and the visible light image is realized by processing infrared images of the same modality (based on the two captured infrared images), which improves the registration accuracy of the infrared image and the visible light image. A sketch of how such a time-scaled optical flow could be computed is given below.
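  • The publication does not give the exact formula for F3; a common linear-motion approximation borrowed from frame interpolation (an assumption here, not the patented method) scales the measured flow between the two infrared frames by the elapsed-time fraction:

```python
def approximate_f3(f1, t0, t1, t2):
    """f1: H x W x 2 optical flow from the first infrared image (time t0) to
    the second infrared image (time t2). Returns an approximation of F3, a
    flow from the t1 pixel grid back to the first infrared image, assuming
    roughly constant motion between t0 and t2."""
    tau = (t1 - t0) / (t2 - t0)   # fraction of the t0 -> t2 interval elapsed at t1
    return -tau * f1
```

  • The third infrared image would then be obtained by reverse-mapping the first infrared image with F3, as in the reverse-mapping sketch given later in this text.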
  • In a possible implementation, fusing the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image includes: extracting color features of the aforementioned visible light image and texture features of the aforementioned target infrared image, and fusing them to obtain the aforementioned fused image.
  • That is, the visible light image contributes the color features, the infrared image obtained after the above conversion contributes the texture features, and image fusion based on these color features and texture features can yield a clear and natural color image; a naive illustrative sketch of this division of labor follows.
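  • The publication performs this fusion with a neural network (described next). Purely to illustrate how color can come from the visible image while texture comes from the registered infrared image, here is a minimal non-neural sketch, assuming target_ir_gray is the registered 8-bit infrared image and visible_bgr is a same-sized 8-bit color image (both names are hypothetical):

```python
import cv2

def naive_fuse(target_ir_gray, visible_bgr):
    """Illustrative (non-neural) fusion: chrominance from the visible image,
    luminance (texture) from the registered infrared image."""
    ycrcb = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = target_ir_gray   # replace luminance with the IR texture
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```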
  • the aforementioned fusion of the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image includes:
  • the aforementioned target infrared image and the aforementioned visible light image are fused through the first neural network to obtain the aforementioned fused image;
  • the aforementioned first neural network includes a color extraction neural network and a texture extraction neural network; the aforementioned color extraction neural network is used to extract the color features of the aforementioned visible light image, and the aforementioned texture extraction neural network is used to extract the texture features of the aforementioned target infrared image.
  • the trained neural network is used to extract the color features of visible light images and the texture features of infrared images, which can make the extracted features more accurate and make the fused images more natural and clear.
  • In a possible implementation, the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold; the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
  • In this way, the computing power requirement of the entire first neural network is relatively low; compared with existing neural networks with high computing power requirements (for example, the U-Net neural network), the solution of this application has better hardware adaptability.
  • In addition, the visible light image input to the color extraction neural network may be a downsampled image. Part of the noise is eliminated during downsampling, so the noise in the downsampled image is reduced; therefore, using a low-resolution color extraction neural network can enhance robustness to noise. A sketch of such a two-branch network is given below.
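  • A minimal sketch of such a two-branch fusion network follows. The layer counts, channel widths and 4x downsampling factor are illustrative assumptions, not the patented design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNet(nn.Module):
    """Deep, low-resolution color branch plus shallow, full-resolution
    texture branch, as described above."""
    def __init__(self):
        super().__init__()
        # Color branch: more layers, run on a downsampled visible image.
        self.color = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Texture branch: few layers, run at full infrared resolution.
        self.texture = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32 + 16, 3, 3, padding=1)

    def forward(self, visible, target_ir):
        # Downsampling the visible input also suppresses part of its noise.
        vis_small = F.interpolate(visible, scale_factor=0.25,
                                  mode='bilinear', align_corners=False)
        color = self.color(vis_small)
        color = F.interpolate(color, size=target_ir.shape[-2:],
                              mode='bilinear', align_corners=False)
        texture = self.texture(target_ir)
        return torch.sigmoid(self.head(torch.cat([color, texture], dim=1)))
```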
  • the aforementioned method is implemented by an image processing model, and the aforementioned image processing model includes the aforementioned first neural network and the second neural network;
  • the aforementioned second neural network is used to obtain the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image; the aforementioned optical flow F1 and the aforementioned optical flow F2 are used to calculate the aforementioned target infrared image;
  • the aforementioned first neural network and the aforementioned second neural network included in the aforementioned image processing model are obtained through end-to-end training.
  • the end-to-end training can make the first neural network tolerant to errors in the optical flow calculated by the second neural network (the neural network that extracts the optical flow), and finally makes the trained image processing model more robust.
  • the training images of the aforementioned image processing model are collected in an environment where the illuminance is lower than a preset illuminance threshold.
  • the image processing model trained in this way can learn the dark-area details, color and texture of images, so that a clear color image can be obtained by fusion even in a low-light environment. A sketch of such training follows.
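  • A hedged sketch of the end-to-end training described above; FlowNet, warp_to_t1, loader (yielding low-light triples plus a clear reference image) and the L1 loss are all assumptions for illustration, with FusionNet taken from the earlier sketch:

```python
import torch
import torch.nn.functional as F

flow_net, fusion_net = FlowNet(), FusionNet()   # second and first neural networks
opt = torch.optim.Adam(
    list(flow_net.parameters()) + list(fusion_net.parameters()), lr=1e-4)

for ir0, vis, ir2, reference in loader:         # low-light training samples
    f1, f2 = flow_net(ir0, ir2)                 # flows between the two IR frames
    target_ir = warp_to_t1(ir0, ir2, f1, f2)    # conversion to the moment T1
    fused = fusion_net(vis, target_ir)
    loss = F.l1_loss(fused, reference)
    opt.zero_grad()
    loss.backward()                             # gradients reach both networks,
    opt.step()                                  # so fusion learns to tolerate flow error
```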
  • the present application provides a photographing device, which includes a lens, a dimmer, a driving module and an imaging module; the dimmer is located between the lens and the imaging module, and the driving module is connected to the dimmer;
  • the aforementioned lens is used to gather the light incident on the aforementioned lens onto the aforementioned dimming sheet;
  • the aforementioned dimmer includes an infrared bandpass filter, an infrared cut filter and a light-shielding sheet; the aforementioned infrared bandpass filter is used to allow infrared light to pass through and to filter out visible light, the aforementioned infrared cut filter is used to allow visible light to pass through and to filter out infrared light, and the aforementioned light-shielding sheet is used to prevent light from passing through;
  • the aforementioned driving module is used to drive the movement of the aforementioned dimmer, so that the light collected on the aforementioned dimmer is incident on the aforementioned infrared bandpass filter during a first period, is incident on the aforementioned infrared cut filter during a second period, and is incident on the aforementioned light-shielding sheet during a third period and a fourth period;
  • the aforementioned imaging module is used to receive, during the aforementioned first period, the infrared light passing through the aforementioned infrared bandpass filter, and to obtain a first infrared image based on the received infrared light during the aforementioned third period; and to receive, during the aforementioned second period, the visible light passing through the aforementioned infrared cut filter, and to obtain a visible light image based on the received visible light during the aforementioned fourth period; the aforementioned first period, second period, third period and fourth period do not overlap.
  • In this way, the dimmer can flexibly adjust its own direction, speed and mode of motion (via the driving module) according to actual business needs, so that the light collected on the dimmer hits the corresponding filter or light-shielding sheet in the corresponding time period; different filters can thus be selected in different time periods as required. That is, the photographing device can capture the corresponding infrared images or visible light images at the corresponding moments according to actual needs, which makes shooting very convenient and flexible.
  • In a possible implementation, the aforementioned dimmer is circular, and the aforementioned infrared bandpass filter, infrared cut filter and light-shielding sheet are each fan-shaped; the aforementioned driving module is used to drive the aforementioned dimmer to rotate.
  • In a possible implementation, the aforementioned dimmer is polygonal, and the aforementioned infrared bandpass filter, infrared cut filter and light-shielding sheet are each triangular; the aforementioned driving module is used to drive the aforementioned dimmer to rotate.
  • In a possible implementation, the aforementioned dimmer is rectangular, and the aforementioned infrared bandpass filter, infrared cut filter and light-shielding sheet are each rectangular; the aforementioned driving module is used to drive the aforementioned dimmer to move.
  • the aforementioned infrared bandpass filter is adjacent to the aforementioned light-shielding sheet, and the aforementioned infrared cut-off filter is adjacent to the aforementioned light-shielding sheet.
  • the length of the first time period indicates the exposure time of the first infrared image
  • the length of the second time period indicates the exposure time of the visible light image
  • In a possible implementation, the length of the aforementioned first period is related to the size of the first light-shielding sheet, and the aforementioned first light-shielding sheet is adjacent to the aforementioned infrared bandpass filter.
  • In a possible implementation, the light collected on the aforementioned dimmer is incident on the first infrared cut filter during the aforementioned second period; the length of the aforementioned second period is related to the size of the second light-shielding sheet, and the aforementioned second light-shielding sheet is adjacent to the aforementioned first infrared cut filter. The aforementioned first infrared cut filter is one of the aforementioned at least one infrared cut filter, the aforementioned second light-shielding sheet is one of the aforementioned at least one light-shielding sheet, and the aforementioned first light-shielding sheet may be the same as or different from the aforementioned second light-shielding sheet.
  • the length of the aforementioned first period or the length of the aforementioned second period is related to the moving speed of the aforementioned dimmer.
  • the moving speed of the dimmer is controlled by the driving module.
  • the end time of the first period is the start time of the third period; the end time of the second period is the start time of the fourth period.
  • In a possible implementation, the end time of the aforementioned first period is T0, and the end time of the aforementioned second period is T1;
  • the aforementioned driving module is also used to drive the movement of the aforementioned dimmer, so that the light gathered on the aforementioned dimmer is incident on the aforementioned infrared bandpass filter during a fifth period, so that the aforementioned imaging module obtains a second infrared image;
  • the end time of the aforementioned fifth period is T2, where T0 < T1 < T2; the aforementioned first infrared image, the aforementioned visible light image and the aforementioned second infrared image are obtained by the photographing device shooting the same scene.
  • the foregoing photographing device further includes a processor, and the foregoing processor is configured to execute the method described in any one of the foregoing first aspects.
  • the aforementioned dimmer includes two of the aforementioned infrared bandpass filters, one of the aforementioned infrared cut filters and at least two of the aforementioned light-shielding sheets.
  • the aforementioned one infrared cut filter is located between the aforementioned two infrared bandpass filters.
  • an image processing device of the present application includes:
  • An acquisition unit configured to acquire a first infrared image, a visible light image and a second infrared image, where the aforementioned first infrared image, the aforementioned visible light image and the aforementioned second infrared image are obtained by the photographing device shooting the same scene at times T0, T1 and T2 respectively, with T0 < T1 < T2;
  • a computing unit configured to calculate a target infrared image based on the aforementioned first infrared image and the aforementioned second infrared image, where the aforementioned target infrared image includes a third infrared image and/or a fourth infrared image; the aforementioned third infrared image is an image obtained by converting the aforementioned first infrared image from the aforementioned time T0 to the aforementioned time T1, and the aforementioned fourth infrared image is an image obtained by converting the aforementioned second infrared image from the aforementioned time T2 to the aforementioned time T1;
  • the fusion unit is configured to fuse the aforementioned target infrared image and the aforementioned visible light image to obtain a fusion image.
  • In a possible implementation, the aforementioned computing unit is specifically configured to: using the optical flow method, calculate the aforementioned target infrared image according to the relationship among the aforementioned T0, T1 and T2 as well as the aforementioned first infrared image and the aforementioned second infrared image.
  • the aforementioned target infrared image includes the aforementioned third infrared image; the aforementioned computing unit is specifically configured to:
  • the aforementioned third infrared image is obtained by performing optical flow reverse mapping based on the aforementioned optical flow F3 and the aforementioned first infrared image.
  • In a possible implementation, the aforementioned fusion unit is specifically configured to: extract color features of the aforementioned visible light image and texture features of the aforementioned target infrared image, and fuse them to obtain the aforementioned fused image.
  • the aforementioned fusion unit is specifically used for:
  • the aforementioned target infrared image and the aforementioned visible light image are fused through the first neural network to obtain the aforementioned fused image;
  • the aforementioned first neural network includes a color extraction neural network and a texture extraction neural network; the aforementioned color extraction neural network is used to extract the color features of the aforementioned visible light image, and the aforementioned texture extraction neural network is used to extract the texture features of the aforementioned target infrared image.
  • the resolution of the color extraction neural network is not higher than a preset first resolution threshold and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold.
  • the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
  • the operations performed by the aforementioned device are realized by an image processing model, and the aforementioned image processing model includes the aforementioned first neural network and the second neural network;
  • the aforementioned second neural network is used to obtain the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image; the aforementioned optical flow F1 and the aforementioned optical flow F2 are used to calculate the aforementioned target infrared image;
  • the aforementioned first neural network and the aforementioned second neural network included in the aforementioned image processing model are obtained through end-to-end training.
  • the training images of the aforementioned image processing model are collected in an environment where the illuminance is lower than a preset illuminance threshold.
  • the present application provides an image processing device, including a processor and a memory, configured to implement the method described in the above first aspect and its possible implementation manners.
  • the memory is coupled to the processor, and when the processor executes the computer program stored in the memory, the image processing apparatus can implement the method described in the first aspect or any possible implementation manner of the first aspect.
  • the device may further include a communication interface, which is used for the device to communicate with other devices.
  • the communication interface may be a transceiver, circuit, bus, module or other type of communication interface.
  • the communication interface includes a receiving interface and a sending interface, the receiving interface is used for receiving messages, and the sending interface is used for sending messages.
  • the device may include:
  • the target infrared image is calculated based on the aforementioned first infrared image and the aforementioned second infrared image, where the aforementioned target infrared image includes a third infrared image and/or a fourth infrared image; the aforementioned third infrared image is an image obtained by converting the aforementioned first infrared image from the aforementioned time T0 to the aforementioned time T1, and the aforementioned fourth infrared image is an image obtained by converting the aforementioned second infrared image from the aforementioned time T2 to the aforementioned time T1;
  • the aforementioned target infrared image and the aforementioned visible light image are fused to obtain a fused image.
  • the computer program in the memory in this application can be stored in advance or can be stored after being downloaded from the Internet when using the device.
  • This application does not specifically limit the source of the computer program in the memory.
  • the coupling in the embodiments of the present application is an indirect coupling or connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the present application provides a computer-readable storage medium; the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the method described in the above first aspect and any one of its possible implementation manners can be realized.
  • the present application provides a computer program product, including a computer program. When the computer program is executed by a processor, the computer is made to perform the method described in any one of the above first aspects.
  • the devices described in the third aspect and the fourth aspect provided above, the computer storage medium described in the fifth aspect, and the computer program product described in the sixth aspect are all used to implement the method provided in any one of the above first aspects. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding method, which will not be repeated here.
  • FIG. 1 is a schematic diagram of the photographing device and its photographing principle provided by an embodiment of the present application;
  • FIG. 2A, FIG. 2B and FIG. 2C are schematic structural diagrams of the dimmer provided by an embodiment of the present application;
  • FIG. 2D is a schematic diagram of image shooting moments provided by an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of a dimmer provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of the field-of-view space of the photographing device provided by an embodiment of the present application;
  • FIG. 6A and FIG. 6B are schematic structural diagrams of the image fusion neural network provided by an embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of an image processing model provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of images captured at different moments by the photographing device provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of a logical structure of an image processing device provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
  • Optical flow expresses the change of the image, and because it contains the information of the target's motion, it can be used by the observer to determine the motion of the target.
  • optical flow is the instantaneous speed of pixel movement of spatially moving objects on the imaging plane.
  • optical flow is generated due to the movement of the foreground object itself in the scene, the movement of the camera, or the joint movement of both.
  • optical flow can also be expressed numerically.
  • For example, the optical flow from a certain pixel of the first image to the corresponding pixel of the second image may be expressed as (u, v). Suppose the first image and the second image are in the same pixel coordinate system and, for ease of understanding, that the time difference between the shooting moments of the two images is very short (so it can be regarded as a unit time difference). Then u represents the number of pixels by which the pixel in the second image has moved in the horizontal direction relative to the corresponding pixel in the first image, and v represents the number of pixels by which it has moved in the vertical direction.
  • the optical flow method uses the changes of pixels of an image sequence in the time domain and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, thereby calculating the motion information of objects between adjacent frames. A concrete example using a common dense optical flow implementation follows.
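  • As one concrete instance of the optical flow method, OpenCV's Farneback algorithm estimates, for every pixel of the previous frame, a (u, v) displacement toward the current frame; the file names here are hypothetical:

```python
import cv2

prev_gray = cv2.imread('ir_t0.png', cv2.IMREAD_GRAYSCALE)  # earlier infrared frame
curr_gray = cv2.imread('ir_t2.png', cv2.IMREAD_GRAYSCALE)  # later infrared frame

# Positional arguments: pyr_scale=0.5, levels=3, winsize=15, iterations=3,
# poly_n=5, poly_sigma=1.2, flags=0.
flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
# flow[y, x] = (u, v): horizontal and vertical displacement of pixel (x, y).
```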
  • Optical flow reverse mapping is an implementation of the optical flow method.
  • Optical flow reverse mapping usually refers to the following: the optical flow from a first image to a second image and the second image itself are known, and the first image is obtained by converting the second image through that optical flow.
  • For example, suppose the optical flow from a certain pixel point A of the first image to the pixel point A of the second image is expressed as (u, v), and the coordinates of pixel point A in the second image are (x1, y1). If the coordinates of pixel point A in the first image are denoted (x2, y2), then (x1, y1) = (x2 + u, y2 + v); that is, the value of pixel point A in the first image can be recovered by sampling the second image at the position offset by the optical flow.
  • In addition, optical flow reverse mapping also involves other conventional processing details, such as interpolation, which will not be described here; a minimal sketch is given below.
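  • A minimal sketch of optical flow reverse mapping, using bilinear interpolation via OpenCV's remap:

```python
import cv2
import numpy as np

def reverse_map(second_image, flow):
    """Given the flow from a first image to a second image (defined on the
    first image's pixel grid) and the second image, reconstruct the first
    image by sampling the second image at the flow-displaced coordinates."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)  # x + u
    map_y = (grid_y + flow[..., 1]).astype(np.float32)  # y + v
    return cv2.remap(second_image, map_x, map_y, cv2.INTER_LINEAR)
```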
  • a heteromodal image generally refers to multiple images obtained by imaging separately based on light of different spectra (or frequencies).
  • the spectra (or frequencies) of infrared light and visible light are different: infrared images are obtained based on infrared light imaging, and visible light images are obtained based on visible light imaging, so the infrared image and the visible light image are images of different modalities.
  • the expressions of heteromodal images are usually different.
  • an infrared image is usually a grayscale image, and the color of a pixel in the grayscale image is represented by a grayscale value, and the grayscale value ranges from 0 to 255.
  • the visible light image is a color image, and the colors of the pixels of the color image are represented by at least three color values. Each of the three color values ranges from 0 to 255.
  • Image registration usually refers to the process of matching and superimposing two or more acquired images, and is usually implemented based on feature matching. Taking the registration of two images as an example, the process is as follows: first, feature extraction is performed on the two images to be registered to obtain feature points, and matching feature point pairs are found by similarity measurement; then, the image-space coordinate transformation parameters are obtained from the matched feature point pairs; finally, the registration of the two images is completed using these coordinate transformation parameters. A sketch of this procedure is given below.
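  • For reference, here is a sketch of such feature-matching registration; ORB features with a RANSAC homography are one common concrete choice, used purely for illustration:

```python
import cv2
import numpy as np

def register_by_features(img_a, img_b):
    orb = cv2.ORB_create()
    kp_a, des_a = orb.detectAndCompute(img_a, None)    # feature extraction
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)              # similarity measurement
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC)    # coordinate transform params
    h, w = img_b.shape[:2]
    return cv2.warpPerspective(img_a, H, (w, h))       # complete the matching
```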
  • a shooting technology solution combining visible light and infrared light imaging may be used.
  • the scheme obtains a visible light image based on visible light imaging and an infrared image based on infrared light imaging, and then fuses the visible light image and the infrared image to obtain a final output fusion image.
  • the fused image can combine the color information of the visible light image and the texture information of the infrared image, so the fused image should be a relatively clear color image.
  • However, the infrared image and the visible light image are images of different modalities, and the expressions of pixels in images of different modalities are usually different; therefore, when the prior-art method based on feature matching is used to register heteromodal images, the accuracy is usually low. Especially when there is relatively fast movement between the camera and the object being photographed, the registration accuracy of heteromodal images based on feature matching is further reduced. The low registration accuracy of the visible light image and the infrared image causes problems such as ghosting in the final fused image; that is, the definition of the final fused image is still poor.
  • the present application provides a photographing device and an image processing method.
  • the photographing device 100 provided by the present application includes an infrared light emitter 101 , a lens 102 , a dimmer 103 and an imaging module 104 .
  • the infrared light emitter 101 can be used to emit infrared light, which will strike the subject 120 .
  • the lens 102 may be a convex lens, which is used to gather light incident on the lens 102 (for example, light reflected by the object 120 ) onto the dimmer sheet 103 .
  • the light collected on the dimmer sheet 103 can be a light beam or a light spot.
  • the dimmer 103 is located between the lens 102 and the imaging module 104 and can be used to select the light gathered by the lens 102 to adjust the light incident to the imaging module 104 .
  • the imaging module 104 may include a photoelectric conversion unit and an image processing unit.
  • the photoelectric conversion unit is used to receive the light passing through the dimmer 103 through the photosensitive surface, and convert the light image on the photosensitive surface into an electrical signal having a corresponding proportional relationship with the light image.
  • the image processing unit is used to process the converted electrical signal to obtain an image.
  • the photoelectric conversion unit may be a photosensitive element such as an image sensor, and the image processing unit may be an image signal processor (ISP) or the like.
  • the imaging module 104 may include one or more sensors. If one sensor is included, that sensor can both process infrared light to obtain an infrared image and process visible light to obtain a visible light image. If multiple sensors are included, one sensor can be used to process infrared light to obtain an infrared image, and another sensor can be used to process visible light to obtain a visible light image.
  • The shooting principle of the photographing device 100 is shown in FIG. 1: after the infrared light emitted by the infrared light emitter 101 hits the object 120 to be photographed, it is reflected (or scattered).
  • the reflected (or scattered) light is collected by the lens 102 of the photographing device 100 to the dimmer 103 , filtered by the dimmer 103 and then incident to the imaging module 104 .
  • the incident light is processed and imaged by the imaging module 104 .
  • In addition, visible light in the natural environment, after hitting the subject 120 and being reflected (or scattered), will also be gathered onto the dimmer 103.
  • That light (the light collected on the dimmer 103) is likewise filtered by the dimmer 103, and only the light allowed by the dimmer 103 can enter the imaging module 104.
  • the dimmer 103 may include multiple filters and at least one light-shielding sheet.
  • the multiple filters include at least one infrared bandpass filter and at least one infrared cut filter.
  • any infrared bandpass filter is used to allow infrared light to pass through and to filter out visible light; any infrared cut filter is used to allow visible light to pass through and to filter out infrared light; and any light-shielding sheet is used to prevent light from passing through, that is, it filters out both visible light and infrared light.
  • the above-mentioned photographing device 100 further includes a driving module, and the driving module is connected with the above-mentioned dimmer 103 .
  • the driving module can drive and control the movement of the dimmer 103, so that the light gathered on the dimmer 103 hits a certain filter or a certain light-shielding sheet during different time periods; in this way, the light collected on the dimmer 103 can be selected and controlled purposefully.
  • In a possible implementation, the structure of the dimmer 103 shown in FIG. 1 can be as shown in FIG. 2A: this dimmer 103 comprises two infrared bandpass filters 1031 (infrared bandpass filter 1031-1 and infrared bandpass filter 1031-2), one infrared cut filter 1032 and three light-shielding sheets 1033 (1033-1, 1033-2 and 1033-3).
  • a light-shielding sheet 1033 is arranged between any two adjacent ones of the above-mentioned three filters (the two infrared bandpass filters 1031 and the one infrared cut filter 1032).
  • the structure of the dimmer 103 shown in FIG. 1 may also be as shown in FIG. 2B .
  • the dimmer 103 includes an infrared band-pass filter 1031, an infrared cut-off filter 1032 and two shading films 1033 (respectively 1033-1 and 1033-2).
  • the structure of the dimmer 103 shown in FIG. 1 may be as shown in FIG. 2C .
  • the dimmer 103 includes an infrared band-pass filter 1031 , an infrared cut-off filter 1032 and a shading film 1033 .
  • The propagation path of the light incident on the lens 102, after being collected by the lens 102, remains unchanged; therefore, the light collected on the dimmer 103 usually hits a physically fixed position of the dimmer 103 (see FIG. 1). In this way, when the dimmer 103 is driven by the driving module to rotate around its center point O, the light gathered onto the dimmer 103 hits different fan-shaped areas of the dimmer 103 during different time periods.
  • In this way, the dimmer 103 can select and filter the light collected on it.
  • the aforementioned driving module may be, for example, a driving device such as a motor, which is not limited in this embodiment of the present application.
  • the light gathered onto the dimmer 103 hits the fan-shaped area corresponding to the infrared bandpass filter 1031-1 in the first period. Then, in the first period, after the light collected on the dimmer 103 passes through the infrared bandpass filter 1031-1, only infrared light can pass through and enter the imaging module 104.
  • the length of the first period can be understood as the exposure time of infrared light (the length of the exposure time period).
  • the dimmer 103 is rotated, so that the light gathered on the dimmer 103 hits the fan-shaped area corresponding to the light shield 1033-2 during the second period.
  • the light-shielding sheet 1033-2 prevents the light collected on the dimmer 103 from reaching the imaging module 104; that is to say, during the second period, the dimmer 103 prevents any light (including visible light and infrared light) from being incident on the imaging module 104.
  • In addition, during the second period, the imaging module 104 is also triggered to process the infrared light received in the first period (the infrared light that passed through the dimmer 103) and obtain an infrared image, for example, the first infrared image.
  • the end moment of the first period can be understood as the shooting moment or imaging moment of the first infrared image.
  • the length of the above-mentioned second period may be referred to as the light-shielding duration of the light-shielding sheet 1033-2.
  • the dimmer 103 is rotated, so that the light collected on the dimmer 103 hits the corresponding fan-shaped area of the infrared cut filter 1032 in the third period.
  • In the third period, after the light collected on the dimmer 103 passes through the infrared cut filter 1032, only visible light can pass through and enter the imaging module 104.
  • the length of the third period can be understood as the exposure time of visible light.
  • Similarly, the light gathered onto the dimmer 103 can hit the fan-shaped area corresponding to the light-shielding sheet 1033-3 in the fourth period.
  • In the fourth period, the imaging module 104 can be triggered to process the visible light received in the third period (the visible light that passed through the dimmer 103) to obtain a visible light image; the end moment of the third period can be understood as the shooting moment of the visible light image.
  • the above-mentioned length of the fourth period of time may be referred to as the light-shielding duration of the light-shielding sheet 1033-3.
  • Afterwards, the dimmer 103 can continue to rotate under the control of the driving module, so that the light gathered onto the dimmer 103 hits the fan-shaped areas corresponding to the infrared bandpass filter 1031-2 and the light-shielding sheet 1033-1 in the fifth period and the sixth period respectively, so that the photographing device can obtain the second infrared image.
  • the exposure time of the second infrared image is the length of the fifth time period
  • the shooting time of the second infrared image is the end time of the fifth time period, which will not be repeated here.
  • the dimmer sheet 103 can continue the above rotation under the control of the driving module, so that the photographing device can obtain an infrared image or a visible light image.
  • It should be understood that the first to sixth periods may be consecutive but do not overlap with each other. It should also be understood that the duration of each of the first to sixth periods may be the same or different.
  • In the example shown in FIG. 2A, two infrared images and one visible light image can be obtained by rotating the dimmer 103 once.
  • In other implementations, one infrared image and one visible light image can be obtained by one revolution of the dimmer 103.
  • Alternatively, two infrared images and one visible light image can also be obtained by driving the dimmer 103 to rotate one and a half turns, for example by first driving the dimmer 103 to rotate counterclockwise.
  • the driving module can drive the dimmer 103 to always move in one direction during the working process, for example, the driving module drives the dimmer 103 to always rotate clockwise, or Always rotate counterclockwise.
  • the driving module can flexibly control the direction of movement of the dimmer based on business requirements. For example, in the example shown in FIG. 2C , the driving module can control and drive the dimmer 103 to rotate clockwise sometimes and counterclockwise sometimes.
  • For example, the driving module can drive the dimmer 103 to rotate clockwise, so that the light gathered on the dimmer 103 hits the infrared bandpass filter 1031, the light-shielding sheet 1033 and the infrared cut filter 1032 in sequence; then, the driving module can drive the dimmer 103 to rotate counterclockwise, so that the light collected on the dimmer 103 hits the light-shielding sheet 1033 and the infrared bandpass filter 1031 again in sequence; then, the driving module can drive the dimmer 103 to rotate clockwise again.
  • the photographing device can sequentially obtain an infrared image, a visible light image, an infrared image, and a visible light image, etc., which will not be repeated here.
  • the size of the shading plate can be used to adjust the exposure time of the infrared bandpass filter or the infrared cutoff filter. The following description will be made in conjunction with the example shown in FIG. 2A .
  • In the example of FIG. 2A, the exposure duration corresponding to each filter can be adjusted by adjusting the central angle of the corresponding sector.
  • For example, adjusting the central angle 1 corresponding to the infrared bandpass filter 1031-1 adjusts the infrared exposure time of the infrared bandpass filter 1031-1; adjusting the central angle 2 adjusts the infrared exposure time of the infrared bandpass filter 1031-2; and adjusting the central angle 3 corresponding to the infrared cut filter 1032 adjusts the visible light exposure time of the infrared cut filter 1032.
  • Moreover, since the sectors together make up the full circle, if the central angle of a light-shielding sheet 1033 is enlarged, the central angle of an adjacent filter (the infrared bandpass filter 1031 and the infrared cut filter 1032 are collectively referred to as filters) correspondingly becomes smaller. Therefore, the exposure time of the corresponding filter can also be adjusted by adjusting the central angle of each light-shielding sheet.
  • For example, the central angle 1 of the infrared bandpass filter 1031-1 can be increased (or decreased) by reducing (or increasing) the central angle of the light-shielding sheet 1033-1, so as to increase (or decrease) the infrared exposure time corresponding to the infrared bandpass filter 1031-1.
  • Similarly, the central angle 2 of the infrared bandpass filter 1031-2 can be increased (or decreased) by reducing (or increasing) the central angle of the light-shielding sheet 1033-3, so as to increase (or decrease) the infrared exposure time corresponding to the infrared bandpass filter 1031-2.
  • The central angle 3 of the infrared cut filter 1032 can be increased (or decreased) by reducing (or increasing) the central angle of the light-shielding sheet 1033-2, so as to increase (or decrease) the visible light exposure time corresponding to the infrared cut filter 1032.
  • Likewise, the central angle 2 of the infrared bandpass filter 1031-2 can also be adjusted via the light-shielding sheet 1033-1, the central angle 3 of the infrared cut filter 1032 via the light-shielding sheet 1033-3, and the central angle 1 of the infrared bandpass filter 1031-1 via the light-shielding sheet 1033-2, in the same manner.
  • The longer the infrared exposure time, the more detail can be obtained in the dark parts of the image, that is, the clearer the image texture; the longer the visible light exposure time, the more color information can be obtained, that is, the more natural the image color.
  • In the foregoing, the shading duration of each light-shielding sheet and the exposure duration of each filter are controlled by adjusting the sizes of the light-shielding sheets. It should be noted that the exposure duration of each filter in the dimmer and the shading duration of each light-shielding sheet can also be adjusted by controlling the movement speed of the dimmer. For example, for the circular dimmers shown in FIG. 2A to FIG. 2C, assume the dimmer takes t seconds per revolution; since one revolution is 360°, rotating 1° takes t/360 seconds. Then, if the fan-shaped central angle of a filter is θ°, the exposure time corresponding to that filter is (θ·t)/360 seconds, as worked through below.
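  • A worked instance of this formula, with illustrative numbers:

```python
def sector_exposure_seconds(central_angle_deg, seconds_per_revolution):
    """Exposure time of a fan-shaped filter on a rotating dimmer:
    (theta * t) / 360, as derived above."""
    return central_angle_deg * seconds_per_revolution / 360.0

# Example: a 90-degree sector on a dimmer spinning at 20 revolutions per
# second (t = 0.05 s per revolution) gives an exposure of 12.5 ms.
print(sector_exposure_seconds(90, 1 / 20))  # 0.0125
```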
  • the moving speed of the dimmer can be controlled by the driving module.
  • the driving module can control the dimmer 103 to rotate at a certain speed at a constant speed. For example, 20 revolutions per second, 30 revolutions per second, or 50 revolutions per second, etc.
  • Alternatively, the driving module can, according to business requirements, control the dimmer 103 to rotate at a first speed during one period and at a second speed during another period.
  • For example, during the periods when the light collected on the dimmer hits a filter, it rotates at the first speed; during the periods when the light hits a light-shielding sheet, it rotates at the second speed; the first speed is different from the second speed.
  • the shooting device can conveniently control the shooting moment of the image.
  • The following description is based on the example given above when introducing the working principle of the dimmer shown in FIG. 2A, with reference to FIG. 2D.
  • The movement of the dimmer can be controlled by the driving module so that the end time of the above-mentioned first period is T0, the end time of the third period is T1, and the end time of the fifth period is T2. That is to say, the photographing device 100 can conveniently capture the above-mentioned first infrared image, visible light image and second infrared image at times T0, T1 and T2 respectively.
  • In summary, the dimmer 103 in the photographing device 100 can flexibly adjust its direction, speed and mode of motion (via the driving module) according to actual business needs, so that the light collected on the dimmer 103 hits the corresponding filter or light-shielding sheet in the corresponding time period; that is, different filters can be selected in different time periods as required.
  • As a result, the photographing device 100 can capture the corresponding infrared images or visible light images at the corresponding moments according to actual needs, making shooting very convenient and flexible.
  • In a possible implementation, the dimmer 103 shown in FIG. 1 can also be polygonal, for example triangular, quadrilateral or hexagonal; correspondingly, the above-mentioned infrared bandpass filter 1031, infrared cut filter 1032 and light-shielding sheet 1033 can be triangular, quadrilateral or hexagonal respectively. It should be understood that the specific dimming function and working principle of a polygonal dimmer are similar to those of the circular structure above, and will not be repeated here.
  • the rotation (for example, rotation direction, rotation speed, etc.) of the dimmer 103 can also be driven and controlled based on the driving module, so that the shooting device 100 can capture corresponding infrared images or images at corresponding times according to actual needs. Visible light images will not be repeated here.
  • In a possible implementation, the dimmer 103 shown in FIG. 1 can also be rectangular; correspondingly, the above-mentioned infrared bandpass filter 1031, infrared cut filter 1032 and light-shielding sheet 1033 can also be rectangular (see, for example, FIG. 3).
  • the driving module can drive the dimmer 103 to move during operation, so that the light collected on the dimmer 103 strikes a filter or a light shield at different time periods.
  • the shooting principle when the dimming sheet 103 is rectangular will be introduced below with reference to FIG. 3 .
  • During operation, the dimmer 103 shown in FIG. 3 can move in one direction (for example, a first direction), so that the light gathered onto the dimmer 103 hits the rectangular areas labeled 1, 2, 3, 4 and 5 in sequence during different time periods.
  • the driving module can drive the dimmer 103 to move in the direction opposite to the above-mentioned first direction, so that the light collected on the dimmer 103 hits the rectangular areas corresponding to the labels 4, 3, 2 and 1 in sequence.
  • Specifically, during the movement of the dimmer 103 in the first direction, the light gathered onto the dimmer 103 can hit the infrared bandpass filter 1031-1, the light-shielding sheet 1033-1, the infrared cut filter 1032, the light-shielding sheet 1033-2 and the infrared bandpass filter 1031-2 in the first, second, third, fourth and fifth periods respectively; after that, during the movement of the dimmer 103 in the direction opposite to the first direction, the light gathered onto the dimmer 103 can hit the light-shielding sheet 1033-2, the infrared cut filter 1032 and the light-shielding sheet 1033-1 in the sixth, seventh and eighth periods respectively.
  • In this way, the photographing device 100 can process and obtain the first infrared image at the end of the first period (or the beginning of the second period), obtain the first visible light image at the end of the third period (or the beginning of the fourth period), obtain the second infrared image at the end of the fifth period (or the beginning of the sixth period), and obtain the second visible light image at the end of the seventh period (or the beginning of the eighth period). Subsequent processes are similar and will not be repeated here.
  • the length of the period during which the light gathered to the dimmer 103 hits each filter can be used to indicate the exposure time of the corresponding infrared image or visible light image.
  • the length of the first time period may be used to indicate the exposure time of the first infrared image
  • the length of the third period may be used to indicate the exposure time of the first visible light image. It can thus be understood that the width of each filter affects the exposure time of the corresponding infrared image or visible light image; therefore, when the overall width of the dimmer 103 is fixed, the exposure duration of the corresponding filter can be adjusted by adjusting the width of each rectangular area.
  • For example, the width of the light-shielding sheet 1033-1 can be narrowed (or widened) to increase (or reduce) the width of the infrared bandpass filter 1031-1, thereby increasing (or reducing) the exposure time of the infrared light corresponding to the infrared bandpass filter 1031-1. The same applies to the other filters, which will not be repeated here.
  • the exposure duration of each filter and the shading duration of each light-shielding sheet can also be controlled by controlling the moving speed of the dimmer 103, which will not be repeated here.
  • the moving speed of the dimmer 103 can likewise be controlled by the driving module; this application does not limit the magnitude of the moving speed of the dimmer 103 or whether the moving speed changes. A worked example of the width/speed relationship is sketched below.
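  • For the rectangular dimmer, the analogous relationship (an assumption, by analogy with the rotary formula above) is exposure time = strip width / linear speed:

```python
def strip_exposure_seconds(filter_width_mm, speed_mm_per_s):
    """Time for the light spot to cross a filter strip of the given width
    when the rectangular dimmer moves at the given linear speed."""
    return filter_width_mm / speed_mm_per_s

# Example: a 6 mm wide filter strip moving at 300 mm/s gives a 20 ms exposure.
print(strip_exposure_seconds(6.0, 300.0))  # 0.02
```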
  • It can be seen that the dimmer 103 shown in FIG. 3 can also flexibly select and control the light gathered on it according to business requirements, and can flexibly control the exposure time of each kind of light, so that the photographing device 100 can capture the corresponding infrared images or visible light images at the corresponding moments according to actual needs, making shooting very convenient and flexible.
  • It should be noted that the above-mentioned dimmer 103 is not limited to the forms described above and can also take other forms; any dimmer including the above-mentioned filters and light-shielding sheets falls within the protection scope of the present application.
  • the light filter and light-shielding film in the dimmer 103 are not limited to the shapes described above, and may also be in other shapes, such as circular or polygonal. It should also be noted that if the dimmer 103 is polygonal or rectangular, then the circular dimmer 103 shown in FIG. 1 should be replaced with a polygonal or rectangular dimmer 103 , which will not be repeated here.
  • the photographing device 100 provided in the present application may also include other components other than those shown in FIG. 1 , which is not limited in the present application.
  • In a possible implementation, the photographing device 100 includes a processor.
  • the processor may be used to execute the image processing method provided in the embodiment of the present application (for an example, refer to the description in FIG. 4 below, which will not be repeated here).
  • the photographing device can shoot the same scene based on the above-mentioned dimmer, driving module, imaging module and other modules, obtaining the first infrared image, the visible light image and the second infrared image at times T0, T1 and T2 respectively, and then obtain the final fused image by executing, on the processor, the image processing method provided by the embodiment of the present application (see the description of FIG. 4 below for an example, which will not be repeated here).
  • the photographing device 100 may also include a casing (not shown in FIG. 1), and the above-mentioned infrared light emitter 101, lens 102, dimmer 103 and imaging module 104 may, for example, be fixed in the casing according to the relative positions shown in FIG. 1.
  • the photographing device 100 may also include a flashlight, a memory card, etc., which will not be repeated here.
  • the above-mentioned photographing device 100 can be any device with the above-described shooting structure and functions; for example, it can be a camera, various cameras (such as monitoring or security cameras), or user equipment (User Equipment, UE) in forms such as a smartphone, a tablet computer, a handheld computer, or a smart wearable device (including smart bracelets, smart watches, smart glasses, etc.), a mobile station (Mobile Station, MS), terminal equipment (Terminal Equipment), and so on.
  • the subject of execution of the method may be an image processing device.
  • the image processing device may be a photographing device that captures images, or may be other devices with computing capabilities (such as a server, a processing chip, etc.) that are different from the photographing device.
  • the imaging module in the photographing device may send the captured image to a processor of the photographing device for processing after acquiring an image. If the execution subject is another device with computing capability, then, after the photographing device captures an image through the imaging module, it sends the captured image to the device with computing capability for processing.
  • Figure 4 is a schematic flow chart of the image processing method provided by the present application, which includes but is not limited to the following steps:
  • the above-mentioned "scene” refers to the shooting scene of the shooting device or the viewing space of shooting.
  • the viewing space (or shooting scene) includes one or more target objects to be photographed, and the target objects may be any objects in the viewing space, for example, animals, plants, vehicles, people and so on.
  • In order to facilitate understanding of the viewing space, reference may be made to FIG. 5 by way of example.
  • the view space shown in FIG. 5 is similar to a polyhedron; that is, the polyhedral space formed by face ABCD, sides AE, BH, CG, DF and face EFGH is the viewing space of the camera. Exemplarily, the faces ABCD and EFGH can move relative to each other.
  • FIG. 5 is only an example, and does not constitute a limitation to the embodiment of the present application.
  • the photographing device may be fixed, and the target object moves within the range of the field of view of the photographing device.
  • the photographing device may be fixed on a light pole, and the target object in the visual field captured by the photographing device may be a moving vehicle on the road.
  • the shooting device can keep its field of view range covering all or part of the field of view space (or shooting scene) during the moving process.
  • the shooting device may rotate around the stage to shoot, but the shooting angle of the shooting device is always facing the center of the stage, so that its field of view covers all or part of the shooting scene.
  • both the object of interest and the camera device in the field of view being photographed may be moving.
  • the image captured by the photographing device may include a complete target object in the captured field of view, or may only include a part of a certain target object.
  • for example, if the captured view space contains a train, only a portion of the train may be included in the captured image. This can be adjusted according to actual needs, which is not limited in this application.
  • the shooting device shooting the same scene at time T0, time T1 and time T2 may mean that the field of view spaces captured by the shooting device at the three moments are completely the same, or that the field of view spaces photographed by the shooting device at the three moments are partly the same.
  • if the photographing device is stationary while photographing the three images, the field of view spaces photographed at the three moments may be completely the same; if the photographing device moves while photographing the three images, the view spaces photographed at the three moments may be only partly the same.
  • the fusion of the infrared images and the visible light image needs to be realized based on the images captured at the three moments, so, among the three captured images, at least one of the infrared images must share part of its content with the visible light image (otherwise fusion cannot be performed). Therefore, when the above-mentioned photographing device photographs the same scene at three moments, this may mean that the field of view spaces photographed by the photographing device at the three moments are partly the same.
  • the photographing device may be, for example, the photographing device 100 shown in FIG. 1 above.
  • the photographing device may also be a device comprising two photographing modules (for example, two sets of lenses and sensors provided respectively), wherein one photographing module has the function of capturing infrared images and the other photographing module has the function of capturing visible light images.
  • the infrared image capturing module can capture the first infrared image at T0 and the second infrared image at T2 through the preset configuration
  • the visible light capturing module can capture the visible light image at T1 .
  • the present application does not limit the specific form and structure of the photographing device.
  • the first infrared image, the visible light image and the second infrared image can thus be captured by the photographing device.
  • if the execution subject is the above-mentioned other device with computing capability, then, after the photographing device obtains the first infrared image, the visible light image and the second infrared image, it can send the obtained images to the other device having computing capability.
  • the target infrared image includes a third infrared image and/or a fourth infrared image; the third infrared image is an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image is an image obtained by converting the second infrared image from time T2 to time T1.
  • the image processing device obtains the above-mentioned first infrared image, visible light image and second infrared image, and obtains the shooting time T0 of the first infrared image, the shooting time T1 of the visible light image, and the shooting time T2 of the second infrared image.
  • the image processing device may record the shooting moments of the above-mentioned first infrared image, visible light image and second infrared image, so as to obtain the T0, T1 and T2.
  • the shooting device may send the recorded T0, T1 and T2 to the image processing device.
  • taking the target infrared image including the third infrared image and the fourth infrared image as an example, the specific process of calculating the target infrared image based on the first infrared image and the second infrared image is introduced below.
  • the image processing device calculates the optical flow F1 from the first infrared image to the second infrared image, and calculates the optical flow F2 from the second infrared image to the first infrared image.
  • an optical flow extraction neural network can be used to calculate the optical flow F1 and the optical flow F2; the input of the optical flow extraction neural network is the first infrared image and the second infrared image, and the output is the optical flow F1 and the optical flow F2.
  • the optical flow extraction neural network may be a Unet architecture neural network.
  • the optical flow extraction neural network can be obtained by training on the publicly available Scene Flow synthetic (virtual) dataset.
  • an existing optical flow calculation method may be used to calculate the above optical flow F1 and optical flow F2, and the present application does not limit the specific optical flow calculation method.
  • Existing optical flow calculation methods may be, for example, gradient-based methods, matching-based methods, frequency-domain (energy)-based methods, phase-based methods, the Lucas-Kanade algorithm, and the like.
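By way of illustration only, the following sketch computes the bidirectional optical flows F1 and F2 using OpenCV's Farneback dense optical flow as a stand-in for the optical flow extraction neural network described above; the function name and the parameter values are illustrative assumptions, not part of the claimed method.

```python
import cv2
import numpy as np

def bidirectional_flow(ir0: np.ndarray, ir2: np.ndarray):
    """Compute dense optical flow F1 (ir0 -> ir2) and F2 (ir2 -> ir0).

    ir0, ir2: single-channel (grayscale) infrared images of equal size.
    Returns two (H, W, 2) arrays of per-pixel (dx, dy) displacements.
    """
    params = dict(pyr_scale=0.5, levels=4, winsize=21,
                  iterations=3, poly_n=5, poly_sigma=1.1, flags=0)
    f1 = cv2.calcOpticalFlowFarneback(ir0, ir2, None, **params)  # optical flow F1
    f2 = cv2.calcOpticalFlowFarneback(ir2, ir0, None, **params)  # optical flow F2
    return f1, f2
```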
  • then, based on the relationship among T0, T1 and T2 and the optical flows F1 and F2, the optical flow F3 from the third infrared image to the first infrared image is calculated, as is the optical flow F4 from the corresponding infrared image, that is, the fourth infrared image, to the second infrared image.
  • an optical flow reverse mapping method may be used to reversely map the first infrared image based on the optical flow F3 to obtain the third infrared image.
  • the above-mentioned fourth infrared image is obtained by performing reverse mapping on the above-mentioned second infrared image based on the optical flow F4.
  • the image processing device can obtain the above-mentioned target infrared image by means of optical flow reverse mapping, based on the above-mentioned relationship among T0, T1 and T2 and the optical flow between the first infrared image and the second infrared image. A minimal sketch of this conversion is given below.
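The sketch assumes linear motion between T0 and T2 (the exact formula for F3 is not fixed by the text; the scaling below follows a common frame-interpolation approximation) and uses `cv2.remap` for the optical flow reverse mapping:

```python
import cv2
import numpy as np

def convert_ir0_to_t1(ir0, f1, f2, t0, t1, t2):
    """Convert the first infrared image (taken at T0) to time T1.

    f1: optical flow from the first to the second infrared image (F1).
    f2: optical flow from the second to the first infrared image (F2).
    Returns the third infrared image, i.e. ir0 warped to time T1.
    """
    t = (t1 - t0) / (t2 - t0)                 # relative position of T1 in [T0, T2]
    f3 = -(1.0 - t) * t * f1 + t * t * f2     # assumed approximation of flow F3

    h, w = ir0.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Reverse mapping: each output pixel p samples ir0 at p + F3(p).
    map_x = (grid_x + f3[..., 0]).astype(np.float32)
    map_y = (grid_y + f3[..., 1]).astype(np.float32)
    return cv2.remap(ir0, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```

The fourth infrared image can be obtained symmetrically by warping the second infrared image with an analogously scaled flow F4.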
  • the calculation of the target infrared image based on the first infrared image and the second infrared image is not limited to the above-described method using optical flow reverse mapping.
  • the optical flow method in other implementation manners may also be used for calculation, which is not limited in the present application.
  • the converted third infrared image is the image corresponding to the first infrared image at time T1; that is, the third infrared image can be understood as the infrared image corresponding to the shooting scene at time T1. Since the above-mentioned visible light image is also an image of the shooting scene corresponding to time T1, converting the first infrared image from time T0 to time T1 is equivalent to realizing the registration of the first infrared image and the visible light image.
  • in related technologies, image registration is generally performed directly on infrared images and visible light images based on feature matching. Since infrared images and visible light images are images of different modalities, such processing based on two images of different modalities usually has poor registration accuracy.
  • the present application adopts conversion between infrared images of the same modality (for example, converting the first infrared image from time T0 to time T1 based on the above-mentioned relationship among T0, T1 and T2 and the optical flow between the first infrared image and the second infrared image) to realize the registration of the infrared image and the visible light image, which can improve the registration accuracy of the infrared image and the visible light image, thereby improving the quality of the final fused image.
  • the fusion of the visible light image and the infrared image of the target may be achieved through an image fusion neural network.
  • the image fusion neural network may include a color extraction neural network and a texture extraction neural network.
  • the color extraction neural network is used to extract the color features of the visible light image.
  • the texture extraction neural network is used to extract the texture features of the target infrared image. Then, image fusion is performed based on the extracted color features and texture features to obtain a fusion image.
  • taking the target infrared image including the third infrared image as an example, the schematic flow of the image fusion neural network is described below.
  • the above visible light image is input to the color extraction neural network.
  • the color feature of the visible light image is extracted through the color extraction neural network, and the grayscale image of the visible light image is also extracted.
  • a color index table is constructed based on the extracted color features and the grayscale image.
  • the color index table may include the values of several colors contained in the visible light image and the index of each color in the several colors, so that the corresponding color value can be found in the color index table based on the index of the color.
  • the color value may be the values of the three primary colors RGB.
  • the color value may be RGB plus a constant value and so on.
  • the present application does not limit the specific expression of the color.
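As a rough illustration of what such a color index table could look like, the sketch below quantizes a (possibly down-sampled) visible light image into a fixed palette with k-means; the use of k-means and the table size of 256 are assumptions standing in for the color features produced by the color extraction neural network.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_color_index_table(visible_rgb: np.ndarray, n_colors: int = 256):
    """Build a toy color index table: row i holds the RGB value of color index i."""
    pixels = visible_rgb.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=n_colors, n_init=4, random_state=0).fit(pixels)
    return km.cluster_centers_.astype(np.uint8)   # (n_colors, 3): index -> RGB
```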
  • the texture feature of the third infrared image is extracted through the texture extraction neural network, and a texture guide map is generated based on the texture feature.
  • the texture guide map includes a color index for each pixel in the final output fused image.
  • after obtaining the above color index table and texture guide map, the image fusion neural network generates a fusion image based on them. Specifically, based on the color index of each pixel in the texture guide map, the color value of the pixel is looked up in the color index table and filled into the corresponding position of the pixel, generating the fused image as sketched below.
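The final lookup-and-fill step reduces to array indexing, as the short sketch shows (the texture guide map is assumed here to already hold one integer color index per output pixel):

```python
import numpy as np

def generate_fused_image(texture_guide: np.ndarray,
                         color_table: np.ndarray) -> np.ndarray:
    """texture_guide: (H, W) integer color indices; color_table: (n, 3) RGB rows.

    For each pixel, look up its color index in the table and fill the found
    RGB value into the pixel's position, yielding an (H, W, 3) fused image.
    """
    return color_table[texture_guide]
```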
  • the above-mentioned color extraction neural network may be a small-scale deep neural network with a lower resolution.
  • the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold.
  • the aforementioned preset first resolution threshold may be, for example, a video graphics array (video graphics array, VGA) resolution or the like.
  • the low resolution may be, for example, a quarter video graphics array (quarter video graphics array, QVGA) resolution.
  • the low resolution may also be VGA resolution.
  • Using a low-resolution color extraction neural network can enhance the immunity to noise. However, if the accuracy of the extracted color boundary needs to be improved, the resolution of the color extraction neural network can be adaptively improved.
  • the aforementioned preset first network depth threshold may be 20, thus, the number of layers of the color extraction neural network may be 20-30 layers.
  • the above-mentioned texture extraction neural network is usually a relatively high-resolution shallow neural network.
  • the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
  • the preset second resolution threshold may be VGA resolution.
  • the preset second network depth threshold may be any integer between 5 and 10.
  • the resolution of the texture extraction neural network may be the resolution of the original image of the captured infrared image, and the number of layers of the texture extraction neural network may be 3 to 5 layers and so on.
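A hedged PyTorch sketch of the two branches with these proportions follows; the layer counts, channel widths and the plain convolutional structure are illustrative assumptions, since the text only constrains resolution and depth:

```python
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.ReLU(inplace=True))

class ColorExtractionNet(nn.Sequential):
    """Deep branch run at low resolution (e.g. a 256x256 down-sampled input)."""
    def __init__(self, depth: int = 20, width: int = 32):
        layers = [conv_block(3, width)]                       # RGB input
        layers += [conv_block(width, width) for _ in range(depth - 2)]
        layers += [nn.Conv2d(width, 3, 3, padding=1)]         # color features
        super().__init__(*layers)

class TextureExtractionNet(nn.Sequential):
    """Shallow branch run at full (original infrared) resolution."""
    def __init__(self, depth: int = 4, width: int = 16):
        layers = [conv_block(1, width)]                       # grayscale IR input
        layers += [conv_block(width, width) for _ in range(depth - 2)]
        layers += [nn.Conv2d(width, 1, 3, padding=1)]         # texture features
        super().__init__(*layers)
```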
  • taking the target infrared image including the third infrared image and the fourth infrared image as an example, the schematic flow of the image fusion neural network is described below.
  • the image fusion neural network may include a color extraction neural network and a texture extraction neural network.
  • the color extraction neural network is used to extract the color features of the visible light image.
  • the texture extraction neural network is used to extract texture features of the third infrared image and extract texture features of the fourth infrared image. Then, image fusion is performed based on the extracted color features and the texture features of the above two infrared images to obtain a fusion image.
  • the color index table is obtained based on the visible light image through the color extraction neural network.
  • the process of generating the texture guide map of the fused image by the texture extraction neural network is slightly different: in this process, the texture extraction neural network performs texture fusion on the texture features of the third infrared image and the texture features of the fourth infrared image to obtain the texture guide map of the fused image. Then, the fused image is obtained based on the obtained color index table and the texture guide map of the fused image; this process will not be repeated.
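One possible fusion rule, purely an assumption since the text leaves the rule to the learned network, is to average the two warped texture maps where both are valid and fall back to whichever one survived the warp elsewhere:

```python
import numpy as np

def fuse_textures(tex3, tex4, valid3, valid4):
    """Fuse texture maps of the third and fourth infrared images at time T1.

    valid3/valid4 are boolean masks marking pixels that survived the warp
    (holes caused by motion, e.g. a part that left the field of view,
    are invalid).
    """
    w3 = valid3.astype(np.float32)
    w4 = valid4.astype(np.float32)
    return (tex3 * w3 + tex4 * w4) / np.maximum(w3 + w4, 1e-6)
```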
  • the image fusion neural network provided by this application has better hardware adaptability than existing neural networks (eg, Unet neural network).
  • the general-purpose Unet network is usually used to process pixel-level tasks, and its computational complexity is on the order of 100K operations per pixel. Taking an 8-megapixel (8M) image processed at 30 FPS (FPS means frames per second, the abbreviation of Frames Per Second) as an example, the required computing power is 8M*100K*30 = 24 TOPS (TOPS means trillion operations per second, the abbreviation of Tera Operations Per Second).
  • in the image fusion neural network provided by this application, the resolution of the color extraction neural network can be 256*256, with a computing power requirement of 0.2 TOPS; the resolution of the texture extraction neural network is 8M, with a computing power requirement of 1.2 TOPS. It can be seen that the total computing power requirement of the image fusion neural network provided by this application is less than 2 TOPS, far smaller than that of the Unet network. That is, the image fusion neural network provided by this application requires far less hardware computing power than the Unet network, and therefore has better hardware adaptability. The arithmetic is reproduced in the short sketch below.
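The comparison can be reproduced with a few lines of arithmetic (the per-pixel operation count for the generic Unet is the figure assumed above):

```python
TERA = 1e12
pixels, fps = 8e6, 30                      # 8M-pixel frames at 30 FPS

unet_tops = pixels * 1e5 * fps / TERA      # ~100K operations per pixel
model_tops = 0.2 + 1.2                     # color branch + texture branch

print(unet_tops)   # 24.0 -> 24 TOPS for the generic Unet
print(model_tops)  # 1.4  -> under 2 TOPS for the fusion network here
```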
  • the above-mentioned fusion of the target infrared image and the visible light image may also be achieved by using an existing image fusion technology, and this application does not limit the specific image fusion method.
  • the present application provides an image processing model to implement the above image processing method.
  • FIG. 7 shows a schematic structural diagram of the image processing model.
  • the image processing model 700 (it should be understood that the "image processing model" herein may also be referred to as an "image processing algorithm" or "image processing module") includes an image acquisition module 710, an optical flow extraction neural network 720, an optical flow reverse mapping processing module 730 and an image fusion neural network 740. Specifically:
  • the image acquisition module 710 is configured to acquire the above-mentioned first infrared image, visible light image and second infrared image. For the specific acquisition process, reference may be made to the corresponding description in step S401 above, which will not be repeated here.
  • the optical flow extraction neural network 720 is used to extract the optical flow of the first infrared image and the second infrared image. For specific implementation, reference may be made to the description in the above step S402, which will not be repeated here.
  • the optical flow reverse mapping processing module 730 is configured to perform image reverse mapping based on the extracted optical flow of the first infrared image and the second infrared image, so as to obtain the aforementioned target infrared image.
  • the image fusion neural network 740 is used to fuse the above-mentioned visible light image and target infrared image to obtain a fusion image.
  • the entire image processing model can be trained end-to-end. That is, during the entire training process, the input is three images (for example, the above-mentioned first infrared image, visible light image and second infrared image, the three images being obtained by the shooting device shooting the same scene at times T0, T1 and T2 respectively), the output is the fused image, and the output fused image is fed back to each neural network for gradual correction of parameters.
  • This kind of end-to-end training can make the image fusion neural network tolerant of calculation errors of the optical flow in the optical flow extraction neural network, and finally makes the trained image processing model more robust.
  • the training images of the above-mentioned image processing model may be infrared images and visible light images collected in an environment where the illuminance is lower than a preset illuminance threshold.
  • the environment whose illuminance is lower than the preset illuminance threshold may be a low-illuminance environment.
  • the low-illuminance environment may be an environment with an illuminance below 10 lux (Lux). That is, the preset illuminance threshold may be, for example, 10 lux.
  • the low-illuminance environment may be specifically determined according to an illuminance standard, and the preset illuminance threshold may be determined according to a low-illuminance value in the illuminance standard, which is not limited in the present application. Since the training images of the image processing model are taken in a low-light environment, the image processing model trained in this way can learn the dark details, colors and textures of images, so that a clear color image can be obtained by fusion in a low-light environment. A minimal training sketch is given below.
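The sketch assumes a dataset of low-light (infrared, visible, infrared) triplets with a reference fused image; the tiny stub networks, the L1 loss and the differentiable warp are all illustrative assumptions rather than the actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowNet(nn.Module):
    """Stub for the optical flow extraction neural network (720)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 4, 3, padding=1)    # two IR frames in, F1+F2 out
    def forward(self, ir0, ir2):
        flows = self.conv(torch.cat([ir0, ir2], dim=1))
        return flows[:, :2], flows[:, 2:]            # F1, F2

class FusionNet(nn.Module):
    """Stub for the image fusion neural network (740)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 3, 3, padding=1)    # RGB + warped IR -> RGB
    def forward(self, vis, ir_t1):
        return self.conv(torch.cat([vis, ir_t1], dim=1))

def backward_warp(img, flow):
    """Differentiable optical flow reverse mapping (module 730)."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    gx = (xs + flow[:, 0]) / (w - 1) * 2 - 1         # normalize to [-1, 1]
    gy = (ys + flow[:, 1]) / (h - 1) * 2 - 1
    return F.grid_sample(img, torch.stack([gx, gy], dim=-1), align_corners=True)

flow_net, fusion_net = FlowNet(), FusionNet()
opt = torch.optim.Adam(list(flow_net.parameters()) +
                       list(fusion_net.parameters()), lr=1e-4)

def train_step(ir0, vis1, ir2, target, t=0.5):
    f1, f2 = flow_net(ir0, ir2)
    f3 = -(1 - t) * t * f1 + t * t * f2              # assumed intermediate flow
    fused = fusion_net(vis1, backward_warp(ir0, f3))
    loss = F.l1_loss(fused, target)                  # supervise the output only
    opt.zero_grad(); loss.backward(); opt.step()     # gradients reach both nets
    return loss.item()
```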
  • this application first acquires the first infrared image, the visible light image and the second infrared image (the three images are obtained by the shooting device shooting the same scene at time T0, time T1 and time T2 respectively), and then, based on the two infrared images and the shooting time information of the above three images, uses the optical flow method to convert the captured infrared images to the infrared images corresponding to the moment when the visible light image was captured, so as to realize the registration of the captured infrared images and the visible light image.
  • This application realizes the registration of infrared images and visible light images through the processing of infrared images of the same modality (based on the two captured infrared images), which improves the registration accuracy of infrared images and visible light images, thus making the final fused image clearer.
  • in the shooting scene of a fast-moving object, the scheme in which the target infrared image includes the third infrared image and the fourth infrared image (that is, the scheme based on fusing three images), compared with the scheme in which the target infrared image includes only the third infrared image or only the fourth infrared image (that is, the scheme based on fusing two images), can avoid incomplete clear areas in the fused image, thereby obtaining a fused image with richer colors and details and a more natural appearance. This situation will be described in detail below with reference to FIG. 8 .
  • the position of the shooting device is fixed, and there is a target object moving along one direction in the shooting scene.
  • the target object includes six parts labeled 1, 2, 3, 4, 5 and 6. Since the photographing device has a limited field of view, it is assumed that the photographing device can only capture two of the above six parts of the target object in each shot. Then, as the target object keeps moving, the parts labeled 1 and 2 of the target object are captured at time T0, the parts labeled 2 and 3 are captured at time T1, and the parts labeled 3 and 4 are captured at time T2.
  • the image obtained by shooting can be referred to FIG. 8 .
  • the first infrared image captured at time T0 is converted to time T1 to obtain a third infrared image corresponding to the first infrared image at time T1.
  • the second infrared image captured at the time T2 is converted to the time T1 to obtain a fourth infrared image corresponding to the second infrared image at the time T1.
  • the third infrared image retains the part labeled 2 of the photographed object, but lacks the part labeled 3 of the photographed object, which was not captured in the first infrared image.
  • since the third infrared image is an infrared image (mainly contributing texture features), if the target infrared image only includes the third infrared image, that is, only the third infrared image and the visible light image are fused, the fused image will lack the clear texture features of the part labeled 3 of the photographed object, resulting in insufficient clarity in some areas of the fused image.
  • the size of the "partial area" may be related to the movement speed of the target object and the length of the shooting time intervals of the above three images. If the target object moves slowly, or the shooting intervals of the above three images are short, the above unsharp areas will be smaller and will not affect the overall effect of the image.
  • the fourth infrared image includes the part of the photographed object labeled 3; therefore, fusing the third infrared image, the visible light image captured at time T1 and the fourth infrared image can give the final fused image a complete and clear texture, so that all areas of the obtained image are clear; thus a clearer and more natural color image can be obtained compared with the two-frame fusion scheme described above.
  • the image processing device includes hardware structures and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software drives hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application can divide the device into functional modules according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in this embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 9 shows a specific logical structural diagram of a device, which may be the image processing device in the above method embodiment.
  • the image processing device 900 includes:
  • An acquisition unit 901 configured to acquire a first infrared image, a visible light image, and a second infrared image; wherein the aforementioned first infrared image, the aforementioned visible light image, and the aforementioned second infrared image are obtained by the shooting device shooting the same scene at time T0, T1, and T2, respectively, T0<T1<T2.
  • the calculation unit 902 is configured to calculate a target infrared image based on the aforementioned first infrared image and the aforementioned second infrared image, where the aforementioned target infrared image includes a third infrared image and/or a fourth infrared image; the aforementioned third infrared image is an image obtained by converting the aforementioned first infrared image from the aforementioned time T0 to the aforementioned time T1, and the aforementioned fourth infrared image is an image obtained by converting the aforementioned second infrared image from the aforementioned time T2 to the aforementioned time T1;
  • the fusion unit 903 is configured to fuse the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image.
  • the foregoing computing unit 902 is specifically configured to:
  • calculate the aforementioned target infrared image using the optical flow method, according to the relationship among the aforementioned T0, the aforementioned T1 and the aforementioned T2, as well as the aforementioned first infrared image and the aforementioned second infrared image.
  • the aforementioned target infrared image includes the aforementioned third infrared image; the aforementioned computing unit 902 is specifically configured to:
  • calculate the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image; calculate the optical flow F3 based on the relationship among the aforementioned T0, T1 and T2, the aforementioned optical flow F1 and the aforementioned optical flow F2, the aforementioned optical flow F3 being the optical flow from the aforementioned third infrared image to the aforementioned first infrared image; and obtain the aforementioned third infrared image by performing optical flow reverse mapping based on the aforementioned optical flow F3 and the aforementioned first infrared image.
  • the foregoing fusion unit 903 is specifically used for:
  • extract the color features of the aforementioned visible light image; extract the texture features of the aforementioned target infrared image; and obtain the aforementioned fused image based on the aforementioned color features and the aforementioned texture features.
  • the foregoing fusion unit 903 is specifically used for:
  • the aforementioned target infrared image and the aforementioned visible light image are fused through the first neural network to obtain the aforementioned fused image;
  • the aforementioned first neural network includes a color extraction neural network and a texture extraction neural network, and the aforementioned color extraction neural network is used to extract the color features of the aforementioned visible light image;
  • the aforementioned texture extraction neural network is used to extract the texture features of the aforementioned target infrared image.
  • the resolution of the color extraction neural network is lower than a preset first resolution threshold and the number of layers of the color extraction neural network is higher than a preset first network depth threshold.
  • the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the layer number of the texture extraction neural network is lower than a preset second network depth threshold.
  • the operations performed by the aforementioned device are realized by an image processing model, and the aforementioned image processing model includes the aforementioned first neural network and the second neural network;
  • the aforementioned second neural network is used to obtain the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image; the aforementioned optical flow F1 and the aforementioned optical flow F2 is used to calculate and obtain the infrared image of the aforementioned target;
  • the aforementioned first neural network and the aforementioned second neural network included in the aforementioned image processing model are obtained through end-to-end training.
  • the training images of the aforementioned image processing model are collected in an environment where the illuminance is lower than a preset illuminance threshold.
  • FIG. 10 is a schematic diagram of a specific hardware structure of the device provided by the present application, and the device may be the image processing device described in the above-mentioned embodiments.
  • the image processing device 1000 includes: a processor 1001 , a memory 1002 and a communication interface 1003 .
  • the processor 1001 , the communication interface 1003 and the memory 1002 may be connected to each other or through a bus 1004 .
  • the memory 1002 is used to store computer programs and data of the image processing apparatus 1000; the memory 1002 may include but is not limited to random access memory (random access memory, RAM), read-only memory (read-only memory, ROM), erasable programmable read-only memory (erasable programmable read-only memory, EPROM), portable read-only memory (compact disc read-only memory, CD-ROM), etc.
  • the communication interface 1003 includes a sending interface and a receiving interface, and there may be multiple communication interfaces 1003, which are used to support the image processing apparatus 1000 to communicate, for example, to receive or send data or messages.
  • the processor 1001 may be a central processing unit, a general processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component or any combination thereof.
  • the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the processor 1001 can be used to read the program stored in the above-mentioned memory 1002, so that the image processing apparatus 1000 executes the image processing method described in the above-mentioned FIG. 4 and its specific embodiments.
  • the processor 1001 may be configured to read the program stored in the above-mentioned memory 1002, and perform the following operations: acquire a first infrared image, a visible light image, and a second infrared image, wherein the first infrared image, the visible light image and the second infrared image are obtained by the shooting device shooting the same scene at times T0, T1 and T2 respectively, T0<T1<T2; calculate a target infrared image based on the first infrared image and the second infrared image, the target infrared image including a third infrared image and/or a fourth infrared image, the third infrared image being an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image being an image obtained by converting the second infrared image from time T2 to time T1; and fuse the target infrared image and the visible light image to obtain a fused image.
  • the embodiment of the present application also provides a computer-readable storage medium; the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method described in any one of the above embodiments of FIG. 4 and its specific method embodiments.
  • An embodiment of the present application further provides a computer program product.
  • when the computer program product is read and executed by a computer, the method described in any one of the above-mentioned FIG. 4 and its specific method embodiments is executed.

Abstract

Provided are an image processing method and a related apparatus. The method comprises: obtaining a first infrared image, a visible light image, and a second infrared image, wherein the first infrared image, the visible light image, and the second infrared image are obtained by photographing the same scene at the T0 moment, the T1 moment, and the T2 moment, respectively, and T0 < T1 < T2; calculating a target infrared image on the basis of the first infrared image and the second infrared image, the target infrared image comprising a third infrared image and/or a fourth infrared image, wherein the third infrared image is an image obtained by converting the first infrared image from the T0 moment to the T1 moment, and the fourth infrared image is an image obtained by converting the second infrared image from the T2 moment to the T1 moment; and fusing the target infrared image and the visible light image to obtain a fused image. In the present application, the registration precision of the infrared image and the visible light image is improved, thereby improving the quality of the fused image.

Description

Image processing method and related device
This application claims priority to the Chinese patent application No. 202111650872.9, entitled "A Device for Three-Frame Fusion of Infrared and Visible Light Based on Infrared Optical Flow", filed with the State Intellectual Property Office of China on December 30, 2021, and priority to the Chinese patent application No. 202210207925.8, entitled "Image Processing Method and Related Devices", filed with the State Intellectual Property Office of China on March 3, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of image processing, and in particular to an image processing method and related devices.
Background
In a low-illuminance environment, the image collected by a color camera based on visible light imaging is usually blurry, while the image collected by an infrared camera based on infrared imaging is clear but not in color (the image captured by an infrared camera is usually a grayscale image). In order to obtain a clearer color image in a low-illuminance environment, the industry usually adopts technical solutions that combine visible light and infrared light imaging.
A technical solution combining visible light and infrared light imaging is usually as follows: a visible light image and an infrared image are obtained by visible light imaging and infrared imaging respectively, and then the visible light image and the infrared image are fused to obtain the final output fused image. In related technologies, when multiple images are fused, image registration based on feature matching usually needs to be performed on the multiple images first, but this kind of image registration scheme usually has poor accuracy when registering visible light images and infrared images, resulting in poor clarity of the image obtained by fusing the visible light image and the infrared image.
Summary of the invention
The present application discloses an image processing method and a related device, which can obtain a clear color image by fusing an obtained infrared image and visible light image.
In a first aspect, the present application provides an image processing method, the method comprising:
acquiring a first infrared image, a visible light image and a second infrared image; wherein the aforementioned first infrared image, the aforementioned visible light image and the aforementioned second infrared image are obtained by a shooting device shooting the same scene at time T0, time T1 and time T2 respectively, T0<T1<T2;
calculating a target infrared image based on the aforementioned first infrared image and the aforementioned second infrared image, the aforementioned target infrared image including a third infrared image and/or a fourth infrared image; the aforementioned third infrared image being an image obtained by converting the aforementioned first infrared image from the aforementioned time T0 to the aforementioned time T1, and the aforementioned fourth infrared image being an image obtained by converting the aforementioned second infrared image from the aforementioned time T2 to the aforementioned time T1;
fusing the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image.
This application first acquires the first infrared image, the visible light image and the second infrared image (the three images are obtained by the shooting device shooting the same scene at time T0, time T1 and time T2 respectively), and then, based on the two infrared images, converts the captured infrared images to the infrared images corresponding to the moment when the visible light image was captured, so as to realize the registration of the captured infrared images and the visible light image. Compared with the existing registration schemes for infrared and visible light images based on processing images of different modalities (extracting the feature points of the infrared image and the visible light image respectively and realizing registration based on the feature points of the two), the present application realizes the registration of the infrared image and the visible light image through conversion processing of infrared images of the same modality (based on the two captured infrared images), which improves the registration accuracy of the infrared image and the visible light image, thereby making the final fused image clearer. Especially in a low-illuminance environment, a clear and natural color image can be obtained.
In a possible implementation, the aforementioned calculating of the target infrared image based on the aforementioned first infrared image and the aforementioned second infrared image includes:
using the optical flow method, calculating the aforementioned target infrared image according to the relationship among the aforementioned T0, the aforementioned T1 and the aforementioned T2, as well as the aforementioned first infrared image and the aforementioned second infrared image.
Optionally, the aforementioned target infrared image includes the aforementioned third infrared image; using the optical flow method, calculating the aforementioned target infrared image according to the relationship among the aforementioned T0, the aforementioned T1 and the aforementioned T2, as well as the aforementioned first infrared image and the aforementioned second infrared image, includes:
calculating the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image;
calculating the optical flow F3 based on the relationship among the aforementioned T0, the aforementioned T1 and the aforementioned T2, the aforementioned optical flow F1 and the aforementioned optical flow F2, the aforementioned optical flow F3 being the optical flow from the aforementioned third infrared image to the aforementioned first infrared image;
obtaining the aforementioned third infrared image by performing optical flow reverse mapping based on the aforementioned optical flow F3 and the aforementioned first infrared image.
Based on the above two infrared images and the shooting time information of the above three images, this application uses the optical flow method to convert the captured infrared image to the infrared image corresponding to the moment when the visible light image was captured, so as to realize the registration of the captured infrared image and the visible light image. Compared with the existing registration schemes based on processing images of different modalities (extracting the feature points of the infrared image and the visible light image respectively and realizing registration based on the feature points of the two), the present application realizes the registration of the infrared image and the visible light image through processing of infrared images of the same modality (based on the two captured infrared images), which improves the registration accuracy of the infrared image and the visible light image.
In a possible implementation, the aforementioned fusing of the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image includes:
extracting the color features of the aforementioned visible light image;
extracting the texture features of the aforementioned target infrared image;
obtaining the aforementioned fused image based on the aforementioned color features and the aforementioned texture features.
In this application, the visible light image contributes color features and the infrared image obtained after the above conversion contributes texture features; performing image fusion based on these color features and texture features can yield a clear and natural color image.
In a possible implementation, the aforementioned fusing of the aforementioned target infrared image and the aforementioned visible light image to obtain a fused image includes:
fusing the aforementioned target infrared image and the aforementioned visible light image through a first neural network to obtain the aforementioned fused image; the aforementioned first neural network includes a color extraction neural network and a texture extraction neural network, the aforementioned color extraction neural network being used to extract the color features of the aforementioned visible light image, and the aforementioned texture extraction neural network being used to extract the texture features of the aforementioned target infrared image.
In this application, extracting the color features of the visible light image and the texture features of the infrared image through trained neural networks can make the extracted features more accurate, so that the fused image is more natural and clear.
In a possible implementation, the resolution of the aforementioned color extraction neural network is not higher than a preset first resolution threshold and the number of layers of the aforementioned color extraction neural network is not lower than a preset first network depth threshold.
In a possible implementation, the resolution of the aforementioned texture extraction neural network is higher than a preset second resolution threshold and the number of layers of the aforementioned texture extraction neural network is lower than a preset second network depth threshold.
For the above first neural network provided by this application, since the resolution of the color extraction neural network in the first neural network is low and the number of layers of the texture extraction neural network in the first neural network is low, the computing power requirement of the entire first neural network is relatively low; compared with existing neural networks (for example, the Unet neural network) with high computing power requirements, the solution of this application has better hardware adaptability. In addition, since the resolution of the color extraction neural network is low, the visible light image input into the color extraction neural network may be a down-sampled image; part of the noise is eliminated during down-sampling and the noise in the down-sampled image is reduced, so using a low-resolution color extraction neural network can enhance the resistance to noise interference.
In a possible implementation, the aforementioned method is implemented by an image processing model, the aforementioned image processing model including the aforementioned first neural network and a second neural network;
the aforementioned second neural network is used to obtain the optical flow F1 from the aforementioned first infrared image to the aforementioned second infrared image and the optical flow F2 from the aforementioned second infrared image to the aforementioned first infrared image; the aforementioned optical flow F1 and the aforementioned optical flow F2 are used to calculate the aforementioned target infrared image;
the aforementioned first neural network and the aforementioned second neural network included in the aforementioned image processing model are obtained through end-to-end training.
In this application, end-to-end training can make the first neural network tolerant of calculation errors of the optical flow in the second neural network (the neural network that extracts the optical flow), and finally makes the trained image processing model more robust.
In a possible implementation, the training images of the aforementioned image processing model are collected in an environment whose illuminance is lower than a preset illuminance threshold.
In this application, since the training images of the image processing model are captured in a low-illuminance environment, the image processing model thus trained can learn the dark details, colors and textures of images, so that a clear color image can be obtained by fusion in a low-illuminance environment.
In a second aspect, the present application provides a photographing device, which includes a lens, a dimmer, a driving module and an imaging module; the aforementioned dimmer is located between the aforementioned lens and the aforementioned imaging module, and the aforementioned driving module is connected to the aforementioned dimmer;
the aforementioned lens is used to gather the light incident on the aforementioned lens onto the aforementioned dimmer;
the aforementioned dimmer includes an infrared bandpass filter, an infrared cut-off filter and a light-shielding sheet; the aforementioned infrared bandpass filter is used to let infrared light pass through and filter out visible light, the aforementioned infrared cut-off filter is used to let visible light pass through and filter out infrared light, and the aforementioned light-shielding sheet is used to prevent light from passing through;
the aforementioned driving module is used to drive the aforementioned dimmer to move, so that the light gathered on the aforementioned dimmer is incident on the aforementioned infrared bandpass filter during a first period, incident on the aforementioned infrared cut-off filter during a second period, and incident on the aforementioned light-shielding sheet during a third period and a fourth period;
the aforementioned imaging module is used to receive, during the aforementioned first period, the infrared light passing through the aforementioned infrared bandpass filter, and to obtain a first infrared image based on the received infrared light during the aforementioned third period; and to receive, during the aforementioned second period, the visible light passing through the aforementioned infrared cut-off filter, and to obtain a visible light image based on the received visible light during the aforementioned fourth period; the aforementioned first period, the aforementioned second period, the aforementioned third period and the aforementioned fourth period do not overlap.
In the photographing device provided by this application, the dimmer can flexibly adjust its direction, speed and mode of movement according to actual business requirements (based on the driving module), so that the light gathered on the dimmer hits the corresponding filter or light-shielding sheet in the corresponding period, thereby selecting different filters in different periods as required; that is, the photographing device can ultimately capture the corresponding infrared image or visible light image at the corresponding moment according to actual needs, making photographing very convenient and flexible.
In a possible implementation, the aforementioned dimmer is circular, and the aforementioned infrared bandpass filter, the aforementioned infrared cut-off filter and the aforementioned light-shielding sheet are fan-shaped; the aforementioned driving module is used to drive the aforementioned dimmer to rotate.
In a possible implementation, the aforementioned dimmer is polygonal, and the aforementioned infrared bandpass filter, the aforementioned infrared cut-off filter and the aforementioned light-shielding sheet are triangular; the aforementioned driving module is used to drive the aforementioned dimmer to rotate.
In a possible implementation, the aforementioned dimmer is rectangular, and the aforementioned infrared bandpass filter, the aforementioned infrared cut-off filter and the aforementioned light-shielding sheet are rectangular; the aforementioned driving module is used to drive the aforementioned dimmer to move.
In a possible implementation, the aforementioned infrared bandpass filter is adjacent to the aforementioned light-shielding sheet, and the aforementioned infrared cut-off filter is adjacent to the aforementioned light-shielding sheet.
In a possible implementation, the length of the aforementioned first period indicates the exposure duration of the aforementioned first infrared image; the length of the aforementioned second period indicates the exposure duration of the aforementioned visible light image.
In a possible implementation, the length of the aforementioned first period is related to the size of the light-shielding sheet, and the aforementioned light-shielding sheet is adjacent to the infrared bandpass filter.
In a possible implementation, the light gathered on the aforementioned dimmer hits a first infrared cut-off filter during the aforementioned second period;
the length of the aforementioned second period is related to the size of a second light-shielding sheet, the aforementioned second light-shielding sheet being adjacent to the first infrared cut-off filter; the aforementioned first infrared cut-off filter is one of the aforementioned at least one infrared cut-off filter; the aforementioned second light-shielding sheet is one of the aforementioned at least one light-shielding sheet; the aforementioned first light-shielding sheet is different from or the same as the aforementioned second light-shielding sheet.
In a possible implementation, the length of the aforementioned first period or the length of the aforementioned second period is related to the movement speed of the aforementioned dimmer.
In a possible implementation, the movement speed of the aforementioned dimmer is controlled by the aforementioned driving module.
In a possible implementation, the end moment of the aforementioned first period is the start moment of the aforementioned third period; the end moment of the aforementioned second period is the start moment of the aforementioned fourth period.
In a possible implementation, the end moment of the aforementioned first period is T0, and the end moment of the aforementioned second period is T1;
the aforementioned driving module is further used to drive the aforementioned dimmer to move, so that the light gathered on the aforementioned dimmer is incident on the aforementioned infrared bandpass filter during a fifth period, so that the aforementioned imaging module obtains a second infrared image; the end moment of the aforementioned fifth period is T2; wherein T0<T1<T2; the aforementioned first infrared image, the aforementioned visible light image and the aforementioned second infrared image are obtained by the photographing device shooting the same scene;
the aforementioned photographing device further includes a processor, the aforementioned processor being configured to execute the method described in any one of the above first aspect.
一种可能的实施方式中,前述调光片包括两个前述红外带通滤光片,一个前述红外截止滤光片和至少两个前述遮光片。In a possible implementation manner, the aforementioned dimmer includes two aforementioned infrared bandpass filters, one aforementioned infrared cut filter and at least two aforementioned light shielding filters.
一种可能的实施方式中,前述一个红外截止滤光片位于前述两个红外带通滤光片之间。In a possible implementation manner, the aforementioned one infrared cut filter is located between the aforementioned two infrared bandpass filters.
In a third aspect, the present application provides an image processing apparatus, the apparatus including:
an acquisition unit, configured to acquire a first infrared image, a visible light image, and a second infrared image, where the first infrared image, the visible light image, and the second infrared image are obtained by a photographing device shooting the same scene at moments T0, T1, and T2 respectively, with T0 < T1 < T2;
a computation unit, configured to compute a target infrared image based on the first infrared image and the second infrared image, where the target infrared image includes a third infrared image and/or a fourth infrared image; the third infrared image is the image obtained by converting the first infrared image from moment T0 to moment T1, and the fourth infrared image is the image obtained by converting the second infrared image from moment T2 to moment T1;
a fusion unit, configured to fuse the target infrared image with the visible light image to obtain a fused image.
In a possible implementation, the computation unit is specifically configured to:
compute the target infrared image using an optical flow method, based on the relationship among T0, T1, and T2, the first infrared image, and the second infrared image.
In a possible implementation, the target infrared image includes the third infrared image, and the computation unit is specifically configured to:
compute an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image;
compute an optical flow F3 based on the relationship among T0, T1, and T2 together with the optical flows F1 and F2, where F3 is the optical flow from the third infrared image to the first infrared image;
obtain the third infrared image by performing optical flow backward mapping based on the optical flow F3 and the first infrared image.
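As an illustration only: the claims do not fix how F3 is derived from F1, F2, and the three timestamps. One common choice, borrowed from the video frame-interpolation literature and assuming approximately linear motion between T0 and T2, is sketched below in Python; the function name and the quadratic combination rule are assumptions of the sketch, not part of this application.

```python
import numpy as np

def intermediate_flow(F1, F2, T0, T1, T2):
    """Approximate F3, the flow from the (virtual) third infrared image
    at T1 back to the first infrared image at T0, given F1 (first -> second
    infrared image) and F2 (second -> first infrared image).

    Assumes locally linear motion between T0 and T2; the combination rule
    below is the one used in video frame interpolation (e.g. Super SloMo)
    and is only one possible choice -- it is not fixed by the claims.
    """
    tau = (T1 - T0) / (T2 - T0)          # normalized position of T1 in [0, 1]
    return -(1.0 - tau) * tau * F1 + tau * tau * F2

# The third infrared image is then recovered by backward-mapping the first
# infrared image with F3 (see the backward-mapping sketch in the
# terminology section below).
```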
In a possible implementation, the fusion unit is specifically configured to:
extract color features of the visible light image;
extract texture features of the target infrared image;
obtain the fused image based on the color features and the texture features.
In a possible implementation, the fusion unit is specifically configured to:
fuse the target infrared image with the visible light image through a first neural network to obtain the fused image; the first neural network includes a color extraction neural network and a texture extraction neural network, where the color extraction neural network is used to extract the color features of the visible light image, and the texture extraction neural network is used to extract the texture features of the target infrared image.
In a possible implementation, the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold.
In a possible implementation, the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
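As a rough illustration of such a dual-branch design (all layer counts, channel widths, scale factors, and the fusion head below are invented for the sketch; the claims specify only the resolution and depth constraints), a PyTorch-style sketch might look as follows: a deep branch operating on a downsampled visible image for color, and a shallow branch operating at full resolution on the infrared image for texture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNet(nn.Module):
    """Sketch of the first neural network: a deep, low-resolution color
    branch plus a shallow, full-resolution texture branch. All sizes are
    illustrative assumptions, not values from the claims."""

    def __init__(self, ch: int = 32, color_depth: int = 8):
        super().__init__()
        # Color branch: runs at 1/4 resolution (below the "first resolution
        # threshold") but with many layers (above the "first depth threshold").
        layers = [nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(color_depth - 1):
            layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
        self.color = nn.Sequential(*layers)
        # Texture branch: full resolution, deliberately shallow.
        self.texture = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(2 * ch, 3, 3, padding=1)  # fuse into an RGB image

    def forward(self, visible: torch.Tensor, infrared: torch.Tensor):
        low = F.interpolate(visible, scale_factor=0.25, mode="bilinear",
                            align_corners=False)
        color = self.color(low)
        color = F.interpolate(color, size=infrared.shape[-2:], mode="bilinear",
                              align_corners=False)
        texture = self.texture(infrared)
        return self.head(torch.cat([color, texture], dim=1))

# Example: fused = FusionNet()(torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256))
```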
In a possible implementation, the operations performed by the apparatus are implemented by an image processing model, where the image processing model includes the first neural network and a second neural network;
the second neural network is used to obtain the optical flow F1 from the first infrared image to the second infrared image and the optical flow F2 from the second infrared image to the first infrared image; the optical flows F1 and F2 are used to compute the target infrared image;
the first neural network and the second neural network included in the image processing model are obtained through end-to-end training.
In a possible implementation, the training images of the image processing model are collected in an environment whose illuminance is lower than a preset illuminance threshold.
In a fourth aspect, the present application provides an image processing apparatus including a processor and a memory, configured to implement the method described in the first aspect and its possible implementations. The memory is coupled to the processor, and when the processor executes the computer program stored in the memory, the image processing apparatus implements the method described in the first aspect or any possible implementation of the first aspect.
The apparatus may further include a communication interface used by the apparatus to communicate with other apparatuses. For example, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface. The communication interface includes a receiving interface for receiving messages and a sending interface for sending messages.
In a possible implementation, the apparatus may include:
a memory, configured to store a computer program;
a processor, configured to read the computer program in the memory and perform the following operations:
acquiring a first infrared image, a visible light image, and a second infrared image, where the first infrared image, the visible light image, and the second infrared image are obtained by a photographing device shooting the same scene at moments T0, T1, and T2 respectively, with T0 < T1 < T2;
computing a target infrared image based on the first infrared image and the second infrared image, where the target infrared image includes a third infrared image and/or a fourth infrared image; the third infrared image is the image obtained by converting the first infrared image from moment T0 to moment T1, and the fourth infrared image is the image obtained by converting the second infrared image from moment T2 to moment T1;
fusing the target infrared image with the visible light image to obtain a fused image.
It should be noted that the computer program in the memory of this application may be stored in advance, or may be downloaded from the Internet and then stored when the apparatus is used; this application does not specifically limit the source of the computer program in the memory. The coupling in the embodiments of this application is an indirect coupling or connection between apparatuses, units, or modules, which may be electrical, mechanical, or in another form, and is used for information exchange between the apparatuses, units, or modules.
In a fifth aspect, the present application provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the method described in any one of the first aspect and its possible implementations is implemented.
In a sixth aspect, the present application provides a computer program product including a computer program. When the computer program is executed by a processor, the computer is caused to perform the method described in any one of the first aspect.
It can be understood that the apparatuses of the third and fourth aspects, the computer storage medium of the fifth aspect, and the computer program product of the sixth aspect are all used to perform the method provided in any one of the first aspect. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding method, and details are not repeated here.
Description of Drawings
The accompanying drawings used in the embodiments of the present application are introduced below.
FIG. 1 is a schematic diagram of a photographing device and its photographing principle according to an embodiment of the present application;
FIG. 2A, FIG. 2B, and FIG. 2C are schematic structural diagrams of dimmers according to embodiments of the present application;
FIG. 2D is a schematic diagram of image capture moments according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a dimmer according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the field-of-view space of a photographing device according to an embodiment of the present application;
FIG. 6A and FIG. 6B are schematic structural diagrams of an image fusion neural network according to embodiments of the present application;
FIG. 7 is a schematic structural diagram of an image processing model according to an embodiment of the present application;
FIG. 8 is a schematic diagram of images captured by a photographing device at different moments according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the logical structure of an image processing apparatus according to an embodiment of the present application;
FIG. 10 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description
The embodiments of the present application are described below with reference to the accompanying drawings.
The concepts of the technical terms involved in the present application are introduced first.
1. Optical flow.
Macroscopically, when the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This continuously changing information keeps "flowing" through the retina like a flow of light, hence the name optical flow. Optical flow expresses the change of an image; because it contains information about the motion of a target, it can be used by an observer to determine how the target is moving.
Microscopically, optical flow is the instantaneous velocity of the pixel motion of a spatially moving object on the imaging plane. Generally, optical flow arises from the movement of the foreground target itself in the scene, from the movement of the camera, or from the joint movement of both.
The representation of optical flow is also digital. For example, the optical flow from a pixel in a first image to the corresponding pixel in a second image can be expressed as (u, v). If the first image and the second image are in the same pixel coordinate system and, for ease of understanding, the time difference between their capture moments is assumed to be very short (so that it can be regarded as a unit time difference), then u represents the number of pixels by which the pixel in the second image has moved horizontally relative to the pixel in the first image, and v represents the number of pixels by which it has moved vertically.
2. Optical flow method.
The optical flow method uses the temporal variation of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, thereby computing the motion information of objects between adjacent frames.
3. Optical flow backward mapping.
Optical flow backward mapping is one implementation of the optical flow method. Backward mapping of optical flow usually means that, given the optical flow from a first image to a second image and the second image itself, the second image is converted through that optical flow to obtain the first image. For example, suppose the optical flow from a pixel A of the first image to the pixel A of the second image is expressed as (u, v), and the coordinates of pixel A in the second image are (x1, y1). Then, if the coordinates of pixel A in the first image are defined as (x2, y2), we have x2 = x1 - u and y2 = y1 - v. This is only a simple example to aid understanding of the principle; in a specific implementation, optical flow backward mapping also includes other routine processing details, such as interpolation, which are not described here.
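To make the mapping above concrete, the following is a minimal sketch (an illustration only, not part of the claimed method) of backward mapping with bilinear interpolation, assuming a dense flow field defined on the grid of the first image:

```python
import numpy as np

def backward_warp(second_image: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Reconstruct the first image by sampling the second image at
    (x + u, y + v), where (u, v) = flow[y, x] is the flow from the first
    image to the second image, defined on the first image's pixel grid."""
    h, w = second_image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    sx = np.clip(xs + flow[..., 0], 0, w - 1)   # x1 = x2 + u, i.e. x2 = x1 - u
    sy = np.clip(ys + flow[..., 1], 0, h - 1)   # y1 = y2 + v, i.e. y2 = y1 - v
    # Bilinear interpolation at the generally non-integer sample positions.
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = sx - x0, sy - y0
    img = second_image.astype(np.float32)
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```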
4. Heteromodal images.
Heteromodal images usually refer to multiple images obtained by imaging separately with light of different spectra (or frequencies). For example, infrared light and visible light differ in spectrum (or frequency): an infrared image can be obtained by infrared imaging, and a visible light image can be obtained by visible light imaging. The infrared image and the visible light image are therefore images of different modalities. Images of different modalities are usually expressed differently. For example, an infrared image is usually a grayscale image, in which the color of a pixel is represented by a grayscale value ranging from 0 to 255. A visible light image is a color image, in which the color of a pixel is represented by at least three color values, each ranging from 0 to 255.
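In array terms, the difference in expression can be pictured as follows (a trivial sketch; the image dimensions are arbitrary):

```python
import numpy as np

infrared = np.zeros((480, 640), dtype=np.uint8)    # grayscale: one value in [0, 255] per pixel
visible = np.zeros((480, 640, 3), dtype=np.uint8)  # color: three values in [0, 255] per pixel
```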
5. Image registration.
Image registration usually refers to the process of matching and superimposing two or more acquired images. Image registration is usually implemented based on feature matching. Taking the registration of two images as an example, the procedure is as follows: first, feature extraction is performed on the two images to be registered to obtain feature points, and matching feature point pairs are found through similarity measurement; then the spatial coordinate transformation parameters of the images are obtained from the matched feature point pairs; finally, the matching of the two images is completed using the coordinate transformation parameters.
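As a concrete illustration of this classical pipeline (using OpenCV's ORB features and a RANSAC-estimated homography, which are one common choice rather than anything mandated by this application), a minimal sketch:

```python
import cv2
import numpy as np

def register(moving: np.ndarray, fixed: np.ndarray) -> np.ndarray:
    """Warp `moving` onto `fixed` using the classical feature-matching pipeline."""
    orb = cv2.ORB_create(2000)                      # 1. extract feature points
    k1, d1 = orb.detectAndCompute(moving, None)
    k2, d2 = orb.detectAndCompute(fixed, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)       # 2. similarity measurement
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # 3. coordinate transformation parameters from the matched pairs
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = fixed.shape[:2]
    return cv2.warpPerspective(moving, H, (w, h))   # 4. complete the matching
```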
In order to obtain a relatively clear color image (that is, a color image with a relatively high signal-to-noise ratio) when shooting in a low-illuminance environment, a shooting solution combining visible light and infrared light imaging can be used. In this solution, a visible light image is obtained by visible light imaging and an infrared image is obtained by infrared light imaging, and the two are then fused to obtain the final output fused image. In theory, the fused image can combine the color information of the visible light image with the texture information of the infrared image, so the fused image should be a relatively clear color image. However, in this fusion process, the infrared image and the visible light image are heteromodal images, and the pixels of heteromodal images are usually expressed differently; therefore, when the feature-matching approach of the prior art is used to register heteromodal images, the accuracy is usually low. Especially when there is relatively fast relative movement between the camera and the photographed object, the accuracy of registering heteromodal images based on feature matching decreases further. The low registration accuracy between the visible light image and the infrared image causes problems such as ghosting in the final fused image; that is, the clarity of the final fused image is still poor.
To solve the above technical problems, the present application provides a photographing device and an image processing method.
The photographing device provided by the present application and its photographing principle are introduced first; for an example, see FIG. 1. As shown in FIG. 1, the photographing device 100 provided by the present application includes an infrared light emitter 101, a lens 102, a dimmer 103, and an imaging module 104.
The infrared light emitter 101 can be used to emit infrared light, which strikes the photographed object 120.
The lens 102 may be a convex lens, used to converge the light incident on the lens 102 (for example, the light reflected by the photographed object 120) onto the dimmer 103. The light converged on the dimmer 103 may be a light beam or a light spot.
The dimmer 103 is located between the lens 102 and the imaging module 104, and can be used to select among the light converged by the lens 102, thereby adjusting the light incident on the imaging module 104.
The imaging module 104 may include a photoelectric conversion unit and an image processing unit. The photoelectric conversion unit is used to receive, through a photosensitive surface, the light passing through the dimmer 103, and to convert the light image on the photosensitive surface into an electrical signal in a corresponding proportional relationship with the light image. The image processing unit is used to process the electrical signal obtained by the conversion to obtain an image. For example, the photoelectric conversion unit may be a photosensitive element such as an image sensor, and the image processing unit may be an image signal processor (ISP) or the like.
In the embodiments of the present application, in one possible implementation, the imaging module 104 may include one or more sensors. If it includes one sensor, that sensor can process both infrared light to obtain an infrared image and visible light to obtain a visible light image. If it includes multiple sensors, one of them can be used to process infrared light to obtain an infrared image, and another can be used to process visible light to obtain a visible light image.
Regarding the photographing principle of the photographing device 100, as shown in FIG. 1, after the infrared light emitted by the infrared light emitter 101 strikes the photographed object 120, it is reflected (or scattered). The reflected (or scattered) light is converged by the lens 102 of the photographing device 100 onto the dimmer 103, filtered by the dimmer 103, and then incident on the imaging module 104, where the incident light is processed into an image. In addition, it should be understood that, besides the infrared light emitted by the infrared light emitter 101, visible light in the natural environment that strikes the photographed object 120 and is reflected (or scattered) is also converged by the lens 102 onto the dimmer 103. This light (the light converged onto the dimmer 103) is likewise filtered by the dimmer 103, and only the light allowed through by the dimmer 103 can be incident on the imaging module 104.
The structure and working principle of the dimmer are described in detail below.
The dimmer 103 may include multiple filters and at least one light-shielding sheet. The multiple filters include at least one infrared bandpass filter and at least one infrared cut-off filter. Any infrared bandpass filter is used to let infrared light through and filter out visible light; any infrared cut-off filter is used to let visible light through and filter out infrared light; any light-shielding sheet is used to block light from passing through, that is, it filters out both visible light and infrared light.
It should be understood that the photographing device 100 further includes a drive module connected to the dimmer 103. The drive module can drive and control the movement of the dimmer 103 so that the light converged onto the dimmer 103 strikes one filter or one light-shielding sheet in different periods, thereby achieving the purpose of selecting and controlling the light converged onto the dimmer 103.
In one possible implementation, the dimmer 103 is circular. Several structures of the dimmer 103 in this implementation are described in detail below with reference to FIG. 2A, FIG. 2B, and FIG. 2C.
The structure of the dimmer 103 shown in FIG. 1 may be as shown in FIG. 2A. The dimmer 103 includes two infrared bandpass filters 1031 (infrared bandpass filter 1031-1 and infrared bandpass filter 1031-2), one infrared cut-off filter 1032, and three light-shielding sheets 1033 (1033-1, 1033-2, and 1033-3). In the example shown in FIG. 2A, a light-shielding sheet 1033 is arranged between any two adjacent filters among the three filters (the two infrared bandpass filters 1031 and the infrared cut-off filter 1032).
The structure of the dimmer 103 shown in FIG. 1 may also be as shown in FIG. 2B. The dimmer 103 includes one infrared bandpass filter 1031, one infrared cut-off filter 1032, and two light-shielding sheets 1033 (1033-1 and 1033-2).
The structure of the dimmer 103 shown in FIG. 1 may also be as shown in FIG. 2C. The dimmer 103 includes one infrared bandpass filter 1031, one infrared cut-off filter 1032, and one light-shielding sheet 1033.
Taking the structure shown in FIG. 2A as an example, and with reference to FIG. 2D, the working principles of the dimmer 103 and the photographing device are introduced below.
While the photographing device is working, the propagation path of the light converged by the lens 102 does not change, so the light converged onto the dimmer 103 usually strikes a physically fixed position on the dimmer 103 (see FIG. 1). Thus, when the dimmer 103, driven by the drive module, rotates about its center point O, the light converged onto the dimmer 103 strikes different sector regions of the dimmer 103 in different periods. Moreover, since the area of the light beam or light spot converged on the dimmer 103 is usually small, the converged light usually strikes only one sector region of the dimmer 103 in any one period. Different sector regions correspond to different filters or light-shielding sheets, so in this manner the dimmer 103 can select and filter the light converged onto it. In a specific implementation, the drive module may be a driving device such as a motor, which is not limited in the embodiments of the present application.
For example, in FIG. 2A and FIG. 2D, the light converged onto the dimmer 103 first strikes the sector region corresponding to infrared bandpass filter 1031-1 during a first period. In the first period, after the converged light passes through infrared bandpass filter 1031-1, only infrared light can pass and be incident on the imaging module 104. The length of the first period can be understood as the exposure duration of the infrared light. Then, after the first period ends, suppose the dimmer 103 has rotated so that the converged light strikes the sector region corresponding to light-shielding sheet 1033-2 during a second period. In the second period, light-shielding sheet 1033-2 prevents the light converged onto the dimmer 103 from reaching the imaging module 104; that is, the dimmer 103 blocks any light (both visible and infrared) from being incident on the imaging module 104. This also means that when the converged light strikes light-shielding sheet 1033-2 (that is, when the second period begins), the sheet acts like a shutter and ends the infrared exposure of the first period; therefore, the start of the second period can be understood as the end of the first period. It should be understood that when the converged light strikes light-shielding sheet 1033-2, the imaging module 104 is also triggered to process, during the second period, the infrared light received in the first period (the infrared light that passed through the dimmer 103) to obtain an infrared image, for example, a first infrared image. The end of the first period can be understood as the capture moment or imaging moment of the first infrared image. The length of the second period can be called the light-shielding duration of light-shielding sheet 1033-2.
Next, after the second period ends, the dimmer 103 rotates so that the converged light strikes the sector region corresponding to the infrared cut-off filter 1032 during a third period. In the third period, after the converged light passes through the infrared cut-off filter 1032, only visible light can pass and be incident on the imaging module 104. Similarly, the length of the third period can be understood as the exposure duration of the visible light. Then, as the dimmer 103 continues to rotate, the converged light strikes the sector region corresponding to light-shielding sheet 1033-3 during a fourth period. Likewise, when the fourth period begins, the imaging module 104 can be triggered to process, during the fourth period, the visible light received in the third period (the visible light that passed through the dimmer 103) to obtain a visible light image; the end of the third period can be understood as the capture moment of the visible light image. Similarly, the length of the fourth period can be called the light-shielding duration of light-shielding sheet 1033-3.
Then, the dimmer 103 can continue to rotate under the control of the drive module, so that the converged light strikes the sector regions corresponding to infrared bandpass filter 1031-2 and light-shielding sheet 1033-1 during a fifth period and a sixth period respectively, whereby the photographing device obtains a second infrared image. It should be understood that the exposure duration of the second infrared image is the length of the fifth period, and the capture moment of the second infrared image is the end of the fifth period; details are not repeated here.
After that, the dimmer 103 can continue the above rotation under the control of the drive module, so that the photographing device obtains further infrared images or visible light images.
It should be understood that several of the first to sixth periods may be consecutive but do not overlap one another. It should also be understood that the durations of the first to sixth periods may be the same or may differ from one another.
Based on the above introduction, for the example shown in FIG. 2A, two infrared images and one visible light image can be obtained in one revolution of the dimmer 103. For the example shown in FIG. 2B, one infrared image and one visible light image can be obtained in one revolution. Of course, for the example of FIG. 2B, two infrared images and one visible light image can also be obtained by driving the dimmer 103 through one and a half revolutions. Specifically, the dimmer 103 can be driven counterclockwise (as shown in FIG. 2B) through one and a half revolutions, so that the light converged onto the dimmer 103 strikes, in sequence, infrared bandpass filter 1031, light-shielding sheet 1033-2, infrared cut-off filter 1032, light-shielding sheet 1033-1, infrared bandpass filter 1031, and light-shielding sheet 1033-2. Based on the foregoing description, two infrared images and one visible light image can be obtained during these one and a half revolutions of the dimmer 103 shown in FIG. 2B.
In the examples of FIG. 2A and FIG. 2B, the drive module can drive the dimmer 103 to move in one direction throughout its operation, for example, always rotating clockwise or always rotating counterclockwise. In an actual implementation, while the dimmer 103 is working, the drive module can flexibly control its direction of motion based on service requirements. For example, in the example shown in FIG. 2C, the drive module can control and drive the dimmer 103 to rotate sometimes clockwise and sometimes counterclockwise. Specifically, the drive module can drive the dimmer 103 clockwise so that the converged light strikes, in sequence, infrared bandpass filter 1031, light-shielding sheet 1033, and infrared cut-off filter 1032; the drive module can then drive the dimmer 103 counterclockwise so that the converged light strikes light-shielding sheet 1033 and infrared bandpass filter 1031 again in sequence; after that, the drive module can drive the dimmer 103 clockwise again. It should be understood that during this movement of the dimmer 103, the photographing device can obtain, in sequence, an infrared image, a visible light image, an infrared image, a visible light image, and so on; details are not repeated here.
In one implementation, the size of a light-shielding sheet can be used to adjust the exposure duration of an infrared bandpass filter or an infrared cut-off filter. This is explained below using the example shown in FIG. 2A.
As shown in FIG. 2A, the exposure duration corresponding to a filter can be adjusted by adjusting the central angle of its sector. For example, adjusting central angle 1 of infrared bandpass filter 1031-1 adjusts the infrared exposure duration corresponding to that filter; adjusting central angle 2 of infrared bandpass filter 1031-2 adjusts the infrared exposure duration corresponding to that filter; and adjusting central angle 3 of infrared cut-off filter 1032 adjusts the visible light exposure duration corresponding to that filter. Since the central angles of all sectors sum to 360°, enlarging the central angle of a light-shielding sheet 1033 correspondingly shrinks the central angles of the filters (the infrared bandpass filters 1031 and the infrared cut-off filter 1032 are collectively referred to as filters); therefore, the exposure duration of a corresponding filter can also be adjusted by adjusting the central angle of each light-shielding sheet.
For example (see FIG. 2A), the central angle of light-shielding sheet 1033-1 can be reduced (or enlarged) to enlarge (or reduce) central angle 1 of infrared bandpass filter 1031-1, thereby increasing (or decreasing) the infrared exposure duration of filter 1031-1. Alternatively, the central angle of light-shielding sheet 1033-3 can be reduced (or enlarged) to enlarge (or reduce) central angle 2 of infrared bandpass filter 1031-2, thereby increasing (or decreasing) the infrared exposure duration of filter 1031-2. Alternatively, the central angle of light-shielding sheet 1033-2 can be reduced (or enlarged) to enlarge (or reduce) central angle 3 of infrared cut-off filter 1032, thereby increasing (or decreasing) the visible light exposure duration of filter 1032.
As another example, in FIG. 2A, the central angle of light-shielding sheet 1033-1 can be reduced (or enlarged) to enlarge (or reduce) central angle 2 of infrared bandpass filter 1031-2, thereby increasing (or decreasing) its infrared exposure duration; the central angle of light-shielding sheet 1033-3 can be reduced (or enlarged) to enlarge (or reduce) central angle 3 of infrared cut-off filter 1032, thereby increasing (or decreasing) its visible light exposure duration; and the central angle of light-shielding sheet 1033-2 can be reduced (or enlarged) to enlarge (or reduce) central angle 1 of infrared bandpass filter 1031-1, thereby increasing (or decreasing) its infrared exposure duration. The longer the infrared exposure duration, the more dark-region detail can be obtained in the image, that is, the clearer the image texture. The longer the visible light exposure duration, the more color information can be obtained, that is, the more natural the image colors.
The above describes how the light-shielding duration of each light-shielding sheet and the exposure duration of each filter can be controlled by adjusting the sizes of the light-shielding sheets. It should be noted that the exposure duration of each filter in the dimmer and the light-shielding duration of each light-shielding sheet can also be adjusted by controlling the movement speed of the dimmer. For example, in the circular dimmers shown in FIG. 2A to FIG. 2C, suppose the dimmer rotates once every t seconds; one revolution is 360°, so rotating through 1° takes t/360 seconds. Then, if a filter's sector has a central angle of θ°, the exposure duration of that filter is (θ*t)/360 seconds. Likewise, if a light-shielding sheet's sector has a central angle of φ°, its light-shielding duration is (φ*t)/360 seconds. The movement speed of the dimmer can be controlled by the drive module. Specifically, the drive module can control the dimmer 103 to rotate uniformly at a certain speed, for example, 20, 30, or 50 revolutions per second. Alternatively, the drive module can, according to service requirements, control the dimmer 103 to rotate at a first speed in one period and at a second speed in another period, for example, rotating at the first speed while the converged light strikes a filter and at the second speed while it strikes a light-shielding sheet, where the first speed differs from the second speed.
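As an arithmetic illustration of the relation above (the sector names follow the FIG. 2A ordering described earlier, but the central angles and rotation speed below are invented for the example and are not values from this application):

```python
from dataclasses import dataclass

@dataclass
class Sector:
    name: str          # e.g. a filter or a light-shielding sheet
    angle_deg: float   # central angle of the sector

def schedule(sectors: list[Sector], rev_time_s: float) -> None:
    """Print each sector's duration and period end, assuming uniform rotation.

    rev_time_s: time t for one full 360-degree revolution; a sector with
    central angle theta is illuminated for theta * t / 360 seconds.
    """
    assert abs(sum(s.angle_deg for s in sectors) - 360.0) < 1e-6
    elapsed = 0.0
    for s in sectors:
        duration = s.angle_deg * rev_time_s / 360.0
        elapsed += duration
        print(f"{s.name}: {duration * 1e3:.2f} ms, period ends at {elapsed * 1e3:.2f} ms")

# Hypothetical layout loosely following FIG. 2A, at 25 revolutions per second:
schedule([Sector("IR bandpass 1031-1", 100), Sector("shield 1033-2", 20),
          Sector("IR cut 1032", 120), Sector("shield 1033-3", 20),
          Sector("IR bandpass 1031-2", 80), Sector("shield 1033-1", 20)],
         rev_time_s=1 / 25)
```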
It should be understood that, on the basis of adjusting the exposure duration of each filter and the light-shielding duration of each light-shielding sheet by adjusting the sizes of the light-shielding sheets or by controlling the movement speed of the dimmer, the photographing device can conveniently control the capture moments of images. The following explanation builds on the earlier example describing the working principle of the dimmer shown in FIG. 2A, with reference to FIG. 2D. As shown in FIG. 2D, suppose the photographing device needs to capture the first infrared image, the visible light image, and the second infrared image at moments T0, T1, and T2 respectively; the drive module can then control the movement of the dimmer so that the first period ends at T0, the third period ends at T1, and the fifth period ends at T2. In other words, the photographing device 100 can conveniently capture the first infrared image, the visible light image, and the second infrared image.
Based on the above description, the dimmer 103 in the photographing device 100 can flexibly adjust its direction, speed, and manner of motion according to actual service requirements (via the drive module), so that the light converged onto the dimmer 103 strikes the corresponding filter or light-shielding sheet in the corresponding period. Different filters can thus be selected in different periods as required; ultimately, the photographing device 100 can capture the corresponding infrared or visible light image at the corresponding moment according to actual needs, making shooting very convenient and flexible.
In one possible implementation, the dimmer 103 shown in FIG. 1 may also be polygonal, for example triangular, quadrilateral, or hexagonal; correspondingly, the infrared bandpass filter 1031, the infrared cut-off filter 1032, and the light-shielding sheet 1033 may be triangular, quadrilateral, or hexagonal respectively. It should be understood that the specific dimming function and working principle of a polygonal dimmer are similar to those of the circular structure described above and are not repeated here. In these implementations, the rotation of the dimmer 103 (for example, its direction and speed) can likewise be driven and controlled by the drive module, so that the photographing device 100 can capture the corresponding infrared or visible light image at the corresponding moment according to actual needs; details are not repeated here.
In one possible implementation, the dimmer 103 shown in FIG. 1 may also be rectangular; correspondingly, the infrared bandpass filter 1031, the infrared cut-off filter 1032, and the light-shielding sheet 1033 may also be rectangular, as shown for example in FIG. 3. In this implementation, the drive module can drive the dimmer 103 to translate during operation so that the light converged onto the dimmer 103 strikes one filter or one light-shielding sheet in different periods.
The photographing principle when the dimmer 103 is rectangular is introduced below with reference to FIG. 3.
Driven by the drive module, the dimmer 103 shown in FIG. 3 can move in one direction (for example, a first direction), so that the light converged onto the dimmer 103 strikes, in sequence and in different periods, the rectangular regions labeled 1, 2, 3, 4, and 5. The drive module can then drive the dimmer 103 to move in the direction opposite to the first direction, so that the converged light strikes, in sequence, the rectangular regions labeled 4, 3, 2, and 1. Under this driving and control, suppose that while the dimmer 103 moves in the first direction, the converged light strikes infrared bandpass filter 1031-1, light-shielding sheet 1033-1, infrared cut-off filter 1032, light-shielding sheet 1033-2, and infrared bandpass filter 1031-2 during the first, second, third, fourth, and fifth periods respectively; and that while the dimmer 103 moves in the opposite direction, the converged light strikes light-shielding sheet 1033-2, infrared cut-off filter 1032, and light-shielding sheet 1033-1 during the sixth, seventh, and eighth periods respectively. In this process, the photographing device 100 can obtain a first infrared image at the end of the first period (or the start of the second period), a first visible light image at the end of the third period (or the start of the fourth period), a second infrared image at the end of the fifth period (or the start of the sixth period), a second visible light image at the end of the seventh period (or the start of the eighth period), and so on. The subsequent process is similar and is not repeated here.
Likewise, the length of the period during which the converged light strikes a filter can be used to indicate the exposure duration of the corresponding infrared or visible light image. For example, in the above example, the length of the first period can indicate the exposure duration of the first infrared image, and the length of the third period can indicate the exposure duration of the first visible light image. It can therefore be understood that the width of each filter affects the exposure duration of the corresponding image. Accordingly, with the overall width of the dimmer 103 fixed, the exposure duration of a filter can be adjusted by adjusting the widths of the rectangular regions of the light-shielding sheets. Taking FIG. 3 as an example, the width of light-shielding sheet 1033-1 can be narrowed (or widened) to increase (or decrease) the width of infrared bandpass filter 1031-1, thereby increasing (or decreasing) the infrared exposure duration of filter 1031-1. The same applies to the other filters and is not repeated here.
Similarly, the exposure duration of each filter and the light-shielding duration of each light-shielding sheet can also be controlled by controlling the movement speed of the dimmer 103, which is not detailed here. The movement speed of the dimmer 103 can likewise be controlled by the drive module. The present application does not limit the magnitude of the movement speed of the dimmer 103 or whether that speed varies.
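For the rectangular dimmer, the analogous relation (not spelled out above) is that a strip of width w swept past the light spot at a constant linear speed v stays illuminated for w/v seconds. A trivial illustration with invented widths and speed:

```python
# Hypothetical strip widths (mm) and a hypothetical linear speed (mm/s).
widths_mm = {"IR bandpass 1031-1": 8.0, "shield 1033-1": 2.0, "IR cut 1032": 10.0}
speed_mm_s = 400.0
for name, w in widths_mm.items():
    print(f"{name}: illuminated for {w / speed_mm_s * 1e3:.1f} ms")
```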
基于上述描述可知,对图3所示的调光片103同样可以灵活地根据业务的需求对聚集到所述调光片103上的光进行选择和控制,并能灵活控制各种光的曝光时长,因而最终可以使得拍摄装置100可以根据实际需求在相应的时间拍摄得到相应的红外图像或可见光图像,使得拍摄十分方便和灵活。Based on the above description, it can be seen that the dimmer 103 shown in FIG. 3 can also flexibly select and control the light gathered on the dimmer 103 according to business requirements, and can flexibly control the exposure time of various lights. , so that finally the photographing device 100 can photograph corresponding infrared images or visible light images at a corresponding time according to actual needs, making the photographing very convenient and flexible.
需要说明的是,上述调光片103不限于上述介绍的形态,还可以是其它的形态,只要包括上述滤光片和遮光片的调光装置均属于本申请保护的范围。调光片103中的滤光片和遮光片也不限于上述介绍的形状,还可以是其它的形状,例如圆形或多边形等等。还需说明的是,若调光片103为多边形或矩形,那么,图1中所示的圆形调光片103对应替换为多边形或矩形的调光片103,此处不再赘述。It should be noted that the above-mentioned dimmer 103 is not limited to the form described above, and can also be in other forms, as long as the dimmer including the above-mentioned light filter and light-shielding film falls within the protection scope of the present application. The light filter and light-shielding film in the dimmer 103 are not limited to the shapes described above, and may also be in other shapes, such as circular or polygonal. It should also be noted that if the dimmer 103 is polygonal or rectangular, then the circular dimmer 103 shown in FIG. 1 should be replaced with a polygonal or rectangular dimmer 103 , which will not be repeated here.
另外,本申请提供的拍摄装置100还可以包括除图1所示的其它部件,本申请对此不做 限定。例如,拍摄装置100包括处理器。该处理器可以用于执行本申请实施例提供的图像处理方法(可以示例性参见下面图4的描述,此处不赘述)。具体地,拍摄装置可以基于上述调光片、驱动模块和成像模块等模块拍摄同一场景,且分别在T0时刻、T1时刻和T2时刻拍摄得到第一红外图像,可见光图像和第二红外图像,然后基于处理器执行该本申请实施例提供的图像处理方法得到最终的融合图像(可以示例性参见下面图4的描述,此处不赘述)。再例如,拍摄装置100还可以包括壳体(图1中未画出),上述红外光发射器101、镜片102、调光片103和成像模块104可以示例性地按照图1所示的相对位置固定在所述壳体内。又例如,拍摄装置100还可以包括闪光灯、存储卡等等,不再赘述。In addition, the photographing device 100 provided in the present application may also include other components other than those shown in FIG. 1 , which is not limited in the present application. For example, the camera 100 includes a processor. The processor may be used to execute the image processing method provided in the embodiment of the present application (for an example, refer to the description in FIG. 4 below, which will not be repeated here). Specifically, the shooting device can shoot the same scene based on the above-mentioned dimmer, driving module, imaging module and other modules, and obtain the first infrared image, the visible light image and the second infrared image respectively at T0 time, T1 time and T2 time, and then The final fused image is obtained by executing the image processing method provided by the embodiment of the present application based on the processor (see the description in FIG. 4 below for an example, which will not be repeated here). For another example, the photographing device 100 may also include a casing (not shown in FIG. 1 ), and the above-mentioned infrared light emitter 101, lens 102, dimmer 103 and imaging module 104 may be exemplarily according to the relative positions shown in FIG. 1 fixed in the housing. For another example, the photographing device 100 may also include a flashlight, a memory card, etc., which will not be repeated here.
The above photographing device 100 may be any device having the above shooting structure and functions. For example, it may be a camera, any of various kinds of video cameras (such as surveillance or security cameras), or user equipment (UE) in various forms, such as a smartphone, tablet computer, handheld computer, or smart wearable device (including smart bands, smart watches, smart glasses, and the like), a mobile station (MS), terminal equipment, and so on.
The following describes an image processing method provided by this application. The method may be executed by an image processing apparatus. In actual implementation, the image processing apparatus may be the photographing device that captures the images, or another device with computing capability (such as a server or a processing chip) that is distinct from the photographing device. For example, if the executing entity is the photographing device, the imaging module in the photographing device may, after acquiring an image, send the acquired image to the processor of the photographing device for processing. If the executing entity is another device with computing capability, the photographing device, after acquiring an image through the imaging module, sends the acquired image to that device for processing.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of the image processing method provided by this application. The method includes, but is not limited to, the following steps:
S401. Acquire a first infrared image, a visible light image, and a second infrared image, where the first infrared image, the visible light image, and the second infrared image are obtained by the photographing device shooting the same scene at times T0, T1, and T2, respectively, with T0 < T1 < T2.
The above "scene" refers to the shooting scene of the photographing device, that is, the field-of-view space being shot. The field-of-view space (or shooting scene) contains one or more target objects to be photographed; a target object may be any object in that space, for example an animal, a plant, a vehicle, or a person. For ease of understanding, FIG. 5 shows an example of the field-of-view space. The space shown in FIG. 5 resembles a polyhedron: the space of the polyhedron bounded by face ABCD, edges AE, BH, CG, and DF, and face EFGH is the field-of-view space of the photographing device. For example, faces ABCD and EFGH may move relative to each other. FIG. 5 is only an example and does not limit the embodiments of this application.
The target object in the field-of-view space and the photographing device may move relative to each other. In one case, the photographing device is fixed while the target object moves within its field of view. For example, in a traffic-monitoring scene, the photographing device may be fixed on a light pole, and the target object in the captured field of view may be a vehicle moving on the road. In another possible case, the target object is stationary while the photographing device moves. In this case, the photographing device keeps its field of view covering all or part of the field-of-view space (or shooting scene) while it moves. For example, in an orbiting shot, the photographing device may rotate around a stage while always pointing at the center of the stage, so that its field of view covers all or part of the shooting scene. In still other cases, both the target object and the photographing device may be moving. Likewise, in this process, the field of view of the photographing device needs to keep covering all or part of the shooting scene or field-of-view space.
It should be understood that, since this method needs to fuse the captured infrared images with the captured visible light image, the three captured images need to contain at least partially the same scene content.
It should be noted that, in this embodiment, an image captured by the photographing device may contain a complete target object in the captured field-of-view space, or only part of a target object. For example, if the captured space contains a train, the captured image may include only part of the train. This can be adjusted according to actual needs and is not limited in this application.
It should also be noted that, in this step, "the photographing device shoots the same scene at times T0, T1, and T2" may mean either that the field-of-view spaces captured at the three moments are exactly the same or that they are partly the same. For example, when the photographing device remains stationary while capturing the three images, the field-of-view spaces captured at the three moments may be exactly the same. When the photographing device moves while capturing the three images, the field-of-view spaces captured at the three moments may be only partly the same. This is because the method realizes the fusion of the infrared and visible light images based on images captured at three moments, so at least one of the captured infrared images must share some content with the visible light image (otherwise fusion is impossible). Therefore, "shooting the same scene at three moments" may mean that the field-of-view spaces captured at the three moments are partly the same.
For example, the photographing device may be the photographing device 100 shown in FIG. 1 above. In one implementation, the photographing device may instead be a device that includes two shooting modules (for example, two separate sets of lenses and sensors), where one module captures infrared images and the other captures visible light images. In this implementation, a preset configuration can make the infrared module capture the first infrared image at T0 and the second infrared image at T2, and make the visible light module capture the visible light image at T1. In specific implementations, this application does not limit the specific form or structure of the photographing device.
In a specific implementation, regarding the above acquisition of the first infrared image, the visible light image, and the second infrared image: in one possible implementation, if the executing entity is the photographing device, the three images can be obtained by the photographing device shooting them. In another possible implementation, if the executing entity is another device with computing capability, the photographing device may, after capturing the first infrared image, the visible light image, and the second infrared image, send the obtained images to that device.
S402. Calculate a target infrared image based on the first infrared image and the second infrared image, where the target infrared image includes a third infrared image and/or a fourth infrared image; the third infrared image is an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image is an image obtained by converting the second infrared image from time T2 to time T1.
In a specific implementation, the image processing apparatus obtains the first infrared image, the visible light image, and the second infrared image, and learns the capture time T0 of the first infrared image, the capture time T1 of the visible light image, and the capture time T2 of the second infrared image. For example, if the image processing apparatus is the photographing device, it may record the capture times of the three images to obtain T0, T1, and T2. Alternatively, if the image processing apparatus is a separate processing device with computing capability, the photographing device may send the recorded T0, T1, and T2 to it.
Taking the case where the target infrared image includes both the third infrared image and the fourth infrared image as an example, the specific process of calculating the target infrared image based on the first and second infrared images is described below.
Specifically, the image processing apparatus calculates the optical flow F1 from the first infrared image to the second infrared image, and the optical flow F2 from the second infrared image to the first infrared image.
In one possible implementation, an optical flow extraction neural network may be used to compute F1 and F2: its inputs are the first and second infrared images, and its outputs are F1 and F2. For example, the optical flow extraction neural network may be a neural network with a Unet architecture. For example, it may be trained on the public Scene Flow synthetic dataset.
In another possible implementation, an existing optical flow computation method may be used to compute F1 and F2; this application does not limit the specific optical flow computation method. Existing optical flow methods include, for example, gradient-based methods, matching-based methods, frequency-domain (energy-based) methods, phase-based methods, and the Lucas-Kanade algorithm.
Further, since T0 < T1 < T2, the relationship between T0, T1, and T2 satisfies T1 = (1-k)*T0 + k*T2, where 0 < k < 1. After obtaining F1 and F2, the image processing apparatus can, based on F1, F2, and T1 = (1-k)*T0 + k*T2, calculate the optical flow F3 from the infrared image corresponding to time T1 (i.e., the third infrared image) to the first infrared image, and the optical flow F4 from the infrared image corresponding to time T1 (i.e., the fourth infrared image) to the second infrared image. For example:
F3 = -(1-k)*k*F1 + k^2*F2
F4 = (1-k)^2*F1 - k*(1-k)*F2
For example, if k = 1/2, then F3 = 0.25*(F2-F1) and F4 = 0.25*(F1-F2).
After F3 and F4 are obtained, optical flow backward warping can be applied: based on F3, the first infrared image is backward-mapped to obtain the third infrared image. Similarly, based on F4, the second infrared image is backward-mapped to obtain the fourth infrared image.
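For illustration, the following is a minimal Python sketch (using NumPy and OpenCV) of the two steps above: computing F3 and F4 from F1 and F2, then backward-warping the captured infrared images. The function names and the H x W x 2 flow layout (x/y pixel offsets) are assumptions made for this example, not part of the patented method itself.

    import numpy as np
    import cv2

    def intermediate_flows(f1, f2, k):
        # F1: first IR -> second IR; F2: second IR -> first IR;
        # T1 = (1-k)*T0 + k*T2 with 0 < k < 1, per the formulas above.
        f3 = -(1.0 - k) * k * f1 + (k ** 2) * f2         # third IR -> first IR
        f4 = ((1.0 - k) ** 2) * f1 - k * (1.0 - k) * f2  # fourth IR -> second IR
        return f3, f4

    def backward_warp(src, flow):
        # Sample src at positions displaced by flow: out(p) = src(p + flow(p)).
        h, w = flow.shape[:2]
        grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (grid_x + flow[..., 0]).astype(np.float32)
        map_y = (grid_y + flow[..., 1]).astype(np.float32)
        return cv2.remap(src, map_x, map_y, interpolation=cv2.INTER_LINEAR)

    # Usage with k = 1/2 (T1 midway between T0 and T2):
    # f3, f4 = intermediate_flows(f1, f2, 0.5)
    # third_ir = backward_warp(first_ir, f3)
    # fourth_ir = backward_warp(second_ir, f4)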
As the above process shows, the image processing apparatus can obtain the target infrared image by optical flow backward warping, based on the relationship between T0, T1, and T2 and the optical flow between the first and second infrared images. It should be noted that calculating the target infrared image from the first and second infrared images is not limited to the backward-warping approach described above; optical flow methods in other implementations may also be used for the calculation, which this application does not limit.
In the above process, the converted third infrared image is the image that the first infrared image corresponds to at time T1; that is, the third infrared image can be understood as the image of the shooting scene at time T1, and the visible light image is also the image of that scene at time T1. Therefore, converting the first infrared image from time T0 to time T1 amounts to registering the first infrared image with the visible light image. In existing technical solutions, infrared and visible light images are generally registered directly by feature matching; because infrared and visible light images are of different modalities, registration based on processing two cross-modal images (extracting feature points from both) has poor accuracy. In contrast, this application uses conversion between infrared images of the same modality (for example, converting the first infrared image from T0 to T1 based on the relationship between T0, T1, and T2 and the optical flow between the first and second infrared images) to register the infrared image with the visible light image, which improves the registration accuracy of the infrared and visible light images and thus the quality of the final fused image.
S403. Fuse the target infrared image with the visible light image to obtain a fused image.
In one possible implementation, the fusion of the visible light image and the target infrared image may be realized by an image fusion neural network. The image fusion neural network may include a color extraction neural network and a texture extraction neural network, where the color extraction neural network extracts the color features of the visible light image and the texture extraction neural network extracts the texture features of the target infrared image. Image fusion is then performed based on the extracted color and texture features to obtain the fused image.
With reference to FIG. 6A and FIG. 6B, the case where the target infrared image includes only the third infrared image, and the case where it includes both the third and fourth infrared images, are described in detail below.
Referring to FIG. 6A as an example, a schematic flow of the image fusion neural network is shown for the case where the target infrared image contains the third infrared image. As shown in FIG. 6A, the visible light image is input into the color extraction neural network, which extracts the color features of the visible light image; the grayscale map of the visible light image is also extracted. A color index table is then constructed based on the extracted color features and the grayscale map. Specifically, the color index table may include the values of the several colors contained in the visible light image and an index for each of those colors, so that the corresponding color value can be looked up in the table by its index. For example, a color value may be the value of the three primary colors RGB, or RGB plus a constant, and so on; this application does not limit the specific representation of color.
In addition, as shown in FIG. 6A, after the third infrared image is input into the texture extraction neural network, the network extracts the texture features of the third infrared image and generates a texture guide map based on them. The texture guide map contains a color index for each pixel of the final output fused image.
After the color index table and the texture guide map are obtained, the image fusion neural network generates the fused image based on them. Specifically, for each pixel, the color index in the texture guide map is used to look up that pixel's color value in the color index table, and the value found is filled into the pixel's position to generate the fused image.
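For illustration, the lookup step above can be sketched in a few lines of Python, assuming the color index table is stored as an (N, 3) array of color values and the texture guide map as an (H, W) array of integer indices into that table; these layouts and names are assumptions made for this example.

    import numpy as np

    def fuse_from_index(color_table, guide_map):
        # Fill each output pixel with the table color selected by its index;
        # NumPy fancy indexing performs the per-pixel lookup in one step.
        return color_table[guide_map]  # -> (H, W, 3) fused image

    # Usage (illustrative):
    # color_table = color_network_output   # (N, 3) color values
    # guide_map = texture_network_output   # (H, W) integer color indices
    # fused = fuse_from_index(color_table, guide_map)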
For example, the color extraction neural network may be a small, low-resolution deep neural network. In one possible implementation, the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and its number of layers is not lower than a preset first network depth threshold.
The preset first resolution threshold may be, for example, video graphics array (VGA) resolution. For example, the low resolution may be quarter video graphics array (QVGA) resolution, or it may be VGA resolution. Using a low-resolution color extraction neural network can improve robustness to noise; if the accuracy of the extracted color boundaries needs to be improved, the resolution of the color extraction neural network can be raised accordingly. The preset first network depth threshold may be 20, so the color extraction neural network may have 20 to 30 layers.
For example, the texture extraction neural network is usually a higher-resolution shallow neural network. In one possible implementation, its resolution is higher than a preset second resolution threshold, and its number of layers is lower than a preset second network depth threshold. For example, the preset second resolution threshold may be VGA resolution, and the preset second network depth threshold may be any integer between 5 and 10. For example, the resolution of the texture extraction neural network may be the resolution of the original captured infrared image, and it may have 3 to 5 layers, and so on.
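For illustration, two sub-networks sized roughly as described above can be sketched in PyTorch as follows: a deep color branch intended to run on a downsampled (e.g., 256x256) input, and a shallow texture branch intended to run at the infrared image's native resolution. The layer types, widths, and the 64-channel output are assumptions made for this example, not the application's specified architecture.

    import torch.nn as nn

    def make_color_net(depth=20, width=32):
        # Deep but low-resolution: run on a downsampled visible light image.
        layers = [nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, 64, 3, padding=1)]  # color features
        return nn.Sequential(*layers)

    def make_texture_net(depth=4, width=16):
        # Shallow but full-resolution: run on the infrared image itself.
        layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, 64, 3, padding=1)]  # per-pixel index logits
        return nn.Sequential(*layers)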
It should be understood that, when the target infrared image includes only the fourth infrared image, the final fused image can be obtained in the same way as above, which is not repeated here.
As shown in FIG. 6B, a schematic flow of the image fusion neural network is shown for the case where the target infrared image contains both the third infrared image and the fourth infrared image.
Likewise, the image fusion neural network may include a color extraction neural network and a texture extraction neural network. The color extraction neural network extracts the color features of the visible light image; the texture extraction neural network extracts the texture features of the third infrared image and the texture features of the fourth infrared image. Image fusion is then performed based on the extracted color features and the texture features of the two infrared images to obtain the fused image.
Similarly, in FIG. 6B, the color index table is obtained from the visible light image via the color extraction neural network. However, the process by which the texture extraction neural network generates the texture guide map of the fused image differs slightly: here, the texture extraction neural network fuses the texture features of the third infrared image with those of the fourth infrared image to obtain the texture guide map of the fused image. The fused image is then obtained from the color index table and the texture guide map; this process is not repeated.
Compared with existing neural networks (for example, the Unet neural network), the image fusion neural network provided by this application has better hardware adaptability. Specifically, a general-purpose Unet network is typically used for pixel-level tasks, with a compute cost on the order of 100K operations per pixel. For a 4K (about 8M-pixel) image at 30 FPS (frames per second), the theoretical compute requirement of a Unet is therefore about 8M * 100K * 30 = 24 TOPS (tera operations per second). In the image fusion neural network provided by this application, the color extraction neural network may have a resolution of 256*256 and require 0.2 TOPS, while the texture extraction neural network has a resolution of 8M and requires 1.2 TOPS. The total compute of the image fusion neural network provided by this application is thus under 2 TOPS, far less than that of the Unet network. That is, the image fusion neural network provided by this application requires far less hardware compute than the Unet network and therefore has better hardware adaptability.
In one possible implementation, the fusion of the target infrared image and the visible light image may also be achieved with existing image fusion techniques; this application does not limit the specific image fusion method.
In one possible implementation, this application provides an image processing model that implements the above image processing method. For an example, see FIG. 7, which shows a schematic structural diagram of the model. As shown in FIG. 7, the image processing model 700 (it should be understood that the "image processing model" here may also be called an "image processing algorithm" or "image processing module") includes an image acquisition module 710, an optical flow extraction neural network 720, an optical flow backward-warping processing module 730, and an image fusion neural network 740, where:
The image acquisition module 710 is configured to acquire the first infrared image, the visible light image, and the second infrared image. For details of the acquisition, refer to the corresponding description in step S401 above, not repeated here.

The optical flow extraction neural network 720 is configured to extract the optical flow between the first infrared image and the second infrared image. For a specific implementation, refer to the description in step S402 above, not repeated here.

The optical flow backward-warping processing module 730 is configured to perform backward warping of images based on the extracted optical flow between the first and second infrared images, to obtain the aforementioned target infrared image. For a specific implementation, refer to the description in step S402 above, not repeated here.

The image fusion neural network 740 is configured to fuse the visible light image and the target infrared image to obtain the fused image. For a specific implementation, refer to the description in step S403 above, not repeated here.
In a specific implementation, because the processing operations in the optical flow backward-warping module, which sits between the optical flow extraction neural network and the image fusion neural network, are linear and differentiable, the whole image processing model can be trained end to end. That is, throughout training, the input is three images (for example, the first infrared image, the visible light image, and the second infrared image above, captured by the photographing device shooting the same scene at times T0, T1, and T2, respectively) and the output is the fused image; the output fused image is then fed back into each neural network so that the parameters are gradually corrected. Such end-to-end training allows the image fusion neural network to tolerate the optical-flow computation errors of the optical flow extraction neural network, ultimately making the trained image processing model more robust.
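For illustration, such an end-to-end training step can be sketched in PyTorch as follows, with flow_net and fusion_net standing in for the optical flow extraction and image fusion neural networks. The differentiable warp below uses grid_sample, so the loss on the fused image also updates the flow network; the L1 loss, the coordinate conventions, and all names are assumptions made for this example rather than the application's specification.

    import torch
    import torch.nn.functional as F

    def differentiable_warp(src, flow):
        # Backward-warp src (N, C, H, W) by flow (N, 2, H, W, x/y pixel offsets).
        n, _, h, w = src.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        grid = torch.stack((xs, ys), dim=0).to(src).unsqueeze(0) + flow
        # Normalize pixel coordinates to [-1, 1] as grid_sample expects.
        gx = 2.0 * grid[:, 0] / (w - 1) - 1.0
        gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
        return F.grid_sample(src, torch.stack((gx, gy), dim=-1), align_corners=True)

    def train_step(flow_net, fusion_net, optimizer, ir0, vis1, ir2, target, k=0.5):
        f1 = flow_net(ir0, ir2)                    # flow: first IR -> second IR
        f2 = flow_net(ir2, ir0)                    # flow: second IR -> first IR
        f3 = -(1 - k) * k * f1 + k ** 2 * f2       # flows to time T1, as derived above
        f4 = (1 - k) ** 2 * f1 - k * (1 - k) * f2
        third_ir = differentiable_warp(ir0, f3)
        fourth_ir = differentiable_warp(ir2, f4)
        fused = fusion_net(vis1, third_ir, fourth_ir)
        loss = F.l1_loss(fused, target)            # supervise only the fused image
        optimizer.zero_grad()
        loss.backward()                            # gradients reach flow_net through the warp
        optimizer.step()
        return loss.item()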
In one possible implementation, the training images of the image processing model may be infrared and visible light images collected in an environment whose illuminance is below a preset illuminance threshold, that is, a low-illuminance environment. For example, a low-illuminance environment may be one with illuminance below 10 lux; that is, the preset illuminance threshold may be, for example, 10. Alternatively, the low-illuminance environment may be determined according to an illuminance standard, in which case the preset illuminance threshold may be determined by the low-illuminance value in that standard; this application does not limit this. Because the training images of the image processing model are captured in a low-illuminance environment, the trained model can learn the dark-region details, colors, and textures of such images, and can therefore produce clear color images by fusion in low-illuminance environments.
In summary, this application first acquires a first infrared image, a visible light image, and a second infrared image (the three images being obtained by the photographing device shooting the same scene at times T0, T1, and T2, respectively), and then, based on the two infrared images and the capture-time information of the three images, uses the optical flow method to convert a captured infrared image to the infrared image corresponding to the moment the visible light image was captured, thereby registering the captured infrared image with the visible light image. Compared with existing registration schemes based on processing cross-modal images (extracting feature points from the infrared and visible light images separately and registering based on both sets of feature points), this application registers the infrared and visible light images by processing same-modality infrared images (based on the two captured infrared images), which improves registration accuracy and therefore makes the final fused image clearer.
In addition, in the solution provided by this application, when shooting a fast-moving object, the scheme in which the target infrared image includes both the third and fourth infrared images (i.e., fusion based on three images) can, compared with the scheme in which the target infrared image includes only the third or only the fourth infrared image (i.e., fusion based on two images), avoid incomplete sharp regions in the fused image, thereby yielding a fused image with richer, more natural color and detail. This case is described in detail below with reference to FIG. 8.
As shown in FIG. 8, assume the position of the photographing device is fixed and a target object in the shooting scene moves in one direction. For example, assume the target object comprises six parts labeled 1, 2, 3, 4, 5, and 6. Because the photographing device's field of view is limited, assume each shot can capture only two of the six parts. Then, as the target object keeps moving, parts 1 and 2 are captured at time T0, parts 2 and 3 at time T1, and parts 3 and 4 at time T2. Between T0 and T1, and between T1 and T2, there are periods during which the shading sheet blocks the light. Because the target object is moving, it is displaced between the two images captured before and after each blocked period, so one of the two images is missing a part of the object present in the other. See FIG. 8 for the captured images.
Then, based on the foregoing description, the first infrared image captured at T0 is converted to T1 to obtain the third infrared image corresponding to the first infrared image at T1, and the second infrared image captured at T2 is converted to T1 to obtain the fourth infrared image corresponding to the second infrared image at T1. As can be seen in FIG. 8, the third infrared image retains part 2 of the subject but, compared with the first infrared image, lacks part 3. Moreover, since the third infrared image is an infrared image (contributing mainly texture features), a two-image fusion scheme, for example one in which the target infrared image includes only the third infrared image, which is fused with the visible light image captured at T1, would yield a fused image lacking the sharp texture features of part 3 of the subject, so some regions of the fused image would not be sharp enough. It should be understood that the size of this "partial region" depends on the speed of the target object and the capture intervals of the three images: if the object moves slowly, or the capture intervals are short, the unsharp region will be small and will not affect the overall image.
If the three-image fusion scheme described above is used (i.e., the target infrared image contains both the third and fourth infrared images), then, as can be seen in FIG. 8, the fourth infrared image includes part 3 of the subject. Fusing the three images, the third infrared image, the visible light image captured at T1, and the fourth infrared image, therefore gives the final fused image complete, sharp texture, making all regions of the image sharp, and thus yields a clearer, more natural color image than the two-frame fusion scheme above.
The image processing method provided by the embodiments of this application has been described above. It can be understood that, to realize the corresponding functions, the image processing apparatus includes hardware structures and/or software modules corresponding to each function. Combining the units and steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the device may be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is merely a logical functional division; other divisions are possible in actual implementation.
In the case where each functional module is divided according to its function, FIG. 9 shows a specific schematic logical structure of an apparatus, which may be the image processing apparatus in the above method embodiments. The image processing apparatus 900 includes:
an acquisition unit 901, configured to acquire a first infrared image, a visible light image, and a second infrared image, where the first infrared image, the visible light image, and the second infrared image are obtained by a photographing device shooting the same scene at times T0, T1, and T2, respectively, with T0 < T1 < T2;

a calculation unit 902, configured to calculate a target infrared image based on the first infrared image and the second infrared image, where the target infrared image includes a third infrared image and/or a fourth infrared image, the third infrared image being an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image being an image obtained by converting the second infrared image from time T2 to time T1; and

a fusion unit 903, configured to fuse the target infrared image and the visible light image to obtain a fused image.
In one possible implementation, the calculation unit 902 is specifically configured to:

use the optical flow method to calculate the target infrared image according to the relationship among T0, T1, and T2, and according to the first infrared image and the second infrared image.
In one possible implementation, the target infrared image includes the third infrared image, and the calculation unit 902 is specifically configured to:

calculate an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image;

calculate an optical flow F3 based on the relationship among T0, T1, and T2, the optical flow F1, and the optical flow F2, where the optical flow F3 is the optical flow from the third infrared image to the first infrared image; and

obtain the third infrared image by performing optical flow backward warping based on the optical flow F3 and the first infrared image.
In one possible implementation, the fusion unit 903 is specifically configured to:

extract color features of the visible light image;

extract texture features of the target infrared image; and

obtain the fused image based on the color features and the texture features.
In one possible implementation, the fusion unit 903 is specifically configured to:

fuse the target infrared image and the visible light image through a first neural network to obtain the fused image, where the first neural network includes a color extraction neural network and a texture extraction neural network, the color extraction neural network being used to extract the color features of the visible light image and the texture extraction neural network being used to extract the texture features of the target infrared image.
In one possible implementation, the resolution of the color extraction neural network is lower than a preset first resolution threshold and the number of layers of the color extraction neural network is higher than a preset first network depth threshold.
In one possible implementation, the resolution of the texture extraction neural network is higher than a preset second resolution threshold and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
In one possible implementation, the operations performed by the apparatus are implemented by an image processing model, the image processing model including the first neural network and a second neural network;

the second neural network is used to obtain an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image, where the optical flow F1 and the optical flow F2 are used to calculate the target infrared image; and

the first neural network and the second neural network included in the image processing model are obtained through end-to-end training.
In one possible implementation, the training images of the image processing model are collected in an environment whose illuminance is lower than a preset illuminance threshold.
For the specific operations and beneficial effects of the units of the apparatus 900 shown in FIG. 9, refer to the corresponding descriptions in FIG. 4 and its specific method embodiments above, not repeated here.
FIG. 10 is a schematic diagram of a specific hardware structure of an apparatus provided by this application; the apparatus may be the image processing apparatus described in the above embodiments. The image processing apparatus 1000 includes a processor 1001, a memory 1002, and a communication interface 1003. The processor 1001, the communication interface 1003, and the memory 1002 may be connected to one another, including through a bus 1004.
For example, the memory 1002 is used to store the computer programs and data of the image processing apparatus 1000; the memory 1002 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and so on.
The communication interface 1003 includes a sending interface and a receiving interface; there may be multiple communication interfaces 1003, used to support the image processing apparatus 1000 in communicating, for example in receiving or sending data or messages.
For example, the processor 1001 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination that implements a computing function, for example a combination including one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The processor 1001 may be used to read the programs stored in the memory 1002, so that the image processing apparatus 1000 executes the image processing method described in FIG. 4 and its specific embodiments above.
In one possible implementation, the processor 1001 may be used to read the programs stored in the memory 1002 and perform the following operations: acquiring a first infrared image, a visible light image, and a second infrared image, where the three images are obtained by a photographing device shooting the same scene at times T0, T1, and T2, respectively, with T0 < T1 < T2; calculating a target infrared image based on the first infrared image and the second infrared image, where the target infrared image includes a third infrared image and/or a fourth infrared image, the third infrared image being an image obtained by converting the first infrared image from time T0 to time T1 and the fourth infrared image being an image obtained by converting the second infrared image from time T2 to time T1; and fusing the target infrared image and the visible light image to obtain a fused image.
For the specific operations and beneficial effects of the units of the image processing apparatus 1000 shown in FIG. 10, refer to the corresponding descriptions in FIG. 4 and its specific method embodiments above, not repeated here.
The embodiments of this application further provide a computer-readable storage medium storing a computer program, where the computer program is executed by a processor to implement the method described in any one of FIG. 4 and its specific method embodiments.
The embodiments of this application further provide a computer program product; when the computer program product is read and executed by a computer, the method described in any one of FIG. 4 and its specific method embodiments is performed.
Finally, it should be noted that the above embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.

Claims (32)

  1. An image processing method, characterized in that the method comprises:
    acquiring a first infrared image, a visible light image, and a second infrared image, wherein the first infrared image, the visible light image, and the second infrared image are obtained by a photographing device shooting the same scene at times T0, T1, and T2, respectively, with T0 < T1 < T2;
    calculating a target infrared image based on the first infrared image and the second infrared image, wherein the target infrared image comprises a third infrared image and/or a fourth infrared image, the third infrared image being an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image being an image obtained by converting the second infrared image from time T2 to time T1; and
    fusing the target infrared image and the visible light image to obtain a fused image.
  2. The method according to claim 1, characterized in that the calculating a target infrared image based on the first infrared image and the second infrared image comprises:
    using the optical flow method to calculate the target infrared image according to the relationship among T0, T1, and T2, and according to the first infrared image and the second infrared image.
  3. 根据权利要求2所述的方法,其特征在于,所述目标红外图像包括所述第三红外图像;The method according to claim 2, wherein the target infrared image comprises the third infrared image;
    所述采用光流法,根据所述T0,所述T1与所述T2三者的关系,以及所述第一红外图像和所述第二红外图像,计算得到所述目标红外图像,包括:The optical flow method is used to calculate the target infrared image according to the T0, the relationship between the T1 and the T2, and the first infrared image and the second infrared image, including:
    计算从所述第一红外图像到所述第二红外图像的光流F1以及从所述第二红外图像到所述第一红外图像的光流F2;calculating an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image;
    基于所述T0,所述T1与所述T2三者的关系、所述光流F1和所述光流F2计算光流F3,所述光流F3为从所述第三红外图像到所述第一红外图像的光流;Based on the relationship among T0, T1 and T2, the optical flow F1 and the optical flow F2, the optical flow F3 is calculated, and the optical flow F3 is from the third infrared image to the first infrared image. Optical flow of an infrared image;
    基于所述光流F3和所述第一红外图像进行光流反向映射获得所述第三红外图像。Performing optical flow reverse mapping based on the optical flow F3 and the first infrared image to obtain the third infrared image.
  4. The method according to any one of claims 1-3, characterized in that fusing the target infrared image and the visible light image to obtain the fused image comprises:
    extracting color features of the visible light image;
    extracting texture features of the target infrared image;
    obtaining the fused image based on the color features and the texture features.
  5. The method according to claim 4, characterized in that fusing the target infrared image and the visible light image to obtain the fused image comprises:
    fusing the target infrared image and the visible light image through a first neural network to obtain the fused image, the first neural network comprising a color extraction neural network for extracting the color features of the visible light image and a texture extraction neural network for extracting the texture features of the target infrared image.
  6. The method according to claim 5, characterized in that the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold.
  7. The method according to claim 5 or 6, characterized in that the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
  8. The method according to any one of claims 5-7, characterized in that the method is implemented by an image processing model comprising the first neural network and a second neural network;
    the second neural network is used to obtain an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image, the optical flow F1 and the optical flow F2 being used to compute the target infrared image;
    the first neural network and the second neural network included in the image processing model are obtained through end-to-end training.
  9. The method according to claim 8, characterized in that the training images of the image processing model are collected in an environment whose illuminance is lower than a preset illuminance threshold.
  10. A photographing device, characterized in that the photographing device comprises a lens, a dimmer sheet, a driving module, and an imaging module; the dimmer sheet is located between the lens and the imaging module, and the driving module is connected to the dimmer sheet;
    the lens is used to focus light incident on the lens onto the dimmer sheet;
    the dimmer sheet comprises an infrared band-pass filter, an infrared cut-off filter, and a shading sheet, wherein the infrared band-pass filter is used to let infrared light pass through and filter out visible light, the infrared cut-off filter is used to let visible light pass through and filter out infrared light, and the shading sheet is used to block light from passing through;
    the driving module is used to drive the dimmer sheet to move, so that the light focused on the dimmer sheet is incident on the infrared band-pass filter in a first period, incident on the infrared cut-off filter in a second period, and incident on the shading sheet in a third period and a fourth period;
    the imaging module is used to receive the infrared light passing through the infrared band-pass filter in the first period and obtain a first infrared image based on the received infrared light in the third period, and to receive the visible light passing through the infrared cut-off filter in the second period and obtain a visible light image based on the received visible light in the fourth period; the first period, the second period, the third period, and the fourth period do not overlap.
  11. The photographing device according to claim 10, characterized in that the dimmer sheet is circular, and the infrared band-pass filter, the infrared cut-off filter, and the shading sheet are fan-shaped; the driving module is used to drive the dimmer sheet to rotate.
  12. The photographing device according to claim 10, characterized in that the dimmer sheet is polygonal, and the infrared band-pass filter, the infrared cut-off filter, and the shading sheet are triangular or quadrilateral; the driving module is used to drive the dimmer sheet to rotate.
  13. The photographing device according to claim 10, characterized in that the dimmer sheet is rectangular, and the infrared band-pass filter, the infrared cut-off filter, and the shading sheet are rectangular; the driving module is used to drive the dimmer sheet to move.
  14. The photographing device according to any one of claims 10-13, characterized in that the infrared band-pass filter is adjacent to the shading sheet, and the infrared cut-off filter is adjacent to the shading sheet.
  15. The photographing device according to any one of claims 10-14, characterized in that the length of the first period indicates the exposure duration of the first infrared image, and the length of the second period indicates the exposure duration of the visible light image.
  16. The photographing device according to any one of claims 10-15, characterized in that the length of the first period is related to the size of the shading sheet, the shading sheet being adjacent to the infrared band-pass filter.
  17. The photographing device according to any one of claims 10-16, characterized in that the length of the first period or the length of the second period is related to the moving speed of the dimmer sheet.
  18. The photographing device according to any one of claims 10-17, characterized in that the moving speed of the dimmer sheet is controlled by the driving module.
  19. The photographing device according to any one of claims 10-18, characterized in that the end moment of the first period is the start moment of the third period, and the end moment of the second period is the start moment of the fourth period.
  20. The photographing device according to any one of claims 10-19, characterized in that:
    the end moment of the first period is T0, and the end moment of the second period is T1;
    the driving module is further used to drive the dimmer sheet to move, so that the light focused on the dimmer sheet is incident on the infrared band-pass filter in a fifth period, so that the imaging module obtains a second infrared image, the end moment of the fifth period being T2, where T0 < T1 < T2; the first infrared image, the visible light image, and the second infrared image are obtained by the photographing device photographing the same scene;
    the photographing device further comprises a processor, the processor being configured to execute the method according to any one of claims 1-9.
  21. The photographing device according to any one of claims 10-20, characterized in that the dimmer sheet comprises two of the infrared band-pass filters, one of the infrared cut-off filters, and at least two of the shading sheets.
  22. An image processing apparatus, characterized in that the apparatus comprises:
    an acquisition unit, configured to acquire a first infrared image, a visible light image, and a second infrared image, wherein the first infrared image, the visible light image, and the second infrared image are obtained by a photographing device photographing the same scene at time T0, time T1, and time T2, respectively, with T0 < T1 < T2;
    a computing unit, configured to compute a target infrared image based on the first infrared image and the second infrared image, the target infrared image comprising a third infrared image and/or a fourth infrared image, wherein the third infrared image is an image obtained by converting the first infrared image from time T0 to time T1, and the fourth infrared image is an image obtained by converting the second infrared image from time T2 to time T1;
    a fusion unit, configured to fuse the target infrared image and the visible light image to obtain a fused image.
  23. The apparatus according to claim 22, characterized in that the computing unit is specifically configured to:
    compute the target infrared image by an optical flow method according to the relationship among T0, T1, and T2 and according to the first infrared image and the second infrared image.
  24. The apparatus according to claim 23, characterized in that the target infrared image comprises the third infrared image, and the computing unit is specifically configured to:
    compute an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image;
    compute an optical flow F3 based on the relationship among T0, T1, and T2, the optical flow F1, and the optical flow F2, the optical flow F3 being the optical flow from the third infrared image to the first infrared image;
    perform optical flow backward mapping based on the optical flow F3 and the first infrared image to obtain the third infrared image.
  25. The apparatus according to any one of claims 22-24, characterized in that the fusion unit is specifically configured to:
    extract color features of the visible light image;
    extract texture features of the target infrared image;
    obtain the fused image based on the color features and the texture features.
  26. The apparatus according to claim 25, characterized in that the fusion unit is specifically configured to:
    fuse the target infrared image and the visible light image through a first neural network to obtain the fused image, the first neural network comprising a color extraction neural network for extracting the color features of the visible light image and a texture extraction neural network for extracting the texture features of the target infrared image.
  27. The apparatus according to claim 26, characterized in that the resolution of the color extraction neural network is not higher than a preset first resolution threshold, and the number of layers of the color extraction neural network is not lower than a preset first network depth threshold.
  28. The apparatus according to claim 26 or 27, characterized in that the resolution of the texture extraction neural network is higher than a preset second resolution threshold, and the number of layers of the texture extraction neural network is lower than a preset second network depth threshold.
  29. The apparatus according to any one of claims 26-28, characterized in that the operations performed by the apparatus are implemented by an image processing model comprising the first neural network and a second neural network;
    the second neural network is used to obtain an optical flow F1 from the first infrared image to the second infrared image and an optical flow F2 from the second infrared image to the first infrared image, the optical flow F1 and the optical flow F2 being used to compute the target infrared image;
    the first neural network and the second neural network included in the image processing model are obtained through end-to-end training.
  30. The apparatus according to claim 29, characterized in that the training images of the image processing model are collected in an environment whose illuminance is lower than a preset illuminance threshold.
  31. An image processing apparatus, characterized in that it comprises a processor and a memory, wherein the memory is used to store a computer program and the processor is used to invoke the computer program, so that the apparatus executes the method according to any one of claims 1-9.
  32. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.
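As an illustration of the dual-branch fusion recited in claims 4-8 above (and mirrored in claims 25-29), the following PyTorch sketch pairs a low-resolution, deep color branch with a high-resolution, shallow texture branch. Only the relative resolution and depth constraints come from the claims; the architecture, channel counts, layer counts, and class names are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ColorBranch(nn.Module):
    """Color extraction network: reduced resolution, more layers
    (claim 6: resolution <= threshold, depth >= threshold)."""
    def __init__(self, depth=8, ch=32):
        super().__init__()
        self.down = nn.AvgPool2d(4)  # work at 1/4 resolution (H, W divisible by 4)
        layers = [nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)
        self.up = nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False)

    def forward(self, rgb):                 # rgb: N x 3 x H x W visible image
        return self.up(self.body(self.down(rgb)))

class TextureBranch(nn.Module):
    """Texture extraction network: full resolution, few layers
    (claim 7: resolution > threshold, depth < threshold)."""
    def __init__(self, ch=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, ir):                  # ir: N x 1 x H x W target infrared image
        return self.body(ir)

class FusionNet(nn.Module):
    """A possible 'first neural network': fuses color and texture features."""
    def __init__(self, ch=32):
        super().__init__()
        self.color = ColorBranch(ch=ch)
        self.texture = TextureBranch(ch=ch)
        self.head = nn.Conv2d(2 * ch, 3, 3, padding=1)

    def forward(self, rgb, ir):
        feats = torch.cat([self.color(rgb), self.texture(ir)], dim=1)
        return self.head(feats)             # fused RGB image
```

Under claim 8, a flow-estimation network (the second neural network) would precede this module and the whole image processing model would be trained end to end, e.g. against ground-truth fused images captured in low illuminance (claim 9).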
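For the filter-wheel timing of claims 10-21 above, the period lengths follow directly from segment geometry and rotation speed. A small arithmetic sketch follows, with all segment angles and the rotation speed chosen purely for illustration; none of these values appear in the claims.

```python
# Hypothetical four-segment wheel: IR band-pass, shade, IR-cut, shade.
segments = [("ir_bandpass", 90.0), ("shade", 90.0), ("ir_cut", 90.0), ("shade", 90.0)]
omega = 180.0  # rotation speed in degrees per second (assumed)

t = 0.0
for name, angle in segments:
    duration = angle / omega  # time the aperture spends behind this segment
    print(f"{name:12s} from {t:.3f}s to {t + duration:.3f}s")
    t += duration
# The infrared exposure (first period) ends exactly when the shade moves in
# (third period begins), matching claim 19; its length scales with the segment
# size (claim 16) and inversely with the wheel speed (claims 17-18).
```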

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202111650872.9 2021-12-30
CN202111650872 2021-12-30
CN202210207925.8A CN116433729A (en) 2021-12-30 2022-03-03 Image processing method and related device
CN202210207925.8 2022-03-03

Publications (1)

Publication Number Publication Date
WO2023125087A1

Family

ID=86997700

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139807 WO2023125087A1 (en) 2021-12-30 2022-12-17 Image processing method and related apparatus

Country Status (1)

Country Link
WO (1) WO2023125087A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170237887A1 (en) * 2014-11-13 2017-08-17 Panasonic Intellectual Property Management Co. Ltd. Imaging device and imaging method
CN108449555A (en) * 2018-05-04 2018-08-24 北京化工大学 Image interfusion method and system
CN110691176A (en) * 2018-07-04 2020-01-14 南昌欧菲生物识别技术有限公司 Filter assembly, camera module, image capturing device and electronic device
CN110798623A (en) * 2019-10-15 2020-02-14 华为技术有限公司 Monocular camera, image processing system, and image processing method
CN113114926A (en) * 2021-03-10 2021-07-13 杭州海康威视数字技术股份有限公司 Image processing method and device and camera
CN113177905A (en) * 2021-05-21 2021-07-27 浙江大华技术股份有限公司 Image acquisition method, device, equipment and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22914341

Country of ref document: EP

Kind code of ref document: A1