CN115082968A - Behavior identification method based on infrared light and visible light fusion and terminal equipment - Google Patents


Info

Publication number: CN115082968A
Application number: CN202211013357.4A
Granted publication: CN115082968B
Authority: CN (China)
Prior art keywords: image, pixel, visible light, mix, vis
Other languages: Chinese (zh)
Inventor: 李月忠
Applicant/Assignee: Tianjin Ruijin Intelligent Technology Co., Ltd.
Legal status: Granted; Active

Classifications

    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06T5/70 Denoising; Smoothing
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06V10/30 Noise filtering
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06T2207/10048 Infrared image
    • G06T2207/20221 Image fusion; Image merging

Abstract

The application is applicable to the technical field of behavior recognition, and provides a behavior recognition method based on infrared light and visible light fusion and a terminal device. The method comprises: acquiring at least one registered image group to be identified, wherein the registered image group comprises a registered visible light image and infrared light image; for each registered image group, denoising the visible light image in the registered image group to obtain a denoised image of the visible light image; fusing the infrared light image in the registered image group with the denoised image of the visible light image to obtain a fused image of the registered image group under a preset constraint condition, under which the pixel difference between the fused image and the infrared light image is minimized and the gradient difference between the fused image and the denoised image is minimized; and determining the behavior class of the target object in the registered image groups based on the fused images of all the registered image groups under the preset constraint condition. The scheme can improve the accuracy of behavior recognition.

Description

Behavior identification method based on infrared light and visible light fusion and terminal equipment
Technical Field
The application belongs to the technical field of behavior recognition, and particularly relates to a behavior recognition method based on infrared light and visible light fusion and a terminal device.
Background
Animal behaviors are the actions an animal takes to adapt to its environment under external stimuli, and these behaviors may influence the animal's own reproduction or the behaviors of other animals. Studying animal behaviors therefore helps in understanding the behavioral characteristics or needs of animals, and can assist animal caretakers in managing them. The basis for studying animal behavior is accurate identification of the animal's behavior.
In a traditional behavior identification method, an infrared light image containing the animal's thermal radiation information is collected by a non-contact infrared camera, or a visible light image containing the object's appearance information is collected by a visible light camera device, and the animal's behavior is then identified based on the infrared light image alone or on the visible light image alone. The behavior identification accuracy of such methods is low.
Disclosure of Invention
In view of this, the embodiment of the present application provides a behavior identification method and a terminal device based on fusion of infrared light and visible light, so as to solve the technical problem that the behavior identification accuracy of the existing behavior identification method is low.
In a first aspect, an embodiment of the present application provides a behavior identification method based on fusion of infrared light and visible light, including:
acquiring at least one registered image group to be identified; the registered image group comprises a visible light image and an infrared light image which are registered;
for each registered image group, carrying out denoising treatment on a visible light image in the registered image group to obtain a denoising image of the visible light image;
fusing the infrared light image and the de-noised image to obtain a fused image of the registered image group under a preset constraint condition; under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum;
and determining the behavior class of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
In an optional implementation manner of the first aspect, the denoising the visible light image in the registered image group to obtain a denoised image of the visible light image includes:
determining a horizontal second-order gradient and a vertical second-order gradient of each pixel in the visible light image by adopting a first gradient function based on the gray value of each pixel in the visible light image; the first gradient function is:
VIS_h(i) = [1/2(vis_i - vis_r(i)) + 1/2(vis_i - vis_l(i))]^2
VIS_v(i) = [1/2(vis_i - vis_b(i)) + 1/2(vis_i - vis_o(i))]^2
where VIS_h(i) is the horizontal second-order gradient of the i-th pixel in the visible light image, VIS_v(i) is the vertical second-order gradient of the i-th pixel, vis_i is the gray value of the i-th pixel, vis_r(i) is the gray value of the pixel located to the right of and adjacent to the i-th pixel, vis_l(i) is the gray value of the pixel located to the left of and adjacent to the i-th pixel, vis_b(i) is the gray value of the pixel located below and adjacent to the i-th pixel, and vis_o(i) is the gray value of the pixel located above and adjacent to the i-th pixel;
for each pixel in the visible light image, performing quadratic operation on the sum of the horizontal second-order gradient and the vertical second-order gradient of the pixel to obtain a comprehensive gradient of the pixel;
determining the sum of the comprehensive gradients of all pixels in the visible light image as a denoising adjustment factor;
determining the column vector of the de-noised image by adopting a preset de-noising function based on the column vector of the visible light image, the de-noising adjustment factor and a preset regularization weight; the preset denoising function is as follows:
DeN = Vis + λ·DeN_vis
where DeN is the column vector of the denoised image, Vis is the column vector of the visible light image, λ is the preset regularization weight, and DeN_vis is the denoising adjustment factor.
In an optional implementation manner of the first aspect, the fusing the infrared light image and the denoised image to obtain a fused image of the registered image group under a preset constraint condition includes:
determining a column vector to be adjusted of the fused image by adopting a preset constraint function based on the column vector of the infrared light image and the column vector of the de-noised image; the preset constraint function is as follows:
MIX* = argmin( ||MIX* - InF||_2 + λ·||∇MIX* - ∇DeN*||_1 )
where MIX* is the column vector to be adjusted of the fused image, InF is the column vector of the infrared light image, ∇MIX* is the gradient vector of the fused image, DeN* is the column vector of the denoised image, ∇DeN* is the gradient vector of the denoised image, ||MIX* - InF||_2 denotes the L2 norm of MIX* - InF, ||∇MIX* - ∇DeN*||_1 denotes the L1 norm of ∇MIX* - ∇DeN*, and λ is the preset regularization weight;
the value of each element in the gradient vector of the fused image is determined by the following formula:
MIX*_1 = [1/2(MIX*_i - MIX*_r(i)) + 1/2(MIX*_i - MIX*_l(i))]^2
MIX*_2 = [1/2(MIX*_i - MIX*_b(i)) + 1/2(MIX*_i - MIX*_o(i))]^2
∇MIX*(i) = (MIX*_1 + MIX*_2)^2
where MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, MIX*_r(i) is the gray value of the pixel located to the right of and adjacent to the pixel corresponding to the i-th element, MIX*_l(i) is the gray value of the pixel located to the left of and adjacent to the pixel corresponding to the i-th element, MIX*_b(i) is the gray value of the pixel located below and adjacent to the pixel corresponding to the i-th element, and MIX*_o(i) is the gray value of the pixel located above and adjacent to the pixel corresponding to the i-th element;
and carrying out standardization processing on the column vector to be adjusted of the fusion image to obtain the column vector of the fusion image.
In an optional implementation manner of the first aspect, the normalizing the to-be-adjusted column vector of the fused image to obtain the column vector of the fused image includes:
standardizing the column vector to be adjusted of the fused image based on a preset standardization formula to obtain the column vector of the fused image; the preset standardized formula is as follows:
MIX_i = 255 × (MIX*_i - min(MIX*)) / (max(MIX*) - min(MIX*))
where MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, and MIX_i is the gray value of the pixel corresponding to the i-th element in the column vector of the fused image.
In an optional implementation manner of the first aspect, the determining, based on fused images of all the registered image groups under a preset constraint condition, a behavior class of a target object in the registered image groups includes:
importing all the fusion images into a context attention network to obtain dynamic behavior data of the target object in the registered image group; the dynamic behavior data is described by a position change vector between the target object and the environmental object in every two adjacent fusion images;
and importing the dynamic behavior data into a behavior recognition model to obtain the action type of the target object.
In a second aspect, an embodiment of the present application provides a terminal device, including:
a first acquisition unit for acquiring at least one registered image group to be identified; the registered image group comprises a visible light image and an infrared light image which are registered;
the image denoising unit is used for denoising the visible light images in the registered image groups according to each registered image group to obtain denoised images of the visible light images;
the image fusion unit is used for fusing the infrared light image and the de-noising image to obtain a fusion image of the registered image group under a preset constraint condition; under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum;
and the behavior identification unit is used for determining the behavior category of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
In a third aspect, an embodiment of the present application provides a terminal device, where the terminal device includes a processor, a memory, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the behavior recognition method according to the first aspect or any one of the options of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the behavior recognition method according to the first aspect or any one of the alternatives of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product, which, when run on a terminal device, causes the terminal device to execute the method for behavior recognition according to the first aspect or any one of the alternatives of the first aspect.
In a sixth aspect, an embodiment of the present application provides a behavior recognition system, which includes an image pickup device and the terminal device according to the second or third aspect, wherein the image pickup device is connected to the terminal device.
The behavior identification method based on infrared light and visible light fusion, the terminal device, the computer readable storage medium and the computer program product provided by the embodiment of the application have the following beneficial effects:
according to the behavior identification method based on the fusion of the infrared light and the visible light, the denoising image of the visible light image in each registered image group is obtained by denoising the visible light image in each registered image group; fusing the infrared light image in each registered image group with the de-noised image to obtain a fused image of each registered image group under a preset constraint condition; and finally, determining the behavior class of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition. Under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum, so that the fused image and the infrared light image have similar pixel intensity, and the fused image and the visible light image have similar gradient (namely edge), so that the fused image can simultaneously keep the thermal radiation information of an object in the infrared light image and the appearance information of the object in the visible light image, namely the fused image can be regarded as the infrared light image with detailed scene description, and therefore, the target object is subjected to behavior recognition based on the fused image, and the accuracy of the behavior recognition can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic structural diagram of a behavior recognition system according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a behavior recognition method based on fusion of infrared light and visible light according to an embodiment of the present application;
fig. 3 is a flowchart of specific implementation of S22 in a behavior identification method based on fusion of infrared light and visible light according to an embodiment of the present application;
fig. 4 is a flowchart of specific implementation of S23 in a behavior identification method based on fusion of infrared light and visible light according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal device according to another embodiment of the present application.
Detailed Description
It is to be understood that the terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. In the description of the embodiments of the present application, "a plurality" means two or more, and "at least one" and "one or more" mean one, two, or more than two, unless otherwise specified. The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
In a traditional behavior identification method, an infrared camera collects an infrared light image containing the thermal radiation information of a target object, or a visible light camera collects a visible light image containing the appearance information of the target object, and the behavior of the target object is then identified based on the infrared light image or the visible light image alone. However, the visible light image is easily affected by illumination changes and therefore loses some texture information, while the infrared light image is less affected by illumination changes but lacks detail information, so identifying the behavior of the target object based on either image alone reduces the identification accuracy.
In order to solve the technical problem, an embodiment of the present application first provides a behavior recognition system. Please refer to fig. 1, which is a schematic structural diagram of a behavior recognition system according to an embodiment of the present application. As shown in fig. 1, the behavior recognition system may include an infrared camera 11, a visible light camera 12, and a terminal device 13. Wherein, the infrared camera 11 and the visible light camera 12 are both connected with the terminal equipment 13.
The infrared camera 11 is used for collecting infrared light images, and the visible light camera 12 is used for collecting visible light images. In the embodiment of the present application, the main optical axis of the infrared camera 11 coincides with the main optical axis of the visible light camera 12, and the field of view of the infrared camera 11 is the same as that of the visible light camera 12; that is, an infrared light image acquired by the infrared camera 11 at a certain time corresponds to the same scene as the visible light image acquired by the visible light camera 12 at that time.
As shown in fig. 1, when a target object passes through the field of view of the infrared camera 11 and the visible light camera 12, the infrared camera 11 may collect a plurality of temporally consecutive infrared light images 111 containing the target object, and the visible light camera 12 may collect a plurality of temporally consecutive visible light images 121 containing the target object. The infrared light image includes heat radiation information of the target object, and the visible light image includes appearance information (i.e., texture information) of the target object.
Illustratively, the target object may be a living body, e.g., a human or an animal, etc.
The terminal device 13 is configured to acquire the infrared light image sequence acquired by the infrared camera 11 from the infrared camera 11, acquire the visible light image sequence acquired by the visible light camera 12 from the visible light camera 12, and register the infrared light image sequence and the visible light image sequence. The registering of the infrared light image sequence and the visible light image sequence specifically means that the infrared light image and the visible light image having the same scene in the infrared light image sequence and the visible light image sequence are paired, that is, the infrared light image and the visible light image respectively acquired by the infrared camera 11 and the visible light camera 12 at the same time are paired. Based on the method, after the infrared light image sequence and the visible light image sequence are registered, a plurality of registered image groups can be obtained, each registered image group comprises a visible light image and an infrared light image, and the visible light image and the infrared light image in each registered image group have the same scene.
For example, as shown in fig. 1, if the infrared camera 11 captures an infrared light image 1111 at a first time and the visible light camera 12 captures a visible light image 1211 at the first time, the infrared light image 1111 and the visible light image 1211 may form a registered image group.
It will be appreciated that since the infrared light images in the infrared light image sequence are consecutive in time and the visible light images in the visible light image sequence are consecutive in time, the plurality of registered image sets obtained by the terminal device are also consecutive in time.
It can be understood that, after the terminal device 13 obtains a plurality of registered image groups, both the infrared light image and the visible light image in each registered image group may be preprocessed into grayscale images, and the preprocessed registered image group is used as the registered image group to be identified. Because the visible light image and the infrared light image in the registered image group to be identified are both gray level images, the range of the gray level value of each pixel in the visible light image in the registered image group to be identified is 0-255, and similarly, the range of the gray level value of each pixel in the infrared light image in the registered image group to be identified is 0-255.
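To make this preprocessing concrete, the following is a minimal sketch of pairing frames by timestamp and converting both images to 8-bit grayscale. The function name, the (timestamp, image) input format, and the use of OpenCV are assumptions made for illustration and are not specified by the patent.

```python
import cv2  # assumed dependency, for illustration only
import numpy as np

def build_registered_groups(ir_frames, vis_frames):
    """Pair infrared and visible frames captured at the same timestamp and
    convert both to 8-bit grayscale images with gray values in 0-255.

    ir_frames / vis_frames: lists of (timestamp, image) tuples (assumed format).
    """
    vis_by_time = {t: img for t, img in vis_frames}
    groups = []
    for t, ir_img in ir_frames:
        vis_img = vis_by_time.get(t)
        if vis_img is None:
            continue  # no visible-light frame acquired at the same time
        ir_gray = cv2.cvtColor(ir_img, cv2.COLOR_BGR2GRAY) if ir_img.ndim == 3 else ir_img
        vis_gray = cv2.cvtColor(vis_img, cv2.COLOR_BGR2GRAY) if vis_img.ndim == 3 else vis_img
        # one registered image group: (visible-light image, infrared-light image)
        groups.append((vis_gray.astype(np.uint8), ir_gray.astype(np.uint8)))
    return groups
```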
Optionally, the terminal device 13 may store the registered image group to be recognized in its local memory for behavior recognition of a subsequent target object, that is, the terminal device 13 may also be configured to perform each step in a subsequent method embodiment, please refer to the related description in the method embodiment, which will not be detailed here in detail.
In a specific application, the terminal device 13 may include a smart phone, a tablet computer, a notebook computer, or a desktop computer, and the specific type of the terminal device 13 is not particularly limited in this embodiment.
In a specific application, the connection modes between the infrared camera 11 and the visible light camera 12 and the terminal equipment 13 can be wired connection or wireless connection.
The wired connection may include a wired connection based on a Universal Serial Bus (USB), a High Definition Multimedia Interface (HDMI), or the like. The wireless connection may include a wireless connection based on bluetooth, wireless fidelity (WIFI), or mobile communication technology, among others. By way of example, the mobile communication technology may include, but is not limited to, a fifth generation mobile communication technology (5G for short), a fourth generation mobile communication technology (4G for short), and the like.
Based on the behavior recognition system provided in the foregoing embodiment, an embodiment of the present application further provides a behavior recognition method based on fusion of infrared light and visible light, where an execution subject of the behavior recognition method is the terminal device 13 in the embodiment corresponding to fig. 1. In a specific application, a target script file may be configured to the terminal device 13, and the target script file describes the behavior recognition method based on the fusion of infrared light and visible light provided in the embodiment of the present application, so that the terminal device 13 executes the target script file when the behavior recognition of the target object is required, and further executes each step in the behavior recognition method based on the fusion of infrared light and visible light provided in the embodiment of the present application.
Please refer to fig. 2, which is a schematic flowchart of a behavior recognition method based on infrared light and visible light fusion according to an embodiment of the present application. As shown in FIG. 2, the method may include S21-S24, which are detailed as follows:
s21: at least one registered image set to be identified is acquired.
In an embodiment of the present application, the terminal device may obtain at least one registered image group to be identified from its local memory. Each registered image set includes a visible light image and an infrared light image that have been registered. Illustratively, the resolution of the visible light image and the resolution of the infrared light image may each be m × n, i.e., the visible light image and the infrared light image may each include m rows × n columns of pixels.
S22: and for each registered image group, carrying out denoising treatment on the visible light image in the registered image group to obtain a denoising image of the visible light image.
In a possible implementation manner, when denoising the visible light image in each registered image group, in order to avoid losing gradient information (i.e., texture information) of the visible light image, the terminal device may obtain a denoised image of the visible light image by using S221 to S224 shown in fig. 3.
S221: and determining the horizontal second-order gradient and the vertical second-order gradient of each pixel in the visible light image by adopting a first gradient function based on the gray value of each pixel in the visible light image.
In particular, the first gradient function may be:
VIS_h(i) = [1/2(vis_i - vis_r(i)) + 1/2(vis_i - vis_l(i))]^2
VIS_v(i) = [1/2(vis_i - vis_b(i)) + 1/2(vis_i - vis_o(i))]^2
where VIS_h(i) is the horizontal second-order gradient of the i-th pixel in the visible light image, VIS_v(i) is the vertical second-order gradient of the i-th pixel, vis_i is the gray value of the i-th pixel, vis_r(i) is the gray value of the pixel located to the right of and adjacent to the i-th pixel, vis_l(i) is the gray value of the pixel located to the left of and adjacent to the i-th pixel, vis_b(i) is the gray value of the pixel located below and adjacent to the i-th pixel, and vis_o(i) is the gray value of the pixel located above and adjacent to the i-th pixel.
It should be noted that, for each pixel in the 1st row of the visible light image, vis_o(i) = vis_i; for each pixel in the last row of the visible light image, vis_b(i) = vis_i; for each pixel in the 1st column of the visible light image, vis_l(i) = vis_i; and for each pixel in the last column of the visible light image, vis_r(i) = vis_i.
It is understood that the order of the pixels in the visible light image is obtained by numbering the pixels from left to right and from top to bottom. For example, if the visible light image includes 3 × 3 pixels, the 3 pixels in the 1st row are, from left to right, the 1st, 2nd and 3rd pixels of the visible light image, the 3 pixels in the 2nd row are, from left to right, the 4th, 5th and 6th pixels, and the 3 pixels in the 3rd row are, from left to right, the 7th, 8th and 9th pixels.
For example, taking the 5th pixel (i.e., the pixel in the 2nd row and 2nd column) of the visible light image as an example, the pixel located to the right of and adjacent to the 5th pixel is the 6th pixel (i.e., the pixel in the 2nd row and 3rd column), the pixel located to the left of and adjacent to the 5th pixel is the 4th pixel (i.e., the pixel in the 2nd row and 1st column), the pixel located below and adjacent to the 5th pixel is the 8th pixel (i.e., the pixel in the 3rd row and 2nd column), and the pixel located above and adjacent to the 5th pixel is the 2nd pixel (i.e., the pixel in the 1st row and 2nd column).
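The neighbour definitions and boundary rules above can be implemented with edge-replicated shifts, since replacing an out-of-image neighbour by the centre pixel makes the corresponding difference term zero. The sketch below is one possible NumPy implementation of the first gradient function; it is an illustration, not the patent's reference code.

```python
import numpy as np

def second_order_gradients(vis):
    """Horizontal (VIS_h) and vertical (VIS_v) second-order gradients of a
    grayscale image, following the first gradient function described above."""
    v = vis.astype(np.float64)
    pad = np.pad(v, 1, mode='edge')  # edge replication: missing neighbours equal the centre pixel
    right = pad[1:-1, 2:]    # vis_r(i)
    left = pad[1:-1, :-2]    # vis_l(i)
    below = pad[2:, 1:-1]    # vis_b(i)
    above = pad[:-2, 1:-1]   # vis_o(i)
    vis_h = (0.5 * (v - right) + 0.5 * (v - left)) ** 2   # VIS_h(i)
    vis_v = (0.5 * (v - below) + 0.5 * (v - above)) ** 2  # VIS_v(i)
    return vis_h, vis_v
```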
S222: and performing quadratic operation on the sum of the horizontal second-order gradient and the vertical second-order gradient of each pixel in the visible light image to obtain the comprehensive gradient of the pixel.
In this implementation manner, for each pixel in the visible light image, the terminal device may calculate a sum of a horizontal second-order gradient and a vertical second-order gradient of the pixel, and perform quadratic operation on the sum of the horizontal second-order gradient and the vertical second-order gradient of the pixel to obtain a comprehensive gradient of the pixel.
S223: and determining the sum of the comprehensive gradients of all pixels in the visible light image as a denoising adjustment factor.
After the terminal device obtains the comprehensive gradient of each pixel in the visible light image, the sum of the comprehensive gradients of all the pixels in the visible light image can be calculated, and the sum of the comprehensive gradients of all the pixels in the visible light image is determined as a denoising adjustment factor.
The denoising adjustment factor is used for adjusting the gray value of each pixel in the visible light image.
In this implementation, since VIS_h(i) reflects the second-order gradient of the i-th pixel in the horizontal direction and VIS_v(i) reflects the second-order gradient of the i-th pixel in the vertical direction, the visible light image retains more gradient information than it would with a first-order gradient. Therefore, adjusting the gray value of each pixel in the visible light image with the denoising adjustment factor obtained in S221-S223 denoises the visible light image while preserving its gradient information well.
S224: and determining the column vector of the de-noised image by adopting a preset de-noising function based on the column vector of the visible light image, the de-noising adjustment factor and a preset regularization weight.
It is understood that the column vector of the visible light image may be obtained by arranging the respective pixels in the visible light image into a column in order from small to large.
The value of the preset regularization weight may be set according to actual requirements, and is not particularly limited herein.
Specifically, the preset denoising function may be:
DeN = Vis + λ·DeN_vis
where DeN is the column vector of the denoised image, Vis is the column vector of the visible light image, λ is the preset regularization weight, and DeN_vis is the denoising adjustment factor.
It should be noted that DeN ∈ R^(mn×1) and Vis ∈ R^(mn×1).
it can be understood that, after the terminal device obtains the column vector of the denoised image, the column vector of the denoised image can be restored to the denoised image with the resolution of m × n.
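Putting S221-S224 together, a minimal sketch of the denoising step could look as follows; it reuses second_order_gradients from the sketch above, and the value of λ is illustrative only. Note that, as written in the preset denoising function, the denoising adjustment factor is a single scalar added, scaled by λ, to every element of the column vector.

```python
def denoise_visible(vis, lam=0.01):
    """Denoised image via the preset denoising function DeN = Vis + λ·DeN_vis."""
    m, n = vis.shape
    vis_h, vis_v = second_order_gradients(vis)
    composite = (vis_h + vis_v) ** 2           # comprehensive gradient of each pixel (S222)
    den_vis = composite.sum()                  # denoising adjustment factor (S223)
    col = vis.astype(float).reshape(m * n, 1)  # column vector, pixels ordered row by row (S224)
    den = col + lam * den_vis                  # preset denoising function
    return den.reshape(m, n)                   # restore to an m×n denoised image
```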
In another possible implementation manner, the terminal device may further perform denoising processing on the visible light images in each registered image group by using a median filter or a mean filter, and the like, so as to obtain a denoised image of the visible light images in each registered image group.
S23: and fusing the infrared light image and the de-noised image to obtain a fused image of the registered image group under a preset constraint condition.
It should be noted that, under the preset constraint condition, the pixel difference between the fused image and the infrared light image is the minimum, and the gradient difference between the fused image and the denoised image is the minimum. That is, by means of the constraint of the preset constraint condition, the gradient information of the visible light image can be transferred to the corresponding position of the infrared light image, so that the fused image and the infrared light image have similar pixel intensity, and the fused image and the visible light image have similar gradient, and thus, the fused image can simultaneously retain the heat radiation information of the object in the infrared light image and the appearance information of the object in the visible light image.
In one possible implementation, the preset constraint condition may be described by a preset constraint function. Based on this, S23 can be specifically realized by S231 to S232 shown in fig. 4, which are detailed as follows:
s231: and determining the column vector to be adjusted of the fused image by adopting a preset constraint function based on the column vector of the infrared light image and the column vector of the de-noised image.
It is understood that the column vector of the infrared light image may be obtained by arranging the respective pixels in the infrared light image in order of small to large as a column.
Specifically, the preset constraint function may be:
MIX* = argmin( ||MIX* - InF||_2 + λ·||∇MIX* - ∇DeN*||_1 )
where MIX* is the column vector to be adjusted of the fused image, InF is the column vector of the infrared light image, ∇MIX* is the gradient vector of the fused image, DeN* is the column vector of the denoised image, ∇DeN* is the gradient vector of the denoised image, ||MIX* - InF||_2 denotes the L2 norm of MIX* - InF, ||∇MIX* - ∇DeN*||_1 denotes the L1 norm of ∇MIX* - ∇DeN*, and λ is the preset regularization weight.
It should be noted that the number of elements included in the gradient vector of the fused image is the same as the number of pixels included in the visible light image or the infrared light image, and each element in the gradient vector of the fused image may correspond to one pixel in the visible light image or the infrared light image.
Alternatively, the value of each element in the gradient vector of the fused image may be determined by the following formula:
MIX*_1 = [1/2(MIX*_i - MIX*_r(i)) + 1/2(MIX*_i - MIX*_l(i))]^2
MIX*_2 = [1/2(MIX*_i - MIX*_b(i)) + 1/2(MIX*_i - MIX*_o(i))]^2
∇MIX*(i) = (MIX*_1 + MIX*_2)^2
where MIX*_i is the value of the i-th element in the column vector to be adjusted of the fused image, i.e., the gray value of the pixel corresponding to the i-th element; MIX*_r(i) is the gray value of the pixel located to the right of and adjacent to the pixel corresponding to the i-th element; MIX*_l(i) is the gray value of the pixel located to the left of and adjacent to the pixel corresponding to the i-th element; MIX*_b(i) is the gray value of the pixel located below and adjacent to the pixel corresponding to the i-th element; and MIX*_o(i) is the gray value of the pixel located above and adjacent to the pixel corresponding to the i-th element.
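The patent does not state how the minimisation implied by the preset constraint function is carried out. As a hedged illustration, the sketch below only evaluates the objective for a candidate fused image, assuming the gradient of each image is built from the same second-order differences as above; any generic solver (for example scipy.optimize.minimize over the flattened candidate, starting from the infrared image) could then search for the MIX* that makes this value smallest.

```python
import numpy as np

def fusion_objective(mix, inf, den, lam=0.5):
    """||MIX - InF||_2 + λ·||∇MIX - ∇DeN||_1 for a candidate fused image.

    mix, inf, den: 2-D arrays (candidate fused image, infrared image, denoised image).
    The form of the gradient elements below mirrors the composite-gradient
    construction used for the visible-light image and is an assumption.
    """
    pixel_term = np.linalg.norm((mix - inf).ravel(), ord=2)  # L2 pixel difference
    mh, mv = second_order_gradients(mix)
    dh, dv = second_order_gradients(den)
    grad_mix = (mh + mv) ** 2  # assumed gradient-vector elements of MIX
    grad_den = (dh + dv) ** 2  # assumed gradient-vector elements of DeN
    gradient_term = np.linalg.norm((grad_mix - grad_den).ravel(), ord=1)  # L1 gradient difference
    return pixel_term + lam * gradient_term
```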
S232: and carrying out standardization processing on the column vector to be adjusted of the fusion image to obtain the column vector of the fusion image.
Because the values of each element in the column vector to be adjusted of the fused image obtained by the preset constraint function are not necessarily between 0 and 255, the vector to be adjusted of the fused image needs to be standardized, so that the values of all the elements in the finally obtained column vector of the fused image are within 0 to 255.
In a possible implementation manner, the terminal device may perform standardization processing on the to-be-adjusted column vector of the fused image based on a preset standardization formula to obtain the column vector of the fused image.
Specifically, the preset standardized formula is as follows:
MIX_i = 255 × (MIX*_i - min(MIX*)) / (max(MIX*) - min(MIX*))
where MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, and MIX_i is the gray value of the pixel corresponding to the i-th element in the column vector of the fused image.
It is understood that, after the terminal device obtains the column vector of the fused image, the column vector of the fused image may be restored to the fused image with the resolution of m × n.
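A common way to carry out this standardisation is min-max rescaling into the 0-255 gray range, sketched below; this particular rescaling is an assumption, since the patent only requires that all element values of the resulting column vector lie within 0-255.

```python
import numpy as np

def normalize_to_gray(mix_star, m, n):
    """Map the to-be-adjusted column vector into 0-255 and restore the m×n fused image."""
    x = np.asarray(mix_star, dtype=float).ravel()
    span = x.max() - x.min()
    if span == 0:
        scaled = np.zeros_like(x)              # degenerate case: constant image
    else:
        scaled = 255.0 * (x - x.min()) / span  # assumed min-max rescaling
    return scaled.round().astype(np.uint8).reshape(m, n)
```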
By performing the above steps of S21-S23, for each registered image group, a fused image corresponding to the registered image group can be obtained, that is, a fused image of each registered image group under a preset constraint condition can be obtained.
S24: and determining the behavior class of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
Since the plurality of registered image sets are temporally continuous, the fused images corresponding to each of the plurality of registered image sets are also temporally continuous.
Optionally, the terminal device may be configured with a contextual attention network and a behavior recognition model.
Wherein the contextual attention network is configured to determine a change in relative position between the target object and the environmental object in the registered set of images, thereby obtaining dynamic behavior data of the target object. That is, the dynamic behavior data of the target object may be described by the position change vector between the target object and the environmental object in each of the two adjacent fused images.
The behavior recognition model is used for determining the action type of the target object based on the dynamic behavior data of the target object. In a specific application, the behavior recognition model may be obtained by training a classification model based on a preset sample set by using a deep learning method. For example, the preset sample set may include a plurality of sample data, each of which may include dynamic behavior data and an action type of one sample object. When the classification model is trained, the dynamic behavior data of the sample object in each piece of sample data can be used as the input of the classification model, and the action type of the sample object in each piece of sample data can be used as the output of the classification model, so that the classification model learns the corresponding relation between the dynamic behavior data and the action type in the training process. The terminal device may determine the classification model trained by using the sample data as a behavior recognition model.
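As a rough, hedged illustration of training such a behavior recognition model on the preset sample set, the sketch below fits a small neural-network classifier to (dynamic behavior data, action type) pairs; the choice of scikit-learn's MLPClassifier and the fixed-length feature format are assumptions, since the patent only requires a classification model trained with a deep learning method.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier  # assumed library, illustration only

def train_behavior_model(sample_set):
    """sample_set: iterable of (dynamic_behavior_vector, action_type) pairs, where each
    dynamic behavior vector is assumed to be a fixed-length flattening of the position-change
    vectors between the target object and the environmental object across adjacent fused images."""
    X = np.array([features for features, _ in sample_set])
    y = np.array([action for _, action in sample_set])
    model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
    model.fit(X, y)
    return model
```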
Based on this, S24 may specifically include the following steps:
step a, importing all the fusion images into a context attention network to obtain dynamic behavior data of the target object in the registered image group.
And b, importing the dynamic behavior data into a behavior recognition model to obtain the action type of the target object.
In the context attention network, the terminal device performs object recognition, key point recognition, human body recognition and the like on each fused image. Object recognition identifies the environmental object in the fused image, human body recognition identifies the target object in the fused image, and key point recognition identifies the action changes of the target object; finally, context attention is applied through a convolutional neural network to obtain the dynamic behavior data of the target object.
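The following sketch reduces the dynamic-behavior extraction to its essentials: given a detected centre of the target object and of an environmental object in each fused image (the object, key-point and human-body recognition steps are omitted), it stacks the position-change vectors between every two adjacent frames into a feature vector. The representation is an assumption made for illustration only.

```python
import numpy as np

def dynamic_behavior_data(target_centers, env_centers):
    """Position-change vectors between the target object and the environmental object
    across every two adjacent fused images.

    target_centers, env_centers: arrays of shape (num_frames, 2) holding the (x, y)
    centre of each object per fused image (assumed to come from a detection stage)."""
    rel = np.asarray(target_centers, dtype=float) - np.asarray(env_centers, dtype=float)
    change = rel[1:] - rel[:-1]  # one position-change vector per pair of adjacent frames
    return change.ravel()        # flattened dynamic behavior data for the recognition model
```

The flattened vector could then be classified with the model from the earlier sketch, e.g. model.predict([dynamic_behavior_data(target_centers, env_centers)]).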
As can be seen from the above, the behavior identification method based on infrared light and visible light fusion provided in this embodiment obtains the denoised image of the visible light image in each registered image group by performing denoising processing on the visible light image in each registered image group; fusing the infrared light image in each registered image group with the de-noised image to obtain a fused image of each registered image group under a preset constraint condition; and finally, determining the behavior class of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition. Under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum, so that the fused image and the infrared light image have similar pixel intensity, and the fused image and the visible light image have similar gradient (namely edge), so that the fused image can simultaneously keep the thermal radiation information of an object in the infrared light image and the appearance information of the object in the visible light image, namely the fused image can be regarded as the infrared light image with detailed scene description, and therefore, the behavior recognition accuracy of a target object can be improved based on the fused image.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Based on the behavior identification method based on the fusion of the infrared light and the visible light provided by the embodiment, the embodiment of the application further provides the embodiment of the terminal device for realizing the embodiment of the method. Please refer to fig. 5, which is a schematic structural diagram of a terminal device according to an embodiment of the present application. For convenience of explanation, only the portions related to the present embodiment are shown. As shown in fig. 5, the terminal device 50 may include: a first acquisition unit 51, an image denoising unit 52, an image fusion unit 53, and a behavior recognition unit 54. Wherein:
the first acquisition unit 51 is configured to acquire at least one registered image group to be identified; the registered image group includes a visible light image and an infrared light image which are registered.
The image denoising unit 52 is configured to perform denoising processing on the visible light image in the registered image group for each registered image group, so as to obtain a denoised image of the visible light image.
The image fusion unit 53 is configured to fuse the infrared light image and the denoised image to obtain a fusion image of the registered image group under a preset constraint condition; under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum.
The behavior recognition unit 54 is configured to determine a behavior class of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
Optionally, the image denoising unit 52 may include a first determining unit, a first calculating unit, a second determining unit, and a third determining unit. Wherein:
the first determining unit is used for determining a horizontal second-order gradient and a vertical second-order gradient of each pixel in the visible light image by adopting a first gradient function based on the gray value of each pixel in the visible light image; the first gradient function is:
VIS_h(i) = [1/2(vis_i - vis_r(i)) + 1/2(vis_i - vis_l(i))]^2
VIS_v(i) = [1/2(vis_i - vis_b(i)) + 1/2(vis_i - vis_o(i))]^2
where VIS_h(i) is the horizontal second-order gradient of the i-th pixel in the visible light image, VIS_v(i) is the vertical second-order gradient of the i-th pixel, vis_i is the gray value of the i-th pixel, vis_r(i) is the gray value of the pixel located to the right of and adjacent to the i-th pixel, vis_l(i) is the gray value of the pixel located to the left of and adjacent to the i-th pixel, vis_b(i) is the gray value of the pixel located below and adjacent to the i-th pixel, and vis_o(i) is the gray value of the pixel located above and adjacent to the i-th pixel.
The first calculation unit is used for performing quadratic operation on the sum of the horizontal second-order gradient and the vertical second-order gradient of each pixel in the visible light image to obtain the comprehensive gradient of the pixel.
The second determining unit is used for determining the sum of the comprehensive gradients of all pixels in the visible light image as a denoising adjustment factor.
The third determining unit is used for determining the column vector of the de-noised image by adopting a preset de-noising function based on the column vector of the visible light image, the de-noising adjustment factor and a preset regularization weight; the preset denoising function is as follows:
DeN = Vis + λ·DeN_vis
where DeN is the column vector of the denoised image, Vis is the column vector of the visible light image, λ is the preset regularization weight, and DeN_vis is the denoising adjustment factor.
Alternatively, the image fusion unit 53 may include a fourth determination unit and a normalization unit. Wherein:
the third determining unit is used for determining the column vector to be adjusted of the fusion image by adopting a preset constraint function based on the column vector of the infrared light image and the column vector of the de-noised image; the preset constraint function is as follows:
MIX* = argmin( ||MIX* - InF||_2 + λ·||∇MIX* - ∇DeN*||_1 )
where MIX* is the column vector to be adjusted of the fused image, InF is the column vector of the infrared light image, ∇MIX* is the gradient vector of the fused image, DeN* is the column vector of the denoised image, ∇DeN* is the gradient vector of the denoised image, ||MIX* - InF||_2 denotes the L2 norm of MIX* - InF, ||∇MIX* - ∇DeN*||_1 denotes the L1 norm of ∇MIX* - ∇DeN*, and λ is the preset regularization weight;
the value of each element in the gradient vector of the fused image is determined by the following formula:
MIX*_1 = [1/2(MIX*_i - MIX*_r(i)) + 1/2(MIX*_i - MIX*_l(i))]^2
MIX*_2 = [1/2(MIX*_i - MIX*_b(i)) + 1/2(MIX*_i - MIX*_o(i))]^2
∇MIX*(i) = (MIX*_1 + MIX*_2)^2
where MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, MIX*_r(i) is the gray value of the pixel located to the right of and adjacent to the pixel corresponding to the i-th element, MIX*_l(i) is the gray value of the pixel located to the left of and adjacent to the pixel corresponding to the i-th element, MIX*_b(i) is the gray value of the pixel located below and adjacent to the pixel corresponding to the i-th element, and MIX*_o(i) is the gray value of the pixel located above and adjacent to the pixel corresponding to the i-th element.
The standardization unit is used for carrying out standardization processing on the column vector to be adjusted of the fusion image to obtain the column vector of the fusion image.
Optionally, the normalization unit is specifically configured to:
standardizing the column vector to be adjusted of the fused image based on a preset standardization formula to obtain the column vector of the fused image; the preset standardized formula is as follows:
MIX_i = 255 × (MIX*_i - min(MIX*)) / (max(MIX*) - min(MIX*))
where MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, and MIX_i is the gray value of the pixel corresponding to the i-th element in the column vector of the fused image.
Alternatively, the behavior recognizing unit may include a dynamic behavior determining unit and an action type determining unit. Wherein:
the dynamic behavior determining unit is used for importing all the fusion images into a context attention network to obtain dynamic behavior data of the target object in the registered image group; the dynamic behavior data is described by a position change vector between the target object and the environmental object in every two adjacent fusion images.
And the action type determining unit is used for importing the dynamic behavior data into a behavior recognition model to obtain the action type of the target object.
It should be noted that, for the information interaction, the execution process, and other contents between the above units, the specific functions and the technical effects brought by the method embodiments of the present application are based on the same concept, and specific reference may be made to the method embodiment part, which is not described herein again.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of each functional unit is merely illustrated, and in practical applications, the foregoing function distribution may be performed by different functional units according to needs, that is, the internal structure of the terminal device is divided into different functional units to perform all or part of the above-described functions. Each functional unit in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the application. The specific working process of the units in the system may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 6, the terminal device 6 provided in this embodiment may include: a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60, for example a program corresponding to the behavior recognition method based on the fusion of infrared light and visible light. When executing the computer program 62, the processor 60 implements the steps of the above embodiment of the behavior recognition method based on the fusion of infrared light and visible light, such as steps S21 to S24 shown in fig. 2. Alternatively, when executing the computer program 62, the processor 60 implements the functions of the modules/units in the above terminal device embodiments, such as the functions of the units 51 to 54 shown in fig. 5.
Illustratively, the computer program 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to carry out the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments being used to describe the execution process of the computer program 62 in the terminal device 6. For example, the computer program 62 may be divided into a first obtaining unit, an image denoising unit, an image fusion unit, and a behavior recognition unit; for the specific functions of each unit, reference may be made to the related description in the embodiment corresponding to fig. 5, which is not repeated here.
Those skilled in the art will appreciate that fig. 6 is merely an example of the terminal device 6 and does not constitute a limitation of the terminal device 6, which may include more or fewer components than those shown, a combination of certain components, or different components.
The processor 60 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (flash card) provided on the terminal device 6. Further, the memory 61 may include both an internal storage unit and an external storage device of the terminal device 6. The memory 61 is used to store the computer program and other programs and data required by the terminal device. The memory 61 may also be used to temporarily store data that has been output or is to be output.
The embodiments of the present application further provide a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the steps in the above-mentioned method embodiments can be implemented.
Embodiments of the present application provide a computer program product, which, when running on a terminal device, enables the terminal device to implement the steps in the above method embodiments.
In the above embodiments, the description of each embodiment has its own emphasis, and parts that are not described or illustrated in a certain embodiment may refer to the description of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (9)

1. A behavior identification method based on infrared light and visible light fusion is characterized by comprising the following steps:
acquiring at least one registered image group to be identified; the registered image group comprises a registered visible light image and an infrared light image;
for each registered image group, carrying out denoising treatment on a visible light image in the registered image group to obtain a denoising image of the visible light image;
fusing the infrared light image and the de-noised image to obtain a fused image of the registered image group under a preset constraint condition; under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum;
and determining the behavior category of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
2. The behavior recognition method according to claim 1, wherein the denoising of the visible light image in the registered image group to obtain a denoised image of the visible light image comprises:
determining a horizontal second-order gradient and a vertical second-order gradient of each pixel in the visible light image by adopting a first gradient function based on the gray value of each pixel in the visible light image; the first gradient function is:
VIS_h(i) = [1/2(vis_i - vis_r(i)) + 1/2(vis_i - vis_l(i))]²
VIS_v(i) = [1/2(vis_i - vis_b(i)) + 1/2(vis_i - vis_o(i))]²
wherein VIS_h(i) is the horizontal second-order gradient of the i-th pixel in the visible light image, VIS_v(i) is the vertical second-order gradient of the i-th pixel in the visible light image, vis_i is the gray value of the i-th pixel in the visible light image, vis_r(i) is the gray value of the pixel located to the right of and adjacent to the i-th pixel, vis_l(i) is the gray value of the pixel located to the left of and adjacent to the i-th pixel, vis_b(i) is the gray value of the pixel located below and adjacent to the i-th pixel, and vis_o(i) is the gray value of the pixel located above and adjacent to the i-th pixel;
for each pixel in the visible light image, performing a quadratic operation on the sum of the horizontal second-order gradient and the vertical second-order gradient of the pixel to obtain a comprehensive gradient of the pixel;
determining the sum of the comprehensive gradients of all pixels in the visible light image as a denoising adjustment factor;
determining the column vector of the de-noised image by adopting a preset de-noising function based on the column vector of the visible light image, the de-noising adjustment factor and a preset regularization weight; the preset denoising function is as follows:
DeN = Vis + λ·DeN_vis
wherein DeN is the column vector of the denoised image, Vis is the column vector of the visible light image, λ is the preset regularization weight, and DeN_vis is the denoising adjustment factor.
3. The behavior recognition method according to claim 1, wherein the fusing of the infrared light image and the denoised image to obtain a fused image of the registered image group under a preset constraint condition comprises:
determining a column vector to be adjusted of the fused image by adopting a preset constraint function based on the column vector of the infrared light image and the column vector of the de-noised image; the preset constraint function is as follows:
[Preset constraint function, published only as formula image DEST_PATH_IMAGE001; it minimizes the L2 norm of MIX* - InF together with the λ-weighted L1 norm of ∇MIX* - DeN*, whose symbols are defined below.]
wherein MIX* is the column vector to be adjusted of the fused image, InF is the column vector of the infrared light image, ∇MIX* is the gradient vector of the fused image, DeN* is the gradient vector of the de-noised image, ||MIX* - InF||_2 denotes the L2 norm of MIX* - InF, ||∇MIX* - DeN*||_1 denotes the L1 norm of ∇MIX* - DeN*, and λ is a preset regularization weight;
the value of each element in the gradient vector of the fused image is determined by the following formula:
MIX*_1 = [1/2(MIX*_i - MIX*_r(i)) + 1/2(MIX*_i - MIX*_l(i))]²
MIX*_2 = [1/2(MIX*_i - MIX*_b(i)) + 1/2(MIX*_i - MIX*_o(i))]²
[Formula for the i-th element of the gradient vector ∇MIX*, combining MIX*_1 and MIX*_2; published only as formula image DEST_PATH_IMAGE002.]
wherein MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, MIX*_r(i) is the gray value of the pixel located to the right of and adjacent to the pixel corresponding to the i-th element, MIX*_l(i) is the gray value of the pixel located to the left of and adjacent to that pixel, MIX*_b(i) is the gray value of the pixel located below and adjacent to that pixel, and MIX*_o(i) is the gray value of the pixel located above and adjacent to that pixel;
and normalizing the column vector to be adjusted of the fused image to obtain the column vector of the fused image.
4. The behavior recognition method according to claim 3, wherein the normalizing of the to-be-adjusted column vector of the fused image to obtain the column vector of the fused image comprises:
normalizing the column vector to be adjusted of the fused image based on a preset normalization formula to obtain the column vector of the fused image; the preset normalization formula is as follows:
[Preset normalization formula, published only as formula image DEST_PATH_IMAGE003; it maps each element MIX*_i of the column vector to be adjusted to the corresponding element MIX_i of the column vector of the fused image.]
wherein MIX*_i is the gray value of the pixel corresponding to the i-th element in the column vector to be adjusted of the fused image, and MIX_i is the gray value of the pixel corresponding to the i-th element in the column vector of the fused image.
5. The behavior recognition method according to claim 4, wherein the determining of the behavior category of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition comprises:
importing all the fused images into a context attention network to obtain dynamic behavior data of the target object in the registered image group; the dynamic behavior data is described by a position change vector between the target object and the environmental object in every two adjacent fused images;
and importing the dynamic behavior data into a behavior recognition model to obtain the action type of the target object.
6. A terminal device, comprising:
a first acquisition unit for acquiring at least one registered image group to be identified; the registered image group comprises a visible light image and an infrared light image which are registered;
the image denoising unit is used for, for each registered image group, denoising the visible light image in the registered image group to obtain a denoised image of the visible light image;
the image fusion unit is used for fusing the infrared light image and the de-noising image to obtain a fusion image of the registered image group under a preset constraint condition; under the preset constraint condition, the pixel difference between the fused image and the infrared light image is minimum, and the gradient difference between the fused image and the de-noised image is minimum;
and the behavior recognition unit is used for determining the behavior category of the target object in the registered image group based on the fused images of all the registered image groups under the preset constraint condition.
7. A terminal device, characterized in that it comprises a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the behavior recognition method according to any one of claims 1 to 5 when executing the computer program.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a method for behavior recognition according to any one of claims 1 to 5.
9. A behavior recognition system comprising an infrared camera, a visible light camera, and a terminal device according to claim 6 or 7, the infrared camera and the visible light camera being connected to the terminal device.
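For readers who want to prototype the pipeline stated in claims 2 to 4, the following NumPy sketch strings the denoising adjustment, the fusion objective, and the normalization together. The claims state objectives rather than a solver; the subgradient-descent loop, the step sizes, the squaring used for the comprehensive gradient, and the min-max normalization are all assumptions introduced here and are not part of the claims.

import numpy as np

def comprehensive_gradient_sum(img):
    # Claim 2: sum over all pixels of the 'comprehensive gradient', taken
    # here (as an assumption) to be the squared sum of the horizontal and
    # vertical second-order gradient terms; borders are edge-replicated.
    left  = np.pad(img, ((0, 0), (1, 0)), mode="edge")[:, :-1]
    right = np.pad(img, ((0, 0), (0, 1)), mode="edge")[:, 1:]
    above = np.pad(img, ((1, 0), (0, 0)), mode="edge")[:-1, :]
    below = np.pad(img, ((0, 1), (0, 0)), mode="edge")[1:, :]
    g_h = (0.5 * (img - right) + 0.5 * (img - left)) ** 2
    g_v = (0.5 * (img - below) + 0.5 * (img - above)) ** 2
    return np.sum((g_h + g_v) ** 2)

def denoise(vis, lam=0.1):
    # Claim 2: DeN = Vis + lambda * DeN_vis, with DeN_vis the scalar
    # denoising adjustment factor (taken literally, the same scalar is
    # added to every element of the column vector).
    vis = vis.astype(np.float64)
    return vis + lam * comprehensive_gradient_sum(vis)

def fuse(inf, den, lam=0.1, steps=200, lr=0.05):
    # Claim 3's objective: keep MIX* close to the infrared image (L2 term)
    # while keeping its gradients close to those of the denoised image
    # (L1 term). Plain subgradient descent is an assumed solver.
    inf = inf.astype(np.float64)
    den = den.astype(np.float64)
    mix = inf.copy()
    for _ in range(steps):
        grad = 2.0 * (mix - inf)                       # derivative of the L2 term
        for axis in (0, 1):
            d_mix = np.diff(mix, axis=axis, append=np.take(mix, [-1], axis=axis))
            d_den = np.diff(den, axis=axis, append=np.take(den, [-1], axis=axis))
            sign = np.sign(d_mix - d_den)              # subgradient of the L1 term
            grad += lam * (-np.diff(sign, axis=axis, prepend=0.0))  # adjoint of the difference
        mix -= lr * grad
    # Claim 4: normalize the adjusted result (assumed min-max mapping).
    lo, hi = mix.min(), mix.max()
    return (mix - lo) / (hi - lo + 1e-12) * 255.0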
CN202211013357.4A 2022-08-23 2022-08-23 Behavior identification method based on infrared light and visible light fusion and terminal equipment Active CN115082968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211013357.4A CN115082968B (en) 2022-08-23 2022-08-23 Behavior identification method based on infrared light and visible light fusion and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211013357.4A CN115082968B (en) 2022-08-23 2022-08-23 Behavior identification method based on infrared light and visible light fusion and terminal equipment

Publications (2)

Publication Number Publication Date
CN115082968A true CN115082968A (en) 2022-09-20
CN115082968B CN115082968B (en) 2023-03-28

Family

ID=83244892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211013357.4A Active CN115082968B (en) 2022-08-23 2022-08-23 Behavior identification method based on infrared light and visible light fusion and terminal equipment

Country Status (1)

Country Link
CN (1) CN115082968B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989585A (en) * 2015-03-05 2016-10-05 深圳市朗驰欣创科技有限公司 Method and system for fusing infrared image and visible light image
CN107578432A (en) * 2017-08-16 2018-01-12 南京航空航天大学 Merge visible ray and the target identification method of infrared two band images target signature
CN107481214A (en) * 2017-08-29 2017-12-15 北京华易明新科技有限公司 A kind of twilight image and infrared image fusion method
CN110084774A (en) * 2019-04-11 2019-08-02 江南大学 A kind of method of the gradient transmitting and minimum total variation blending image of enhancing
CN112115979A (en) * 2020-08-24 2020-12-22 深圳大学 Fusion method and device of infrared image and visible image
CN114120176A (en) * 2021-11-11 2022-03-01 广州市高科通信技术股份有限公司 Behavior analysis method for fusion of far infrared and visible light video images
CN114862710A (en) * 2022-04-26 2022-08-05 中国人民解放军陆军工程大学 Infrared and visible light image fusion method and device
CN114818989A (en) * 2022-06-21 2022-07-29 中山大学深圳研究院 Gait-based behavior recognition method and device, terminal equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAO ZHANG et al.: "Rethinking the Image Fusion: A Fast Unified Image Fusion Network based on Proportional Maintenance of Gradient and Intensity", The Thirty-Fourth AAAI Conference on Artificial Intelligence *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908221A (en) * 2023-03-08 2023-04-04 荣耀终端有限公司 Image processing method, electronic device, and storage medium
CN115908221B (en) * 2023-03-08 2023-12-08 荣耀终端有限公司 Image processing method, electronic device and storage medium
CN116844241A (en) * 2023-08-30 2023-10-03 武汉大水云科技有限公司 Coloring-based infrared video behavior recognition method and system and electronic equipment
CN116844241B (en) * 2023-08-30 2024-01-16 武汉大水云科技有限公司 Coloring-based infrared video behavior recognition method and system and electronic equipment

Also Published As

Publication number Publication date
CN115082968B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN115082968B (en) Behavior identification method based on infrared light and visible light fusion and terminal equipment
Li et al. Benchmarking single-image dehazing and beyond
CN111310775B (en) Data training method, device, terminal equipment and computer readable storage medium
CN108229490B (en) Key point detection method, neural network training method, device and electronic equipment
CN112446270B (en) Training method of pedestrian re-recognition network, pedestrian re-recognition method and device
CN108229509B (en) Method and device for identifying object class and electronic equipment
CN111079764B (en) Low-illumination license plate image recognition method and device based on deep learning
CN111931764B (en) Target detection method, target detection frame and related equipment
CN110705419A (en) Emotion recognition method, early warning method, model training method and related device
CN113822951B (en) Image processing method, device, electronic equipment and storage medium
CN113052295B (en) Training method of neural network, object detection method, device and equipment
US20230401838A1 (en) Image processing method and related apparatus
CN110222718A (en) The method and device of image procossing
CN114529946A (en) Pedestrian re-identification method, device, equipment and storage medium based on self-supervision learning
CN111950700A (en) Neural network optimization method and related equipment
CN112507897A (en) Cross-modal face recognition method, device, equipment and storage medium
CN115577768A (en) Semi-supervised model training method and device
CN110390254B (en) Character analysis method and device based on human face, computer equipment and storage medium
CN111507288A (en) Image detection method, image detection device, computer equipment and storage medium
CN113642639A (en) Living body detection method, living body detection device, living body detection apparatus, and storage medium
CN112613373A (en) Image recognition method and device, electronic equipment and computer readable storage medium
CN111968154A (en) HOG-LBP and KCF fused pedestrian tracking method
CN115512203A (en) Information detection method, device, equipment and storage medium
CN115547488A (en) Early screening system and method based on VGG convolutional neural network and facial recognition autism
CN113361422A (en) Face recognition method based on angle space loss bearing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant