CN113706393A - Video enhancement method, device, equipment and storage medium - Google Patents
Video enhancement method, device, equipment and storage medium
- Publication number
- CN113706393A (application number CN202010430940.XA)
- Authority
- CN
- China
- Prior art keywords
- video frame
- target
- video
- enhanced
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 77
- 238000012545 processing Methods 0.000 claims abstract description 108
- 238000005286 illumination Methods 0.000 claims description 211
- 230000004927 fusion Effects 0.000 claims description 50
- 239000013598 vector Substances 0.000 claims description 38
- 238000004590 computer program Methods 0.000 claims description 21
- 238000007499 fusion processing Methods 0.000 claims description 9
- 238000001914 filtration Methods 0.000 claims description 7
- 230000003287 optical effect Effects 0.000 claims description 7
- 235000019557 luminance Nutrition 0.000 description 100
- 230000008569 process Effects 0.000 description 21
- 230000006870 function Effects 0.000 description 16
- 230000008859 change Effects 0.000 description 11
- 238000013507 mapping Methods 0.000 description 11
- 238000010586 diagram Methods 0.000 description 8
- 230000002708 enhancing effect Effects 0.000 description 6
- 238000006243 chemical reaction Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000004422 calculation algorithm Methods 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000012937 correction Methods 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 238000013459 approach Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011478 gradient descent method Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000001678 irradiating effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012806 monitoring device Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 238000002310 reflectometry Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/80—Geometric correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
Abstract
The application is applicable to the technical field of image processing, and provides a video enhancement method, a video enhancement device, video enhancement equipment and a storage medium. The method comprises the steps of obtaining a target video frame in a video to be processed, and carrying out image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame; determining an enhanced image corresponding to a video frame to be processed according to the target enhanced image; taking the video frame to be processed as a target video frame, taking the enhanced image corresponding to the video frame to be processed as a target enhanced image, returning to the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image until the enhanced images respectively corresponding to the preset number of video frames in the video to be processed are obtained; and determining an enhanced video corresponding to the video to be processed according to the enhanced image corresponding to the video frame in the video to be processed.
Description
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to a video enhancement method, apparatus, device, and storage medium.
Background
In video shooting, under the influence of the quality of video shooting equipment and shooting environment, the quality of pictures in a shot video is often poor, for example, the pictures are dark as a whole and have low contrast, so that the identifiability of an interested object in the video is very low, and the application value of the video is greatly reduced. Therefore, enhancement processing of video to selectively highlight features of objects of interest in the video has become a necessary means to improve video quality.
The current video enhancement approach is usually realized based on image enhancement technology. This approach does not consider the association relationship among the multiple frames of images in the video, so phenomena such as brightness jitter and flicker easily occur in the enhanced video, and the video enhancement effect is unstable.
Disclosure of Invention
In view of this, embodiments of the present application provide a video enhancement method, an apparatus, a device, and a storage medium, so as to solve the technical problem that a video enhancement method in the prior art is prone to brightness jitter.
In a first aspect, an embodiment of the present application provides a video enhancement method, including:
acquiring a target video frame in a video to be processed;
carrying out image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame;
determining an enhanced image corresponding to a video frame to be processed according to the target enhanced image; the video frame to be processed is a video frame to be subjected to image enhancement processing determined according to the target video frame;
taking the video frame to be processed as a target video frame, taking the enhanced image corresponding to the video frame to be processed as a target enhanced image, returning to the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image until the enhanced images respectively corresponding to the preset number of video frames in the video to be processed are obtained;
and determining an enhanced video corresponding to the video to be processed according to the enhanced image corresponding to the video frame in the video to be processed.
In a possible implementation manner of the first aspect, the video frame to be processed is a video frame adjacent to the target video frame.
In a possible implementation manner of the first aspect, performing image enhancement processing on a target video frame to obtain a target enhanced image corresponding to the target video frame includes:
acquiring a brightness channel of a target video frame;
determining a plurality of illumination components of the target video frame according to the brightness channel of the target video frame;
determining fusion illumination components corresponding to the target video frame according to the plurality of illumination components of the target video frame;
and processing the target video frame according to the fusion illumination component to generate a target enhanced image corresponding to the target video frame.
In a possible implementation manner of the first aspect, determining a plurality of illumination components of the target video frame according to a luminance channel of the target video frame includes:
carrying out low-pass filtering processing on the brightness channel to obtain a first illumination component;
and performing variational operation on the brightness channel to obtain a second illumination component.
In a possible implementation manner of the first aspect, determining a fusion illumination component corresponding to a target video frame according to a plurality of illumination components of the target video frame includes:
determining a first gradient of the first illumination component and a second gradient of the second illumination component;
and performing fusion processing on the first illumination component and the second illumination component based on the first gradient and the second gradient to obtain a fusion illumination component.
In a possible implementation manner of the first aspect, processing a target video frame according to a fusion illumination component to generate a target enhanced image corresponding to the target video frame includes:
determining a reflection component of the brightness channel according to the fused illumination component and the brightness channel;
correcting the reflection component to generate a corrected reflection component;
performing nonlinear stretching on the fusion illumination component to generate a stretched fusion illumination component;
obtaining an enhanced brightness channel according to the stretched fusion illumination component and the corrected reflection component;
and processing the target video frame according to the enhanced brightness channel to generate a target enhanced image corresponding to the target video frame.
In a possible implementation manner of the first aspect, processing a target video frame according to an enhanced luminance channel to generate a target enhanced image corresponding to the target video frame includes:
acquiring a saturation channel and a chrominance channel of a target video frame;
generating an RGB color image according to the enhanced brightness channel, the saturation channel and the chrominance channel;
and taking the RGB color image as a target enhanced image corresponding to the target video frame.
In a possible implementation manner of the first aspect, generating an RGB color image according to an enhanced luminance channel, a saturation channel, and a chrominance channel includes:
dividing an enhanced luminance channel into a plurality of image blocks;
aiming at each image block, performing contrast enhancement processing on the brightness of the image block to obtain an enhanced image block corresponding to the image block;
splicing adjacent enhanced image blocks to obtain an updated enhanced brightness channel;
and generating an RGB color image according to the updated enhanced brightness channel, the saturation channel and the chrominance channel.
In a possible implementation manner of the first aspect, determining, according to a target enhanced image, an enhanced image corresponding to a video frame to be processed includes:
determining a motionless point set and a motion point set which are relative to a target video frame in pixel points of a video frame to be processed;
aiming at each immobile point in the immobile point set, taking the pixel value of a pixel point corresponding to the immobile point in the target enhanced image as the pixel value of the immobile point;
aiming at each motion point in the motion point set, determining a corresponding pixel point of the motion point in the target enhanced image, and determining the pixel value of the motion point according to the pixel values of a plurality of pixel points in a preset area where the corresponding pixel point is located;
and obtaining an enhanced image of the video frame to be processed according to the pixel values of all the motionless points in the motionless point set and the pixel values of all the moving points in the moving point set.
In a possible implementation manner of the first aspect, determining a stationary point set and a moving point set of a pixel point of a to-be-processed video frame relative to a target video frame includes:
acquiring an optical flow vector of a pixel point for each pixel point in a video frame to be processed, and determining the motion speed of the pixel point according to the optical flow vector of the pixel point;
determining pixel points with the motion speed larger than a preset value as motion points, and determining a motion point set according to all the motion points;
and determining pixel points with the motion speed less than or equal to a preset value as motionless points, and determining a motionless point set according to all motionless points.
In a possible implementation manner of the first aspect, determining a pixel value of a motion point according to pixel values of a plurality of pixel points in a preset region where a corresponding pixel point is located includes:
determining a plurality of pixel points in a preset area where the corresponding pixel points are located;
determining an average value of pixel values of a plurality of pixel points;
the average value is taken as the pixel value of the moving point.
In a second aspect, an embodiment of the present application provides a video enhancement apparatus, including:
the acquisition module is used for acquiring a target video frame in a video to be processed;
the first enhancement module is used for carrying out image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame;
the first determining module is used for determining an enhanced image corresponding to a video frame to be processed according to the target enhanced image; the video frame to be processed is a video frame to be subjected to image enhancement processing determined according to the target video frame;
the execution module is used for taking the video frame to be processed as a target video frame, taking the enhanced image corresponding to the video frame to be processed as a target enhanced image, returning to the step of executing the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image until the enhanced images respectively corresponding to the preset number of video frames in the video to be processed are obtained;
and the second determining module is used for determining the enhanced video corresponding to the video to be processed according to the enhanced image corresponding to the video frame in the video to be processed.
In a third aspect, an embodiment of the present application provides a video processing apparatus, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of any one of the methods in the first aspect are implemented.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the steps of any one of the methods in the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the method of any one of the above first aspects.
In the video enhancement method provided by the embodiment of the application, the enhanced image corresponding to each to-be-processed video frame is obtained based on the target enhanced image of the target video frame, so that the pixel values of all pixel points on the enhanced images corresponding to the to-be-processed video frame and the target video frame respectively have relevance, and therefore, brightness jitter cannot be generated between the enhanced images corresponding to the to-be-processed video frame and the target video frame respectively; each group of adjacent video frames in the enhanced video obtained by the method are enhanced images corresponding to a group of video frames to be processed in the video to be processed and a target video frame, so that the pixel values of all pixel points on the adjacent video frames in the enhanced video are also related, and the brightness jitter is avoided.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a schematic flowchart of a video enhancement method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a model of color information of a target video frame according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of acquiring an enhanced image of a target according to an embodiment of the present application;
FIG. 4 is a schematic diagram of obtaining an enhanced image of a target according to an embodiment of the present application;
FIG. 5 is a schematic flow chart illustrating a process for determining a fusion illumination component of a target video frame according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of generating a target enhanced image according to another embodiment of the present application;
FIG. 7 is a schematic flow chart of generating a target enhanced image according to yet another embodiment of the present application;
fig. 8 is a schematic flowchart of determining an enhanced image corresponding to a video frame to be processed according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a video enhancement apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a video processing device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Video enhancement technology is widely applied in computer vision fields such as urban traffic, video surveillance, and intelligent vehicles. Currently, video enhancement is usually implemented by individually performing image enhancement on each video frame in the video so as to enhance the entire video segment, and the image enhancement method for a video frame includes one or more of the following: histogram equalization, chroma mapping, Retinex theory-based methods, wavelet transform methods, and image fusion.
However, these image enhancement algorithms enhance each video frame in the video individually and do not consider the association relationship between the multiple video frames in the video, which easily causes a large brightness change of the pixels at the same position between adjacent video frames, thereby causing the problems of brightness jitter or flicker.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. It is worth mentioning that the specific embodiments listed below may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 1 is a schematic flowchart of a video enhancement method according to an embodiment of the present application, where an execution subject of the embodiment is a video processing device; the video processing device includes but is not limited to mobile terminals such as smart phones, tablet computers, wearable devices and the like, and can also be desktop computers, robots, servers and the like. The video enhancement method as shown in fig. 1 may include:
and S10, acquiring a target video frame in the video to be processed.
In this embodiment, the video to be processed may be a video acquired by the video processing device from a mobile terminal such as a video capture device, may also be a video acquired by the video processing device from a server, and may also be a video pre-stored in the video processing device.
The video acquisition equipment can be a video camera device, a mobile terminal with a video camera function, a tablet computer, an intelligent robot and the like.
The server may be a cloud server.
Optionally, the video to be processed may be an acquired complete video, or a video segment within a preset time range in the acquired video.
For example, the video processing device obtains a 24-hour surveillance video of March 20 sent by the monitoring device, and intercepts the video segment with a start time of 18:00 and an end time of 19:00 as the video to be processed.
In this embodiment, the video to be processed includes N consecutive video frames, where N is greater than or equal to 2 and is an integer; the target video frame can be any one frame of video frame in the video to be processed; for example, the target video frame is the r-th frame in the video to be processed, where r is greater than or equal to 1 and less than or equal to N.
For example, after the video processing device obtains the video to be processed, the video processing device may determine the segmentation time interval of the video to be processed according to the frame rate of the enhanced video, divide the video to be processed into N consecutive video frames according to the time interval, and then randomly select one video frame from the N consecutive video frames as the target video frame. The frame rate is the number of video frames played by each second of the enhanced video, and the frame rate of the enhanced video is a user preset value.
Illustratively, the video to be processed is a video segment with a start time of 17:00 and an end time of 17:10, and if the frame rate of the enhanced video is 24, the slicing time interval is 1/24 s; the video to be processed is segmented at 1/24s as a time interval, 14400 (obtained by segmenting the video to be processed of 600 s) continuous video frames are obtained, and then any one of the 14400 video frames is selected as a target video frame.
And S20, performing image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame.
In this embodiment, the video processing device may select different image enhancement processing modes according to the image enhancement requirement of the target video frame.
For example, the video to be processed is a video obtained by shooting under low illumination conditions such as night, at this time, pixel values of most pixel points of the target video frame are concentrated in a low gray level region, and the image enhancement of the target video frame requires to improve the contrast of the image. The video processing device can perform nonlinear stretching on the gray values of the pixel points in the target video frame, so that the gray values of the target video frame are uniformly distributed, and the contrast of the target video frame is improved.
For example, the video to be processed is a video obtained by shooting under uneven illumination; in this case, artifacts exist in the target video frame due to the uneven illumination, and the image enhancement of the target video frame needs to eliminate these artifacts. The observable color information of an object is determined by the object's reflection capability (reflection properties) for different light waves and by the intensity of the incident light irradiating the object; the reflection capability does not vary with the intensity of the incident light and determines the intrinsic color of the object. Therefore, in order to eliminate the artifacts caused by uneven illumination in the target video frame, the video processing device may perform filtering processing on the target video frame to remove the influence of the incident light while retaining the reflection properties of the object, thereby implementing image enhancement of the target video frame.
In an exemplary embodiment, a model diagram of color information of a target video frame can be shown in fig. 2, where the target video frame can be understood as an original image S 'acquired by a video capture device, the color information of the target video frame is determined by the reflectivity of an object in the target video frame to incident light and the intensity of the incident light irradiating on the object, and then the expression of S' can be
S'=L'×R' (1)
Wherein S ' is a target video frame, L ' is an illumination component representing illumination intensity information of incident light, and R ' is a reflection component representing the reflection capability of an object to the incident light.
When the incident light is not uniformly illuminated, artifacts appear on the target video frame, and at this time, the real color information of the object in the target video frame cannot be obtained.
In order to eliminate the artifacts on the target video frame, the video processing device first performs smoothing filtering on the target video frame S' to recover the illumination component L' of the target video frame as accurately as possible, then determines the reflection component R' of the target video frame by using formula (1) with the illumination component and the target video frame as inputs, and determines the reflection component R' as the target enhanced image corresponding to the target video frame.
This method obtains the reflection component, which reflects the reflection capability of the object, by removing the illumination component of the target video frame, and can effectively eliminate the artifacts caused by uneven illumination (the illumination component), so as to enhance the objects in the target video frame.
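As an illustration of this decomposition, the following is a minimal sketch assuming NumPy and OpenCV (cv2) are available; the Gaussian smoothing, the sigma value and the clipping step are illustrative assumptions rather than the exact procedure of this embodiment.

```python
import numpy as np
import cv2  # assumed available for the smoothing filter

def reflectance_enhance(frame, sigma=30.0, eps=1e-6):
    """Sketch of formula (1), S' = L' x R': estimate the illumination component L'
    by smoothing filtering, then recover the reflection component R' = S' / L'."""
    s = frame.astype(np.float32) / 255.0
    # Smoothing (Gaussian) filtering approximates the illumination component L'
    illumination = cv2.GaussianBlur(s, (0, 0), sigmaX=sigma)
    # The reflection component R' reflects the object's reflection capability and
    # is taken here as the enhanced result; eps avoids division by zero
    reflectance = s / (illumination + eps)
    return np.clip(reflectance, 0.0, 1.0)
```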
In this embodiment, the image enhancement processing method for the target video frame may also be one or more of histogram equalization, wavelet transform, chroma mapping, and image fusion, which is not limited herein.
And S30, determining an enhanced image corresponding to the video frame to be processed according to the target enhanced image.
In this embodiment, the video frame to be processed is a video frame to be subjected to image enhancement processing determined according to the target video frame.
The video frame to be subjected to image enhancement processing may be a video frame separated from the target video frame by a preset frame number, or may be a video frame adjacent to the target video frame.
Alternatively, the video frame to be subjected to the image enhancement processing may be a video frame separated from the target video frame by a preset frame number. The numerical value of the preset frame number can be determined by the ratio of the frame rate of the video to be processed to the frame rate of the enhanced video.
For example, assuming that the video to be processed is obtained by the video processing device dividing the acquired video according to the slicing time interval in step 10, the slicing time interval may be set according to the frame rate of the enhanced video, so that the frame rate of the video to be processed is n times of the frame rate of the enhanced video, where n is an integer greater than or equal to 1.
Specifically, the slicing interval t may be expressed as:
t = 1/(n×fv) (2)
where fv is the frame rate of the enhanced video. The frame rate of the video to be processed is then 1/t = n×fv, so that the frame rate of the video to be processed is n times the frame rate of the enhanced video.
If the ratio of the frame rate of the video to be processed to the frame rate of the enhanced video is n, the preset frame number is n-1.
For example, if n is 2, it indicates that an enhanced image is obtained every 2 video frames in the video to be processed, that is, the video frame to be processed is subjected to image enhancement processing every other video frame, so that the preset frame number is 1.
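For reference, a small sketch of how the slicing interval of formula (2) and the preset frame number relate; the function and variable names are assumptions introduced for illustration.

```python
def slicing_parameters(enhanced_fps, n):
    """Formula (2): slicing interval t = 1/(n x fv). The video to be processed then
    has a frame rate of n x fv, and the preset frame number is n - 1."""
    t = 1.0 / (n * enhanced_fps)      # slicing time interval in seconds
    source_fps = n * enhanced_fps     # frame rate of the video to be processed
    preset_frame_number = n - 1       # frames skipped between enhanced frames
    return t, source_fps, preset_frame_number

# Example: enhanced video at 24 fps and n = 2 gives t = 1/48 s and preset frame number 1
print(slicing_parameters(24, 2))
```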
In this embodiment, the video frame separated from the target video frame by the preset frame number may be a video frame that is separated from the target video frame by the preset frame number and has a playing time later than that of the target video frame, or a video frame that is separated from the target video frame by the preset frame number and has a playing time earlier than that of the target video frame.
In an exemplary embodiment, if the target video frame is the first video frame of the video to be processed (i.e., the video frame with the earliest playing time in the video to be processed), the video frame to be processed is a video frame that is separated from the target video frame by the preset frame number and has a playing time later than that of the target video frame. For example, the video to be processed includes 256 consecutive video frames, numbered sequentially according to the playing sequence as 1, 2, … 256; assuming that the target video frame is the 1st video frame in the video to be processed and the preset frame number is 1, the video frame to be processed is the 3rd video frame in the video to be processed.
In another example, if the target video frame is the last video frame of the video to be processed (i.e., the video frame with the latest playing time in the video to be processed), the video frame to be processed is a video frame that is separated from the target video frame by the preset frame number and has a playing time earlier than that of the target video frame. For example, the video to be processed includes 256 consecutive video frames, numbered sequentially according to the playing sequence as 1, 2, … 256; assuming that the target video frame is the 256th video frame in the video to be processed and the preset frame number is 1, the video frame to be processed is the 254th video frame in the video to be processed.
In another example, if the number of video frames between the target video frame and the last video frame is greater than the preset frame number, and the number of video frames between the target video frame and the first video frame is also greater than the preset frame number, the video frame to be processed may be a video frame that is separated from the target video frame by the preset frame number and has a playing time later than that of the target video frame, or a video frame that is separated from the target video frame by the preset frame number and has a playing time earlier than that of the target video frame. For example, the video to be processed includes 256 consecutive video frames, numbered sequentially according to the playing sequence as 1, 2, … 256; assuming that the target video frame is the 100th video frame in the video to be processed and the preset frame number is 1, the video frame to be processed is the 102nd video frame or the 98th video frame in the video to be processed.
In this embodiment, the video frame to be subjected to the image enhancement processing may also be a video frame adjacent to the target video frame. A video frame adjacent to a target video frame may be understood as a video frame separated from the target video frame by a frame number of 0; namely, when the preset frame number is 0, the video frame to be processed is a video frame adjacent to the target video frame.
Based on this, the video frame adjacent to the target video frame may be a video frame adjacent to the target video frame and playing later than the target video frame, or may be a video frame adjacent to the target video frame and playing earlier than the target video frame.
In this embodiment, the purpose of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image is to update the pixel values of the pixel points on the video frame to be processed, which has not yet been subjected to image enhancement processing, according to the pixel values of the pixel points on the target enhanced image, which has already been subjected to image enhancement processing, and thereby obtain the enhanced image of the video frame to be processed, so that the pixel values of the pixel points on the enhanced images respectively corresponding to the video frame to be processed and the target video frame have relevance.
For example, suppose the video to be processed includes N consecutive video frames, where N is greater than or equal to 2 and N is an integer; the target video frame is the r-th frame in the video to be processed, where r is larger than 2 and smaller than N-1; and the video frame to be processed is a video frame adjacent to the target video frame. Then, according to the enhanced image of the target video frame (the r-th frame), the enhanced images respectively corresponding to the two adjacent video frames (the (r+1)-th frame and the (r-1)-th frame) can be obtained simultaneously, and the method for obtaining the enhanced images corresponding to the (r+1)-th frame and the (r-1)-th frame can be the same. An exemplary description is given below of determining the enhanced image corresponding to the (r+1)-th video frame (a video frame to be processed) from the enhanced image (the target enhanced image) of the r-th video frame.
For example, determining an enhanced image corresponding to the (r + 1) th video frame according to the enhanced image of the (r) th video frame may include:
and A1, acquiring a fixed point set and a moving point set of each pixel point of the r +1 th video frame relative to the r video frame.
Optionally, for each pixel point on the r +1 th video frame, the luminance change rate of the pixel point relative to the pixel point at the same coordinate position on the r +1 th video frame is used as a velocity vector of the pixel point, the pixel point with the velocity vector larger than a preset value is determined as a moving point, and the pixel point with the velocity vector smaller than or equal to the preset value is determined as an immobile point. Based on the method, each pixel point on the (r + 1) th video frame is determined to be an immobile point or a moving point relative to the (r) th video frame, then a moving point set is determined according to all the moving points on the (r + 1) th video frame, and the immobile point set is determined according to all the immobile points on the (r + 1) th video frame.
In another example, the pixel values of the pixel points at the same coordinate positions on the (r+1)-th video frame and the r-th video frame are subjected to difference processing, so as to obtain a pixel difference value corresponding to each pixel point on the (r+1)-th video frame. In the video shooting process, a shot object may be static or moving across two adjacent video frames; the positions and pixel values of the pixel points representing a static object are the same on the two adjacent video frames, while the positions and pixel values of the pixel points representing the contour of a moving object differ between the two adjacent video frames.
Therefore, the positions and contours of the moving objects in the (r+1)-th video frame relative to the r-th video frame can be determined according to the pixel points whose pixel difference value is not 0.
Then, the pixel points on and inside the contour of each moving object are taken as the motion points of the (r+1)-th video frame relative to the r-th video frame, and the motion point set is determined according to all the motion points; the pixel points outside the contours of the moving objects are taken as the motionless points of the (r+1)-th video frame relative to the r-th video frame, and the motionless point set is determined according to all the motionless points.
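A rough sketch of this frame-difference segmentation of step A1 is given below, assuming single-channel (luminance) NumPy frames of equal size; the zero threshold and the contour-filling approach via OpenCV are illustrative assumptions rather than the exact procedure of this embodiment.

```python
import numpy as np
import cv2  # assumed available

def motion_and_stationary_masks(frame_r, frame_r1, diff_threshold=0):
    """Classify the pixels of the (r+1)-th frame as motion points or motionless
    points relative to the r-th frame, based on per-pixel differences."""
    diff = cv2.absdiff(frame_r1, frame_r)
    changed = (diff > diff_threshold).astype(np.uint8)   # contour pixels of moving objects
    # Fill the detected contours so pixels inside a moving object also count as motion points
    contours, _ = cv2.findContours(changed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    motion = np.zeros_like(changed)
    cv2.drawContours(motion, contours, -1, color=1, thickness=cv2.FILLED)
    stationary = 1 - motion
    return motion.astype(bool), stationary.astype(bool)
```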
Step A2, for each motionless point in the motionless point set, taking the pixel value of the pixel point corresponding to the motionless point in the enhanced image of the r-th video frame as the pixel value of the motionless point.
The pixel point corresponding to the motionless point in the enhanced image of the r-th video frame refers to the pixel point whose coordinates are the same as those of the motionless point in the (r+1)-th video frame.
For example, assuming that the coordinates of the stationary point SP1 in the r +1 th video frame are (Xp1, Yp1), the pixel point corresponding to the stationary point SP1 is a pixel point with coordinates (Xp1, Yp1) in the enhanced image of the r-th video frame.
In this step, the pixel value of the pixel point corresponding to the motionless point in the enhanced image of the r-th video frame is used as the pixel value of the motionless point, which may mean that the pixel point with the same coordinate as the motionless point in the enhanced image of the r-th video frame is determined, and the pixel value of the pixel point is used as the pixel value of the motionless point.
For example, assuming that the coordinates of the stationary point SP1 in the r +1 th video frame are (Xp1, Yp1), the pixel value of the stationary point SP1 is the pixel value of the pixel point having the coordinates (Xp1, Yp1) in the enhanced image of the r-th video frame.
And A3, determining a corresponding pixel point of the motion point in the enhanced image of the r-th video frame aiming at each motion point in the motion point set, and determining the pixel value of the motion point according to the pixel values of two adjacent pixel points of the corresponding pixel point.
The coordinates of the corresponding pixel points of the motion points in the enhanced image of the r-th video frame can be obtained by calculation according to the speed vectors of the motion points and the coordinates of the motion points in the r + 1-th video frame.
For example, suppose the coordinates of the motion point JP1 on the (r+1)-th video frame are (xp1, yp1) and its velocity vector is (vx, vy); then the coordinates of the corresponding pixel point of the motion point JP1 in the enhanced image of the r-th video frame are (xp1 + vx×Δt, yp1 + vy×Δt), where Δt is the time interval between the (r+1)-th video frame and the r-th video frame.
In this step, the pixel value of the motion point is determined according to the pixel values of two adjacent pixel points to the corresponding pixel point, which may mean that the average value of the pixel values of two adjacent pixel points to the corresponding pixel point is used as the pixel value of the motion point.
The two pixels adjacent to the corresponding pixel may be two pixels adjacent to the corresponding pixel along the X direction, or two pixels adjacent to the corresponding pixel along the Y direction.
For example, assuming that the coordinates of the corresponding pixel point JP1' of the motion point JP1 in the enhanced image of the r-th video frame are (xp1', yp1'), and that the resolution of the enhanced image of the r-th video frame is i in the X direction and j in the Y direction, the coordinates of the two pixel points adjacent to JP1' may be (xp1'+i, yp1') and (xp1'-i, yp1'), or may be (xp1', yp1'+j) and (xp1', yp1'-j).
After the coordinates of two pixel points adjacent to JP 1' are obtained, the average value of the pixel values of the two pixel points is used as the pixel value of the motion point JP 1.
And step A4, the video processing equipment combines all the fixed points in the fixed point set and all the moving points in the moving point set to obtain the enhanced image of the (r + 1) th video frame.
The purpose in this step is to combine the pixel values of each stationary point in the stationary point set and the pixel values of each moving point in the moving point set according to the coordinates of each pixel point (including the stationary point and the moving point) to obtain the enhanced image of the (r + 1) th video frame.
Illustratively, the motionless point set includes 6 motionless points, and the pixel values of the 6 motionless points are respectively represented as Sp1(1,1), Sp2(1,2), Sp3(1,3), Sp4(2,1), Sp5(3,1), and Sp6(3, 2); the motion point set comprises 3 motion points, and the pixel values of the 3 motion points are respectively Jp1(2,2), Jp2(2,3) and Jp3(3, 3); wherein, (x, y) is the coordinate of each pixel point.
Then, the 9 pixel points are combined according to the coordinate positions to obtain an enhanced image of the (r + 1) th video frame, which can be referred to in the following table 1:
TABLE 1 enhanced image Pixel composition Table
Sp1(1,1) | Sp2(1,2) | Sp3(1,3) |
Sp4(2,1) | Jp1(2,2) | Jp2(2,3) |
Sp5(3,1) | Sp6(3,2) | Jp3(3,3)
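Putting steps A2 to A4 together, the following sketch copies the pixel values of motionless points from the enhanced r-th frame and interpolates the pixel values of motion points from the two horizontally adjacent pixels of the displaced position; the function and parameter names are assumptions, the velocity vectors are assumed to come from the optical-flow variant of step A1, and boundary handling is simplified.

```python
import numpy as np

def propagate_enhancement(prev_enhanced, motion_mask, flow, dt, i=1):
    """prev_enhanced: enhanced image of the r-th frame (H x W or H x W x C).
    motion_mask: boolean mask of motion points in the (r+1)-th frame.
    flow: per-pixel velocity vectors (H x W x 2), ordered as (vx, vy).
    Returns the enhanced image of the (r+1)-th frame."""
    h, w = prev_enhanced.shape[:2]
    out = prev_enhanced.copy()                 # step A2: motionless points keep the same pixel value
    ys, xs = np.nonzero(motion_mask)
    for y, x in zip(ys, xs):                   # step A3: motion points
        vx, vy = flow[y, x]
        # corresponding pixel in the enhanced r-th frame: (x + vx*dt, y + vy*dt)
        cx = int(round(min(max(x + vx * dt, 0), w - 1)))
        cy = int(round(min(max(y + vy * dt, 0), h - 1)))
        left = prev_enhanced[cy, max(cx - i, 0)]
        right = prev_enhanced[cy, min(cx + i, w - 1)]
        out[y, x] = (left + right) / 2.0       # average of the two adjacent pixels
    return out                                 # step A4: combined result
```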
And S40, taking the video frame to be processed as a target video frame, taking the enhanced image corresponding to the video frame to be processed as a target enhanced image, and returning to the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image until the enhanced images respectively corresponding to the preset number of video frames in the video to be processed are obtained.
In this embodiment, in the step of determining an enhanced image corresponding to a video frame to be processed according to the target enhanced image, the video frame to be processed is a video frame which is separated from the target video frame by the preset frame number and has not yet been subjected to image enhancement processing.
Illustratively, the video to be processed includes 256 video frames, the target video frame obtained for the first time in step 30 is the 100 th video frame, and the preset frame number is 1.
In the first loop, the target video frame is the 100 th video frame, and the video frame to be processed is the 102 th video frame or the 98 th video frame.
If the video frame to be processed is the 102nd video frame, then in the second loop the target video frame is the 102nd video frame, and the video frame to be processed can only be the 104th video frame (a video frame which is 1 frame away from the 102nd video frame and has not been subjected to image enhancement processing). In the subsequent loops, the video frames to be processed are all video frames which are separated from the current target video frame by 1 video frame and have a playing time later than that of the target video frame.
If the video frame to be processed is the 98th video frame, then in the second loop the target video frame is the 98th video frame, and the video frame to be processed can only be the 96th video frame (a video frame which is 1 frame away from the 98th video frame and has not been subjected to image enhancement processing). In the subsequent loops, the video frames to be processed are all video frames which are separated from the current target video frame by 1 video frame and have a playing time earlier than that of the target video frame.
If, in the first loop, the video processing device simultaneously acquires the enhanced images of the 102nd and the 98th video frames through two threads, for example thread 1 is used to acquire the enhanced image of the 102nd video frame and thread 2 is used to acquire the enhanced image of the 98th video frame, then in the second loop the target video frame of thread 1 is the 102nd video frame and its video frame to be processed is the 104th video frame, while the target video frame of thread 2 is the 98th video frame and its video frame to be processed is the 96th video frame. In the subsequent loops, the video frames to be processed in thread 1 are all video frames which are separated from the current target video frame by 1 video frame and have a playing time later than that of the target video frame, and the video frames to be processed in thread 2 are all video frames which are separated from the current target video frame by 1 video frame and have a playing time earlier than that of the target video frame.
In this embodiment, the preset number may be determined by the number of video frames in the video to be processed and the preset number of frames.
For example, the video to be processed includes N video frames, where N is greater than or equal to 2 and N is an integer. Assuming that the preset frame number is 1, that is, a video frame to be processed is determined every 1 frame, then if N is divisible by 2, enhanced images of N/2 video frames can be obtained; if N is not divisible by 2, enhanced images of (N-1)/2 video frames to be processed can be obtained.
In this embodiment, the video processing device obtains enhanced images corresponding to a preset number of video frames in the video to be processed.
Illustratively, the video to be processed includes 256 video frames, the numbers of the 256 video frames are 1,2, and 3 … 256 in sequence, and if the preset frame number is 1, the preset number is 128. Assuming that a target video frame obtained by the video processing device for the first time is a 100 th video frame, in a first loop, the video processing device obtains an enhanced image of a 102 th video frame through a thread 1, and obtains an enhanced image of a 98 th video frame through a thread 2.
Thread 1 acquires the enhanced image of the 104th video frame in the second loop, and the enhanced image of the 106th video frame in the third loop; by analogy, over a plurality of loops thread 1 sequentially obtains the enhanced images corresponding to the video frames numbered 102, 104, … 256. That is, thread 1 obtains the enhanced images corresponding to 78 video frames.
Thread 2 acquires the enhanced image of the 96th video frame in the second loop, and the enhanced image of the 94th video frame in the third loop; by analogy, over a plurality of loops thread 2 sequentially obtains the enhanced images corresponding to the video frames numbered 98, 96, … 2. That is, thread 2 obtains the enhanced images corresponding to 49 video frames.
At this time, the video processing device has obtained the enhanced images corresponding to a total of 1 + 78 + 49 = 128 (the preset number) video frames.
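A simplified, single-threaded sketch of the loop of step S40 (propagating forwards and then backwards from the initial target frame) is given below; enhance_frame and propagate_from stand in for steps S20 and S30 and are assumed placeholders rather than functions defined by this embodiment.

```python
def enhance_video_frames(frames, start_index, step=2):
    """frames: list of decoded video frames; start_index: index of the first target frame.
    Returns a dict mapping frame index -> enhanced image."""
    enhanced = {start_index: enhance_frame(frames[start_index])}          # step S20
    for direction in (+step, -step):                                      # forwards, then backwards
        target_idx, next_idx = start_index, start_index + direction
        while 0 <= next_idx < len(frames):
            # step S30: derive the enhanced image of the video frame to be processed
            enhanced[next_idx] = propagate_from(enhanced[target_idx],
                                                frames[target_idx], frames[next_idx])
            # step S40: the processed frame becomes the next target frame
            target_idx, next_idx = next_idx, next_idx + direction
    return enhanced
```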
And S50, determining the enhanced video corresponding to the video to be processed according to the enhanced image corresponding to the video frame in the video to be processed.
The enhanced images corresponding to the video frames in the video to be processed refer to the enhanced images respectively corresponding to the preset number of video frames obtained in step 40.
In this embodiment, after obtaining the enhanced images corresponding to the preset number of video frames, the video processing device sorts the preset number of video frames according to the identifier of each video frame to obtain a video segment composed of the preset number of video frames, and then replaces each video frame in the video segment with a corresponding enhanced image to obtain an enhanced video.
The video frame identifier may be a playing time, a sequence number, or the like.
Illustratively, the video to be processed includes 32 consecutive video frames, and the sequence numbers of the 32 video frames are 1 and 2 … 32 in sequence. The sequence number is used for representing the playing sequence of the N video frames in the video to be processed, and the smaller the sequence number is, the earlier the playing time is.
Assuming that the video frame to be processed is a video frame separated from the target video frame by one frame, and the target video frame obtained by the video processing device for the first time is the 16th video frame, then in the first loop the video processing device obtains the enhanced image of the 18th video frame through thread 1 and the enhanced image of the 14th video frame through thread 2. After a plurality of iterations, thread 1 obtains the enhanced images corresponding to the video frames with sequence numbers 18, 20, 22, …, 30 and 32, and thread 2 obtains the enhanced images corresponding to the video frames with sequence numbers 14, 12, 10, … 2; that is, the video processing device obtains the enhanced images corresponding to the 16 video frames with sequence numbers 2, 4, 6, … 30 and 32. These 16 video frames are then sorted by sequence number to obtain an intermediate video; finally, each video frame in the intermediate video is replaced with its corresponding enhanced image to obtain the enhanced video.
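Step S50 then reduces to sorting the obtained enhanced images by frame identifier and assembling them into the enhanced video; a minimal sketch, assuming the index-to-image dictionary produced by the loop sketch above:

```python
def assemble_enhanced_video(enhanced_by_index):
    """Sort the enhanced images by their frame identifier (sequence number)
    and return them as the frame sequence of the enhanced video."""
    return [enhanced_by_index[idx] for idx in sorted(enhanced_by_index)]
```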
In the video enhancement method provided by the embodiment of the application, the enhancement image corresponding to each to-be-processed video frame is obtained based on the target enhancement image of the target video frame which is separated from the to-be-processed video frame by the preset frame number, so that the pixel values of all pixel points on the enhancement images corresponding to the to-be-processed video frame and the target video frame are ensured to have relevance, and therefore, brightness jitter cannot be generated between the enhancement images corresponding to the to-be-processed video frame and the target video frame; because each group of adjacent video frames in the enhanced video correspond to the enhanced images corresponding to the group of video frames to be processed in the video to be processed and the target video frame, the pixel values of the pixel points on the adjacent video frames in the enhanced video are also related, and the brightness jitter is not generated.
Fig. 3 is a schematic flowchart of a process for obtaining a target enhanced image according to an embodiment of the present application, and describes one possible implementation manner of obtaining a target enhanced image corresponding to a target video frame in step 20 in fig. 1. As shown in fig. 3, performing image enhancement processing on a target video frame to obtain a target enhanced image corresponding to the target video frame includes:
s201, acquiring a brightness channel of the target video frame.
Video frames in the video to be processed acquired by the video acquisition device are usually RGB (Red, Green, Blue) color images and include three color channels: red R, green G and blue B. All three color channels contain luminance information, and their luminances differ; if the three channels are enhanced separately, the proportion of R, G and B is easily changed, causing color distortion in the resulting enhanced image. To avoid color distortion of the enhanced image, usually only the luminance channel of the target video frame is enhanced, so the luminance channel of the target video frame needs to be acquired first.
In this embodiment, the acquiring, by the video processing device, the luminance channel of the target video frame may include the following steps:
and D1, converting the target video frame from the RGB color space to the color space containing the brightness channel to obtain the converted target video frame.
In this step, the color space including the luminance channel may be an HSV color space, an YCbCr color space, or the like, and is not particularly limited herein.
In the HSV color space, H is Hue, representing chroma; S is Saturation; and V is Value, representing brightness. In the YCbCr color space, Y is the luminance component, Cb is the blue chrominance component, and Cr is the red chrominance component.
In this step, converting the target video frame from the RGB color space to a color space (which may be referred to as a target color space) including a luminance channel may refer to obtaining a parameter value corresponding to each pixel point in the target video frame in the target color space.
Illustratively, the target video frame comprises a plurality of pixel points, and the pixel value of each pixel point in the RGB color space can be represented as (R, G, B), wherein R, G, B is a value between 0 and 255. For example, the pixel point value of the pixel point f is (0, 100, 255).
Assuming that the target color space is an HSV color space, the corresponding parameter values (H, S, V) of each pixel point of the target video frame in the HSV color space may be sequentially determined.
Assuming the pixel value of a pixel point in the RGB color space is (R, G, B), the chroma value H of the pixel point can be computed with the following formula (3):
H = 0, if Δ = 0;
H = 60° × (((G' - B')/Δ) mod 6), if Cmax = R';
H = 60° × ((B' - R')/Δ + 2), if Cmax = G';
H = 60° × ((R' - G')/Δ + 4), if Cmax = B' (3)
where R', G' and B' are reference values obtained by normalizing the R, G and B values of the pixel point respectively: R' = R/255, G' = G/255, B' = B/255;
Cmax is the maximum of the reference values, Cmax = max(R', G', B');
Cmin is the minimum of the reference values, Cmin = min(R', G', B');
Δ=Cmax-Cmin。
the saturation value S of the pixel point can be referred to the following formula (4)
Wherein Cmax and Δ are the same as those explained in formula (3).
The brightness value V of the pixel point f can be referred to the following formula (5)
V=Cmax (5)
Wherein Cmax is the same as explained in formula (3).
The H value, S value and V value of each pixel point of the target video frame in the HSV color space are obtained based on formulas (3), (4) and (5), thereby converting the target video frame from the RGB color space to the HSV color space.
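For illustration only (this sketch is not part of the patented method, and the function and variable names are ours), the following Python code converts a single RGB pixel to its (H, S, V) parameter values according to formulas (3) to (5); a full frame would apply it to every pixel point, or use a vectorized library routine.

```python
def rgb_pixel_to_hsv(r, g, b):
    """Convert one RGB pixel (0-255 per channel) to (H, S, V) per formulas (3)-(5)."""
    r_, g_, b_ = r / 255.0, g / 255.0, b / 255.0   # normalized reference values R', G', B'
    cmax, cmin = max(r_, g_, b_), min(r_, g_, b_)
    delta = cmax - cmin
    # Chroma H, formula (3)
    if delta == 0:
        h = 0.0
    elif cmax == r_:
        h = 60.0 * (((g_ - b_) / delta) % 6)
    elif cmax == g_:
        h = 60.0 * ((b_ - r_) / delta + 2)
    else:  # cmax == b_
        h = 60.0 * ((r_ - g_) / delta + 4)
    # Saturation S, formula (4)
    s = 0.0 if cmax == 0 else delta / cmax
    # Brightness V, formula (5)
    v = cmax
    return h, s, v

# Example: the pixel point f = (0, 100, 255) mentioned above
print(rgb_pixel_to_hsv(0, 100, 255))   # roughly (216.5, 1.0, 1.0)
```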
D2, extracting the luminance channel of the target video frame from the converted target video frame.
In this step, the luminance channel of the converted target video frame may be used as the luminance channel of the target video frame.
For example, assuming that the converted target video frame is HSV color space, the converted target video frame includes three channels of chrominance H, saturation S and luminance V, and the luminance channel V is used as a luminance channel of the target video frame.
After obtaining the luminance channel of the target video frame, the video processing device may perform image enhancement processing on the luminance channel according to the image enhancement requirement of the target video frame to obtain an enhanced luminance channel. The enhanced luminance channel, together with the original chrominance channel H and the original saturation channel S of the target video frame, is then converted back to the RGB color space to obtain the enhanced image of the target video frame.
The method for performing image enhancement processing on the luminance channel of the target video frame may be a histogram equalization method, a wavelet-transform-based method, or the like. The video processing device may select different image enhancement processing modes according to the image enhancement requirement of the target video frame. An exemplary image enhancement method for removing artifacts is described below through steps S202 to S204.
S202, determining a plurality of illumination components of the target video frame according to the brightness channel of the target video frame.
As can be seen from fig. 2, the core idea of removing the artifacts in the target video frame is to estimate the illumination component of the luminance channel as accurately as possible, and perform enhancement processing on the luminance channel according to the estimated illumination component.
The illumination component is usually estimated by performing smoothing filtering on the original image; an illumination component estimated in this way retains more image detail but has low brightness. Therefore, the luminance channel can be processed in several different ways to obtain a plurality of illumination components of the luminance channel, and the plurality of illumination components are then fused so that the fused illumination component has both higher brightness and more image detail. The purpose of this step is to obtain the plurality of illumination components of the luminance channel.
In one possible implementation, the luminance channel is filtered to obtain the first illumination component L1 of the luminance channel.
For example, the luminance channel is subjected to guided filtering or Gaussian filtering.
Illustratively, the Gaussian function G(x, y) may be represented by formula (6):

G(x, y) = λ·exp(−(x² + y²)/σ²)    (6)

wherein x and y are the coordinate values of a pixel point on the luminance channel, λ is a constant, and σ is a scale parameter.
When σ is small, the obtained illumination component retains better edge details; the value of σ may be preset by the user. Optionally, the scale parameter σ is set to a small value so that the first illumination component L1 retains good edge details.
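A minimal sketch of this filtering step, assuming the luminance channel is a float array in [0, 1] and using OpenCV's Gaussian blur as the low-pass filter (the kernel size and the default σ below are illustrative choices, not values prescribed by the patent):

```python
import cv2
import numpy as np

def estimate_first_illumination(v_channel: np.ndarray, sigma: float = 3.0) -> np.ndarray:
    """Estimate the first illumination component L1 by Gaussian low-pass filtering.

    A small sigma keeps more edge detail in L1, as discussed above.
    """
    v = v_channel.astype(np.float32)
    # ksize=(0, 0) lets OpenCV derive the kernel size from sigma
    return cv2.GaussianBlur(v, (0, 0), sigmaX=sigma, sigmaY=sigma)
```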
In another possible implementation, in order to make the estimated illumination component have higher luminance, a variation operation may be performed on the luminance channel to obtain a second illumination component L2 of the luminance channel. The second illumination component estimated via the variational approach has a higher luminance but may lose some detail information.
In the variational operation, estimating the illumination component of the luminance channel is formulated as a quadratic-programming optimization problem, and after the objective function is set, the second illumination component L2 is estimated by gradient descent.
Optionally, the objective function of the variational operation is as shown in equation (7):
wherein α, β and γ are all weight coefficients, D is a difference operator, L is an illumination component of the luminance channel V, and R is a reflection component of the luminance channel V.
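Formula (7) itself is not reproduced in this text. Purely to illustrate the gradient-descent idea, the sketch below minimizes a simplified stand-in objective E(L) = ||L − V||² + α·||∇L||², which keeps only a data term and a smoothness term and is our assumption, not the patent's actual objective function:

```python
import numpy as np

def estimate_second_illumination(v_channel: np.ndarray, alpha: float = 2.0,
                                 iters: int = 200) -> np.ndarray:
    """Gradient-descent sketch for a simplified stand-in variational objective.

    Minimizes E(L) = ||L - V||^2 + alpha * ||grad L||^2; the gradient of the
    smoothness term is -2 * alpha * Laplacian(L). This is not the patent's formula (7).
    """
    V = v_channel.astype(np.float64)
    L = V.copy()
    step = 1.0 / (2.0 + 16.0 * alpha)   # small enough for stable descent
    for _ in range(iters):
        padded = np.pad(L, 1, mode="edge")
        lap = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:] - 4.0 * L)
        grad = 2.0 * (L - V) - 2.0 * alpha * lap
        L -= step * grad
        L = np.maximum(L, V)            # keep the illumination no darker than the luminance
    return L
```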
S203, determining fusion illumination components corresponding to the target video frame according to the plurality of illumination components of the target video frame.
The first illumination component L1 obtained in step 202 has better edge detail information, and the obtained second illumination component L2 has higher brightness, and the purpose of this step is to perform fusion processing on the first illumination component L1 and the second illumination component L2 to obtain a fused illumination component, so that the obtained fused illumination component has both better detail information and higher brightness.
It should be understood that if more than two illumination components are obtained in step 202, the more than two illumination components may be subjected to a fusion process to obtain a fused illumination component of the target video frame.
In this embodiment, determining the fusion illumination component corresponding to the target video frame according to the plurality of illumination components of the target video frame may refer to obtaining respective gradients corresponding to the first illumination component L1 and the second illumination component L2, respectively, taking the gradients as weight values, and performing fusion processing on the first illumination component L1 and the second illumination component L2 to obtain the fusion illumination component corresponding to the target video frame.
The gradient of the first illumination component L1 may be an average gradient of the first illumination component L1, or a gradient of each pixel point on the first illumination component L1. The gradient of the second illumination component L2 may refer to an average gradient of the second illumination component L2, or may be a gradient of each pixel point on the second illumination component L2.
Illustratively, suppose the gradients of L1 and L2 are taken as per-pixel gradients.
Taking a pixel point O as an example, the gradient of the pixel point O on L1 is J and its pixel value is f1; the gradient of the pixel point O on L2 is K and its pixel value is f2. The fused pixel value f3 of the pixel point is:
f3=f1×J+f2×K (8)
Each pixel point is processed according to formula (8) to obtain its fused pixel value. The pixel value of each pixel point on the first illumination component (or the second illumination component) is then replaced by its fused pixel value, giving the fused illumination component corresponding to the target video frame.
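A sketch of this per-pixel fusion, assuming the gradients J and K in formula (8) are per-pixel gradient magnitudes computed with np.gradient (the patent does not fix a particular gradient operator):

```python
import numpy as np

def fuse_per_pixel(l1: np.ndarray, l2: np.ndarray) -> np.ndarray:
    """Fuse two illumination components pixel by pixel following formula (8)."""
    gy1, gx1 = np.gradient(l1.astype(np.float64))
    gy2, gx2 = np.gradient(l2.astype(np.float64))
    j = np.hypot(gx1, gy1)          # gradient magnitude of L1 at each pixel
    k = np.hypot(gx2, gy2)          # gradient magnitude of L2 at each pixel
    return l1 * j + l2 * k          # f3 = f1*J + f2*K, formula (8)
```

In practice the two per-pixel weights are often normalized so that J + K = 1 at every pixel to keep the fused brightness in range; formula (8) is applied here as written.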
And S204, processing the target video frame according to the fusion illumination component to generate a target enhanced image corresponding to the target video frame.
After the fused illumination component is obtained, the reflection component of the target video frame is determined based on the fused illumination component and the luminance channel of the target video frame, and the reflection component is used as the enhanced luminance channel of the target video frame. It should be noted that, since the fused illumination component has both better detail information and higher brightness, the reflection component determined from it can accurately represent the characteristics of the objects in the video frame, thereby enhancing the luminance channel of the target video frame.
Since the color space where the target video frame and the target enhanced image are located needs to be kept unchanged, after the enhanced luminance channel of the target video frame is obtained, the target enhanced image in the same color space as the target video frame needs to be obtained through conversion based on the enhanced luminance channel and other channels in the color space where the enhanced luminance channel is located.
For example, if the target video frame is an image in the RGB color space, the color space where the enhanced luminance channel is located is the HSV color space, and the enhanced luminance channel is the luminance channel V of that HSV color space, the video processing device further needs to obtain the chrominance channel H and the saturation channel S of the target video frame, convert the enhanced luminance channel V, the chrominance channel H and the saturation channel S into the corresponding RGB color space to generate a color image, and use the color image as the target enhanced image corresponding to the target video frame.
For a clearer description of this embodiment, please refer to fig. 4, which is a schematic diagram of generating the target enhanced image. As shown in fig. 4, the target video frame is a color image in the RGB color space and may be represented as f(R, G, B). First, the target video frame is converted from the RGB color space to the HSV color space to obtain a chrominance channel H, a saturation channel S and a luminance channel V; the luminance channel V is enhanced to obtain an enhanced luminance channel V'; then the chrominance channel H, the saturation channel S and the enhanced luminance channel V' are converted into the RGB color space to obtain R1, G1 and B1, where R1 is the red channel, G1 is the green channel and B1 is the blue channel, and R1, G1 and B1 are combined to generate the target enhanced image f(R1, G1, B1).
The process of converting the chrominance channel H, the saturation channel S and the luminance channel V' into the RGB color space to obtain R1, G1 and B1 is the inverse of the conversion defined by formulas (3), (4) and (5).
In this embodiment, for the application scenario of removing artifacts from the target video frame, the luminance channel is processed in multiple ways (two ways here) to estimate multiple illumination components, and the multiple illumination components are fused to obtain a fused illumination component. The fused illumination component has both better detail information and higher brightness, so the reflection component determined from it can accurately reflect the characteristics of the objects in the scene, thereby enhancing the target video frame.
Fig. 5 is a schematic flowchart of a process of determining a fused illumination component of a target video frame according to an embodiment of the present application, and describes a possible implementation manner of the fusion process performed on two illumination components in step S203 in fig. 3. As shown in fig. 5, determining a fused illumination component corresponding to a target video frame according to a plurality of illumination components of the target video frame includes:
S2031, determining a first gradient of the first illumination component and a second gradient of the second illumination component.
Here, the first illumination component L1 may be obtained by performing a smoothing filtering process on the luminance channel. The second illumination component L2 may be obtained by performing a variation operation on the luminance channel.
In this embodiment, the gradient of the first illumination component L1 is an average gradient of all the pixel points in the first illumination component; the gradient of the second illumination component is the average gradient of all the pixel points in the second illumination component.
For example, the average gradient ḡ1 of the first illumination component L1 can be calculated by formula (9):

ḡ1 = (1/((M−1)(N−1))) · Σi Σj sqrt( ((F(i+1, j) − F(i, j))² + (F(i, j+1) − F(i, j))²) / 2 )    (9)

wherein M and N are the numbers of pixel points of the first illumination component in the two coordinate directions, and F(i, j) is the brightness value of the pixel point (i, j) in the first illumination component.
The average gradient ḡ2 of the second illumination component is calculated in the same way, as shown in formula (10):

ḡ2 = (1/((M−1)(N−1))) · Σi Σj sqrt( ((f(i+1, j) − f(i, j))² + (f(i, j+1) − f(i, j))²) / 2 )    (10)

wherein f(i, j) is the brightness value of the pixel point (i, j) in the second illumination component.
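A sketch of the average-gradient computation used in formulas (9) and (10); the forward-difference form below is one common discretization and is an assumption on our part:

```python
import numpy as np

def average_gradient(img: np.ndarray) -> float:
    """Average gradient of an illumination component (forward differences)."""
    f = img.astype(np.float64)
    dx = f[1:, :-1] - f[:-1, :-1]     # difference along the first axis
    dy = f[:-1, 1:] - f[:-1, :-1]     # difference along the second axis
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```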
S2032, based on the first gradient and the second gradient, fusing the first illumination component and the second illumination component to obtain a fused illumination component.
In this embodiment, the video processing device may perform fusion processing on the first illumination component and the second illumination component based on a weighted average method, so that the fusion illumination component may provide more target information, so as to implement enhancement processing on the luminance channel according to the fusion illumination component, and further implement enhancement processing on the target video frame according to the enhanced luminance channel.
Illustratively, the video processing device first determines a weight of the first illumination component and a weight of the second illumination component from the first gradient and the second gradient.
Wherein the weight of the first illumination component and the weight of the second illumination component are determined from the first gradient and the second gradient according to formulas (11) and (12), respectively:

G_GF = ḡ1 / (ḡ1 + ḡ2)    (11)

G_VF = ḡ2 / (ḡ1 + ḡ2)    (12)

wherein G_GF is the weight of the first illumination component and ḡ1 is the average gradient of the first illumination component (the first gradient); G_VF is the weight of the second illumination component and ḡ2 is the average gradient of the second illumination component (the second gradient).
After obtaining the weight of the first illumination component and the weight of the second illumination component, the video processing device performs weighted summation on the first illumination component and the second illumination component according to the weight of the first illumination component and the weight of the second illumination component to obtain a fusion illumination component.
The first illumination component and the second illumination component are weighted and summed according to their weights to obtain the fused illumination component, as shown in formula (13):

L3 = L1 × G_GF + L2 × G_VF    (13)

wherein L1 is the first illumination component, L2 is the second illumination component, L3 is the fused illumination component, G_GF is the weight of the first illumination component, and G_VF is the weight of the second illumination component.
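A sketch combining formulas (11) to (13), reusing the average_gradient helper from the previous sketch and assuming gradient-normalized weights (the weights then sum to 1, as in a weighted average):

```python
def fuse_by_average_gradient(l1, l2):
    """Weighted-average fusion of two illumination components, formulas (11)-(13)."""
    g1 = average_gradient(l1)          # first gradient, formula (9)
    g2 = average_gradient(l2)          # second gradient, formula (10)
    w1 = g1 / (g1 + g2)                # weight of L1, formula (11)
    w2 = g2 / (g1 + g2)                # weight of L2, formula (12)
    return l1 * w1 + l2 * w2           # L3 = L1*G_GF + L2*G_VF, formula (13)
```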
When the video to be processed is shot under a low-illumination condition such as at night, artifacts appear in the video where the illumination is uneven, and at the same time each video frame is dark due to the low illumination, which is unfavorable for later processing. If the enhanced image of each video frame were obtained only by removing the illumination component, the artifact problem could be solved well, but the processed image would become even darker. Therefore, after the reflection component is obtained, the illumination component can be corrected (eliminating the influence of uneven illumination) to obtain a corrected illumination component, and the final enhanced image is determined from the corrected illumination component and the reflection component obtained in the embodiment of fig. 3. Performing illumination compensation on the enhanced image through the corrected illumination component can simultaneously solve the problems of artifacts and low brightness in a video shot under low-illumination conditions. This is described exemplarily below through the embodiment shown in fig. 6.
Fig. 6 is a schematic flow chart for generating a target enhanced image according to another embodiment of the present application, and describes one possible implementation manner of step S204 in the embodiment of fig. 3. As shown in fig. 6, processing the target video frame according to the fused illumination component to generate a target enhanced image corresponding to the target video frame includes:
S2041, determining a reflection component of the brightness channel according to the fusion illumination component and the brightness channel.
The fused illumination component in this step may be the fused illumination component L3 in the embodiment shown in fig. 5.
The luminance channel in this step is the luminance channel obtained in step S201 in the embodiment of fig. 3.
Taking the luminance channel as the original image S and the fused illumination component as the illumination component, the reflection component of the luminance channel is determined based on formula (1).
And S2042, correcting the reflection component to generate a corrected reflection component.
When the video to be processed is shot under a low-illumination condition such as at night, the target video frame is often underexposed and the image contrast is low; in particular, the contrast of the reflection component of the target video frame is low, so the reflection component is corrected to improve its contrast.
Illustratively, the method of the correction process may be gamma correction.
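A minimal gamma-correction sketch for the reflection component, assuming the reflection component has been normalized to [0, 1]; the gamma value is illustrative:

```python
import numpy as np

def correct_reflection(reflection: np.ndarray, gamma: float = 0.8) -> np.ndarray:
    """Gamma correction of the reflection component (values assumed in [0, 1])."""
    r = np.clip(reflection.astype(np.float64), 0.0, 1.0)
    return np.power(r, gamma)   # gamma < 1 brightens dark regions and lifts their contrast
```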
And S2043, performing nonlinear stretching on the fusion illumination component to generate a stretched fusion illumination component.
The purpose of this step is to non-linearly stretch the fused illumination component. Optionally, the gray scale of the fusion illumination component can be graded through two preset gray scale thresholds to obtain three gray scale levels of a low gray scale level, a medium gray scale level and a high gray scale level; and then, gray stretching is respectively carried out on the three gray levels, and different stretching parameters (namely, nonlinear stretching) are set so as to effectively enhance the brightness in a low gray level (namely, low brightness) area and reduce the influence of uneven illumination.
Illustratively, the two preset grayscale thresholds are a first grayscale threshold and a second grayscale threshold, where the second grayscale threshold is greater than the first grayscale threshold and smaller than the maximum gray value of the fused illumination component, and the first grayscale threshold is greater than the minimum gray value of the fused illumination component. Gray values smaller than the first grayscale threshold in the fused illumination component are taken as the low gray level, gray values greater than or equal to the first grayscale threshold and smaller than the second grayscale threshold as the medium gray level, and gray values greater than or equal to the second grayscale threshold as the high gray level. Alternatively, the gray range of the fused illumination component may be trisected by setting the first grayscale threshold and the second grayscale threshold.
In this step, setting different stretching parameters for different gray levels can be realized by an arctan function.
For example, the expression for the fused illumination component non-linear stretch can be seen in equation (14):
wherein a and b are constants, L3 denotes the fused illumination component, and Lfinal denotes the stretched fused illumination component. The arctan curve is a non-linear curve that stretches regions of low pixel values (low gray levels) so as to effectively enhance the brightness of low-brightness regions.
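Since formula (14) is not reproduced in this text, the sketch below assumes the simple form Lfinal = a·arctan(b·L3), which matches the stated roles of the constants a and b and of the arctan curve; the piecewise choice of stretch parameters for the low, medium and high gray levels described above is omitted for brevity:

```python
import numpy as np

def stretch_illumination(l3: np.ndarray, a: float = 2.0 / np.pi, b: float = 8.0) -> np.ndarray:
    """Non-linear (arctan) stretch of the fused illumination component.

    With a = 2/pi the output stays within [0, 1) for non-negative input;
    a larger b stretches the low-gray-level region more strongly.
    """
    return a * np.arctan(b * l3.astype(np.float64))
```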
S2044, obtaining an enhanced brightness channel according to the stretched fusion illumination component and the corrected reflection component.
After the stretched fused illumination component and the corrected reflection component are obtained, a luminance channel can be determined from them. It should be noted that the stretched fused illumination component reduces the influence of uneven illumination and the corrected reflection component enhances the image contrast, so that, compared with the previous luminance channel, this luminance channel can simultaneously solve the problems of artifacts and underexposure in a video shot under low-illumination conditions; for convenience of description, this luminance channel is referred to as the enhanced luminance channel in this embodiment. For example, in one implementation, the enhanced luminance channel may be obtained by combining the stretched fused illumination component and the corrected reflection component according to formula (1) to obtain an adjusted luminance channel, which is used as the enhanced luminance channel; of course, the enhanced luminance channel may also be determined from the stretched fused illumination component and the corrected reflection component in other ways.
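Assuming formula (1) is the Retinex decomposition S = L × R (the observed luminance is the product of illumination and reflection), a sketch of recombining the stretched fused illumination component with the corrected reflection component into the enhanced luminance channel:

```python
import numpy as np

def enhanced_luminance(stretched_l: np.ndarray, corrected_r: np.ndarray) -> np.ndarray:
    """Enhanced luminance channel V' = L_final * R_corrected, per formula (1)."""
    v = stretched_l.astype(np.float64) * corrected_r.astype(np.float64)
    return np.clip(v, 0.0, 1.0)
```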
And S2045, processing the target video frame according to the enhanced brightness channel to generate a target enhanced image corresponding to the target video frame.
Since the color space where the target video frame and the target enhanced image are located needs to be kept unchanged, after the enhanced luminance channel of the target video frame is obtained, the target enhanced image in the same color space as the target video frame needs to be obtained through conversion based on the enhanced luminance channel and other channels in the color space where the enhanced luminance channel is located.
For example, the video frame in the video to be processed acquired by the video acquisition device is an RGB (Red, Green, Blue) color image, that is, the target video frame is generally an RGB color image, and therefore the target enhanced image is also an RGB color image.
Optionally, if the color space where the enhanced luminance channel is located is the HSV color space and the enhanced luminance channel is the luminance channel V of that color space, a saturation channel S and a chrominance channel H of the target video frame need to be further acquired; an RGB color image is then generated by conversion from the enhanced luminance channel, the saturation channel and the chrominance channel, and is taken as the target enhanced image.
For example, if the target enhanced image is in the RGB color space and the color space where the enhanced luminance channel is located is the HSV color space, generating the target enhanced image corresponding to the target video frame may include:
E1, acquiring a saturation channel and a chrominance channel of the target video frame.
The step of obtaining the saturation channel and the chrominance channel of the target video frame may specifically refer to the related description of step S204 in the embodiment of fig. 3, which is not repeated herein.
E2, generating an RGB color image according to the enhanced luminance channel, the saturation channel and the chrominance channel.
In this step, generating an RGB color image according to the enhanced luminance channel, the saturation channel and the chrominance channel may mean converting each pixel point from the HSV color space to the RGB color space, so that the pixel value of each pixel point is expressed by an R value, a G value and a B value, each between 0 and 255.
The process of converting each pixel point from the HSV color space to the RGB color space is the inverse of the conversion defined by formulas (3), (4) and (5).
E3, taking the RGB color image as the target enhanced image corresponding to the target video frame.
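A sketch of steps E1 to E3, using OpenCV's built-in HSV-to-RGB conversion instead of writing out the inverse of formulas (3) to (5) by hand; it assumes H is in degrees [0, 360) and that S and the enhanced V' are floats in [0, 1]:

```python
import cv2
import numpy as np

def hsv_to_rgb_image(h: np.ndarray, s: np.ndarray, v_enhanced: np.ndarray) -> np.ndarray:
    """Combine chrominance H, saturation S and the enhanced luminance V' into an RGB image."""
    hsv = np.stack([h, s, v_enhanced], axis=-1).astype(np.float32)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)             # float output in [0, 1]
    return np.clip(rgb * 255.0, 0, 255).astype(np.uint8)   # back to 0-255 per channel
```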
In practical applications, before the enhanced luminance channel, the saturation channel and the chrominance channel are converted to generate the RGB color image in step 2045, the enhanced luminance channel may be first preprocessed, so that the enhanced luminance channel can represent the real luminance of the target video frame more clearly. This is exemplarily described below by the embodiment shown in fig. 7.
Fig. 7 is a schematic flowchart of a process for generating a target enhanced image according to yet another embodiment of the present application, and illustrates one possible implementation manner of generating an RGB color image according to an enhanced luminance channel, a saturation channel, and a chrominance channel in step S2045 in the embodiment of fig. 6. As shown in fig. 7, generating an RGB color image according to the enhanced luminance channel, the saturation channel, and the chrominance channel may refer to:
S2061, dividing the enhanced luminance channel into a plurality of image blocks.
In this step, the video processing device may equally divide the enhanced luminance channel into a plurality of image blocks according to its image size.
Illustratively, the enhanced luminance channel is an image of size 64 × 64, and the luminance channel is divided into 64 8 × 8 image blocks.
S2062, performing contrast enhancement on the luminance of each image block to obtain an enhanced image block corresponding to the image block.
The goal in this step is to enhance the contrast of each image block.
The contrast enhancement processing on the brightness of the image block may refer to that, for each image block, the video processing device performs statistics to obtain a gray distribution histogram of the image block, and determines a mapping function corresponding to the image block based on the gray distribution histogram; and carrying out gray scale transformation on the image block based on the mapping function to obtain an enhanced image block corresponding to the image block so as to achieve the purpose of improving the contrast of the image block.
Optionally, in order to avoid an excessive increase in contrast of each image block, the contrast of the enhanced image block corresponding to each image block needs to be limited.
For example, after the gray distribution histogram of each image block is obtained through statistics, the gray distribution histogram obtained through statistics in each image block is cut according to a preset upper limit value, the cut gray values are uniformly distributed in the whole gray interval of the image block to obtain an updated gray distribution histogram, and then a corresponding mapping function is determined according to the updated gray distribution histogram.
S2063, splicing the adjacent enhanced image blocks to obtain an updated enhanced brightness channel.
In this embodiment, the local contrast of the enhanced luminance channel can be improved by processing it block by block; however, if the pixel points of each image block are transformed only by the mapping function of that image block, the processed enhanced luminance channel may exhibit blocking artifacts (for example, abrupt luminance changes at block boundaries). The purpose of this step is to avoid such blocking artifacts in the processed enhanced luminance channel.
Alternatively, adjacent enhanced image blocks may be spliced based on an interpolation operation to obtain an updated enhanced luminance channel.
For example, for a pixel point J on an image block, four image blocks adjacent to the image block to which the pixel point J belongs in the left-right direction and in the up-down direction may be obtained first, mapping functions corresponding to the four image blocks respectively are determined, the gray value of the pixel point is transformed based on the four mapping functions to obtain four mapping values, and then bilinear interpolation is performed on the four mapping values to obtain the updated gray value of the pixel point J. And repeating the process until the updated gray values of all the pixel points on the image block are obtained.
It should be understood that, for a pixel point of an image block on the boundary of the enhanced luminance channel, the updated gray value of the pixel point can be obtained only by splicing two or three image blocks adjacent to the image block where the pixel point is located. For example, for a corner point of the enhanced luminance channel, a mapping function of two image blocks adjacent to the image block where the corner point is located may be directly used for transformation. The corner point may be any one of an upper left corner, a lower left corner, an upper right corner and a lower right corner.
S2064, according to the updated enhanced brightness channel, the saturation channel and the chroma channel, the RGB color image is generated.
In this step, the updated enhanced luminance channel may be used as the enhanced luminance channel in step 2045 in the embodiment of fig. 6, and then an RGB color image is obtained based on the implementation manner in step 2045 in the embodiment of fig. 6.
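The block-wise contrast enhancement with histogram clipping and interpolation-based splicing described in steps S2061 to S2063 closely matches contrast-limited adaptive histogram equalization (CLAHE); the sketch below uses OpenCV's implementation as one possible realization (the tile size, clip limit and 8-bit conversion are our assumptions):

```python
import cv2
import numpy as np

def enhance_luminance_blocks(v_enhanced: np.ndarray, clip_limit: float = 2.0,
                             tile: int = 8) -> np.ndarray:
    """Block-wise, clip-limited contrast enhancement of the enhanced luminance channel."""
    v8 = np.clip(v_enhanced * 255.0, 0, 255).astype(np.uint8)   # CLAHE expects 8-bit input
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tile, tile))
    return clahe.apply(v8).astype(np.float64) / 255.0            # back to [0, 1]
```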
Fig. 8 is a schematic flowchart of a process of determining an enhanced image corresponding to a video frame to be processed according to an embodiment of the present application, and describes a possible implementation manner of determining an enhanced image corresponding to a video frame to be processed according to a target enhanced image in step 30 in the embodiment of fig. 1. As shown in fig. 8, determining an enhanced image corresponding to a video frame to be processed according to a target enhanced image includes:
S301, determining a stationary point set and a moving point set of the pixel points of the video frame to be processed relative to the target video frame.
The purpose of the step is to determine the immobile point set and the moving point set relative to the target video frame (between adjacent video frames) in the pixel points of the video frame to be processed, so as to subsequently and respectively adopt different pixel value assignment methods for the pixel points in the immobile point set and the moving point set.
Optionally, determining the stationary point set and the moving point set of the pixel points of the video frame to be processed relative to the target video frame may mean: for each pixel point of the video frame to be processed, taking the luminance change rate of the pixel point relative to the pixel point at the same coordinate position in the target video frame as the velocity vector of the pixel point; determining the pixel points whose velocity vectors are greater than a preset value as moving points, and forming the moving point set from all the moving points; and determining the pixel points whose velocity vectors are less than or equal to the preset value as stationary points, and forming the stationary point set from all the stationary points.
The velocity vector of the pixel point may include a velocity vector u in an X direction and a velocity vector v in a Y direction on an image plane where the video frame to be processed is located, where the X direction and the Y direction are directions of coordinate axes of the image plane where the video frame to be processed is located.
Alternatively, the velocity vector of the pixel point may be represented as (u, v), and the comparison between the velocity vector and the preset value may be a comparison between a vector value of the velocity vector and the preset value. Wherein, the vector value of the velocity vector can be characterized as the motion velocity of the pixel.
For example, determining a stationary point set and a moving point set of a pixel point of a video frame to be processed relative to a target video frame may include the following steps:
step 1, aiming at each pixel point in a video frame to be processed, obtaining a velocity vector of the pixel point, and determining the motion velocity of the pixel point according to the velocity vector of the pixel point.
The velocity vector comprises a velocity vector u in the X direction and a velocity vector v in the Y direction on the image plane of the video frame to be processed, wherein the X direction and the Y direction are directions of coordinate axes of the pixel points.
For example, assuming that the target video frame is a first video frame of the video to be processed and the video frame to be processed is a second video frame of the video to be processed, the speed vector of each pixel point in the second frame image relative to the first frame image may be obtained as follows:
and aiming at each pixel point, respectively obtaining the pixel values of the pixel point on the second video frame and the first video frame, carrying out space-time differential processing on the difference value of the two pixel values to obtain the change rate of the pixel value in the X direction and the change rate of the pixel value in the Y direction, taking the change rate of the pixel value in the X direction as a velocity vector u, and taking the change rate of the pixel value in the Y direction as a velocity vector v.
After obtaining the velocity component u of each pixel point in the X direction and the velocity component v in the Y direction, the motion speed of each pixel point can be calculated by formula (15):

S(x, y) = sqrt(u² + v²)    (15)

wherein S(x, y) is the motion speed of the pixel point, u is the velocity component of the pixel point in the X direction, v is the velocity component of the pixel point in the Y direction, and (x, y) are the coordinates of the pixel point.
And 2, determining pixel points with the motion speed larger than a preset value as motion points, and generating a motion point set according to all the motion points.
And 3, determining pixel points with the motion speed less than or equal to a preset value as immobile points, and generating an immobile point set according to all the immobile points.
The preset value may be preset by the user or by the system. After step 2 and step 3, a stationary point set FP formed by all stationary points and a moving point set DP formed by all moving points are obtained.
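A sketch of steps 1 to 3, using dense Farneback optical flow as one way of obtaining the per-pixel velocity vectors (u, v) between the target video frame and the video frame to be processed; the flow algorithm, its parameters and the threshold below are assumptions, not values prescribed by the patent:

```python
import cv2
import numpy as np

def split_static_and_moving(prev_gray: np.ndarray, curr_gray: np.ndarray,
                            threshold: float = 1.0):
    """Return boolean masks of stationary points (FP) and moving points (DP), plus the flow."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    u, v = flow[..., 0], flow[..., 1]
    speed = np.sqrt(u ** 2 + v ** 2)        # motion speed, formula (15)
    moving = speed > threshold              # moving point set DP
    stationary = ~moving                    # stationary point set FP
    return stationary, moving, flow
```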
S302, aiming at each immobile point in the immobile point set, taking the pixel value of the pixel point corresponding to the immobile point in the target enhanced image as the pixel value of the immobile point.
The pixel points corresponding to the fixed points in the target enhanced image refer to the pixel points with the same coordinates as the fixed points in the video frame to be processed.
For example, assuming that the coordinates of the stationary point FP1 in the video frame to be processed are (Fx1, Fy1), the pixel point corresponding to the stationary point FP1 is a pixel point with coordinates (Fx1, Fy1) in the target enhanced image.
In this step, using the pixel value of the pixel point corresponding to the stationary point in the target enhanced image as the pixel value of the stationary point may mean determining the pixel point in the target enhanced image that has the same coordinates as the stationary point, and using the pixel value of that pixel point as the pixel value of the stationary point.
For example, denote the target enhanced image corresponding to the target video frame as It, and denote the video frame to be processed as It+1. The pixel value of a stationary point P in the enhanced image I′t+1 of the video frame to be processed can then be obtained by formula (16):

I′t+1(P) = It(P)    (16)

wherein I′t+1(P) is the pixel value of the stationary point P in the enhanced image of the video frame to be processed, and It(P) is the pixel value of the pixel point at the same position as P in the target enhanced image It.

Illustratively, the target video frame is the first video frame of the video to be processed, and the video frame to be processed is the second video frame. If point P is a stationary point of the second video frame relative to the first video frame, the pixel value of point P in the enhanced image of the second video frame is the same as the pixel value of the pixel point at the same position as P in the enhanced image of the first video frame.
S303, aiming at each motion point in the motion point set, determining a corresponding pixel point of the motion point in the target enhanced image, and determining the pixel value of the motion point according to the pixel values of a plurality of pixel points in a preset area where the corresponding pixel point is located.
In this embodiment, the corresponding pixel point of the motion point Q in the target enhanced image may refer to the corresponding pixel point of the motion point in the target video frame.
Optionally, the video frame to be processed and the target video frame are adjacent frames, and the coordinate of the corresponding pixel point Q' of the motion point in the target video frame may be determined according to the velocity vector of the motion point Q and the coordinate of the motion point in the video frame to be processed.
The motion speed of the motion point can be obtained by referring to formula (15).
Illustratively, if the coordinates of the motion point Q in the video frame to be processed are (x, y) and its velocity vector is (ux, vy), then the coordinates of the pixel point Q' corresponding to the motion point Q in the target video frame are (x + ux×Δt, y + vy×Δt); that is, the coordinates of the pixel point Q' corresponding to the motion point in the target enhanced image are (x + ux×Δt, y + vy×Δt).
Wherein Δ t is a time interval between the video frame to be processed and the target video frame.
In this embodiment, the determining of the pixel value of the motion point according to the pixel values of the plurality of pixel points in the preset region where the corresponding pixel point is located may specifically refer to determining the plurality of pixel points in the preset region where the corresponding pixel point is located, determining an average value of the pixel values of the plurality of pixel points, and taking the average value as the pixel value of the motion point.
The size of the preset area can be adjusted according to the image enhancement effect.
Illustratively, let I′t+1(x, y) be the pixel value of the motion point Q, and let It(·) denote pixel values in the target enhanced image corresponding to the target video frame. The value of I′t+1(x, y) can then be obtained by formula (17):

I′t+1(x, y) = (1/2n) × Σ(i, j) It(x + ux×Δt + i, y + vy×Δt + j)    (17)

where the sum runs over the offsets (i, j) covering the 2n pixel points of the preset area; (x, y) are the coordinates of the pixel point Q, (ux, vy) is the velocity vector of the pixel point Q, (x + ux×Δt, y + vy×Δt) are the coordinates of the pixel point in the target video frame It corresponding to the pixel point Q, (x + ux×Δt + i, y + vy×Δt + j) are the coordinates of the pixel points in the preset area where the corresponding pixel point is located, 2n is the number of pixel points in the preset area, and Δt is the time interval between the video frame to be processed and the target video frame.

For example, if n is 2, then 2n is 4; that is, the pixel values of four pixel points in the preset area where the corresponding pixel point is located are averaged to determine the value of I′t+1(x, y).
S304, obtaining an enhanced image of the video frame to be processed according to the pixel values of all the motionless points in the motionless point set and the pixel values of all the moving points in the moving point set.
In this step, the pixel values of each pixel point on the video frame to be processed are adjusted according to the pixel values of all the stationary points in the stationary point set and the pixel values of all the moving points in the moving point set, so as to obtain an enhanced image of the video frame to be processed.
For example, for each pixel point in a video frame to be processed, if the pixel point belongs to an immobile point set, the pixel value of the pixel point is adjusted to the pixel value of an immobile point in the immobile point set, which has the same coordinate as the pixel point; if the pixel point belongs to the motion point set, the pixel value of the pixel point is adjusted to be the pixel value of the motion point with the same coordinate as the pixel point in the motion point set.
Illustratively, the video frame to be processed includes M pixel points whose coordinates are f1, f2, …, fM, and the M pixel points are divided into the stationary point set FP and the moving point set DP, where the pixel value of each stationary point in FP and the pixel value of each moving point in DP have been determined in advance. Then, for each pixel point fk in the video frame to be processed, where 1 ≤ k ≤ M: if the pixel point belongs to the stationary point set FP, the pixel value of the pixel point fk is changed to the pixel value of the stationary point in FP whose coordinates are fk; if the pixel point belongs to the moving point set DP, the pixel value of the pixel point fk is changed to the pixel value of the moving point in DP whose coordinates are fk. Performing this adjustment on all M pixel points of the video frame to be processed yields an adjusted video frame, which is taken as the enhanced image of the video frame to be processed.
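A sketch of steps S302 to S304 combined, propagating pixel values from the target enhanced image to the enhanced image of the frame to be processed; it reuses the masks and flow from the previous sketch, assumes a single-channel (luminance) enhanced image, takes Δt as one frame, and uses a small square window as the "preset area" — all of which are our assumptions:

```python
import numpy as np

def propagate_enhanced(prev_enhanced: np.ndarray, stationary: np.ndarray,
                       flow: np.ndarray, radius: int = 1) -> np.ndarray:
    """Build the enhanced image of the frame to be processed from the target enhanced image."""
    h, w = stationary.shape
    out = np.zeros_like(prev_enhanced, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            if stationary[y, x]:
                out[y, x] = prev_enhanced[y, x]          # stationary point, formula (16)
            else:
                u, v = flow[y, x]                         # velocity vector of the moving point
                cx = int(round(np.clip(x + u, 0, w - 1))) # corresponding pixel in the target
                cy = int(round(np.clip(y + v, 0, h - 1))) # enhanced image (delta t = 1 frame)
                y0, y1 = max(cy - radius, 0), min(cy + radius + 1, h)
                x0, x1 = max(cx - radius, 0), min(cx + radius + 1, w)
                out[y, x] = prev_enhanced[y0:y1, x0:x1].mean()   # neighborhood average, formula (17)
    return out
```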
According to the video enhancement method provided by the embodiments of the present application, the pixel points of the video frame to be processed are divided, relative to the target video frame, into a stationary point set and a moving point set; the pixel value of each stationary point in the stationary point set is the same as the pixel value of the corresponding pixel point in the target enhanced image, and the pixel value of each moving point in the moving point set is obtained from the pixel values of a plurality of pixel points in the preset area where its corresponding pixel point in the target enhanced image is located. This preserves the correlation of the pixel values of the pixel points, thereby ensuring the spatio-temporal consistency of adjacent video frames and effectively avoiding brightness jitter between adjacent video frames after image enhancement.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Based on the video enhancement method provided by the above embodiment, the embodiment of the present invention further provides an embodiment of an apparatus for implementing the above method embodiment.
Fig. 9 is a schematic structural diagram of a video enhancement apparatus according to an embodiment of the present application. Each unit is included to execute each step in the embodiments corresponding to fig. 1, fig. 3, and fig. 5 to fig. 8, please refer to the related description in each corresponding embodiment of fig. 1, fig. 3, and fig. 5 to fig. 8. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 9, the video enhancement apparatus 60 includes an acquisition module 601, a first enhancement module 602, a first determination module 603, an execution module 604, and a second determination module 605.
The obtaining module 601 is configured to obtain a target video frame in a video to be processed.
The first enhancement module 602 is configured to perform image enhancement processing on a target video frame to obtain a target enhanced image corresponding to the target video frame.
A first determining module 603, configured to determine, according to the target enhanced image, an enhanced image corresponding to the video frame to be processed; the video frame to be processed is a video frame to be subjected to image enhancement processing determined according to the target video frame.
The executing module 604 is configured to use the video frame to be processed as a target video frame, use the enhanced image corresponding to the video frame to be processed as a target enhanced image, and return to the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image, until enhanced images respectively corresponding to a preset number of video frames in the video to be processed are obtained.
The second determining module 605 is configured to determine, according to the enhanced image of the video frame in the video to be processed, an enhanced video corresponding to the video to be processed.
Optionally, the video frame to be processed is a video frame adjacent to the target video frame.
Optionally, the first enhancing module 602 performs image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame, including:
acquiring a brightness channel of a target video frame;
determining a plurality of illumination components of the target video frame according to the brightness channel of the target video frame;
determining fusion illumination components corresponding to the target video frame according to the plurality of illumination components of the target video frame;
and processing the target video frame according to the fusion illumination component to generate a target enhanced image corresponding to the target video frame.
Optionally, the first enhancement module 602 determines a plurality of illumination components of the target video frame according to the luminance channel of the target video frame, including:
carrying out low-pass filtering processing on the brightness channel to obtain a first illumination component;
and performing variational operation on the brightness channel to obtain a second illumination component.
Optionally, the determining, by the first enhancement module 602, a fused illumination component corresponding to the target video frame according to the multiple illumination components of the target video frame includes:
determining a first gradient of the first illumination component and a second gradient of the second illumination component;
and performing fusion processing on the first illumination component and the second illumination component based on the first gradient and the second gradient to obtain a fusion illumination component.
Optionally, the first enhancing module 602 processes the target video frame according to the fusion illumination component to generate a target enhanced image corresponding to the target video frame, including:
determining a reflection component of a brightness channel according to the fusion illumination component and the brightness channel;
correcting the reflection component to generate a corrected reflection component;
performing nonlinear stretching on the fusion illumination component to generate a stretched fusion illumination component;
obtaining an enhanced brightness channel according to the stretched fusion illumination component and the corrected reflection component;
and processing the target video frame according to the enhanced brightness channel to generate a target enhanced image corresponding to the target video frame.
Optionally, the first enhancing module 602 processes the target video frame according to the enhanced luminance channel to generate a target enhanced image corresponding to the target video frame, including:
acquiring a saturation channel and a chrominance channel of a target video frame;
generating an RGB color image according to the enhanced brightness channel, the saturation channel and the chrominance channel;
and taking the RGB color image as a target enhanced image corresponding to the target video frame.
Optionally, the first enhancement module 602 generates an RGB color image according to the enhanced luminance channel, the saturation channel and the chrominance channel, including:
dividing an enhanced luminance channel into a plurality of image blocks;
aiming at each image block, performing contrast enhancement processing on the brightness of the image block to obtain an enhanced image block corresponding to the image block;
splicing adjacent enhanced image blocks to obtain an updated enhanced brightness channel;
and generating an RGB color image according to the updated enhanced brightness channel, the saturation channel and the chrominance channel.
Optionally, the determining module 603 determines, according to the target enhanced image, an enhanced image corresponding to the video frame to be processed, including:
determining a motionless point set and a motion point set which are relative to a target video frame in pixel points of a video frame to be processed;
aiming at each immobile point in the immobile point set, taking the pixel value of a pixel point corresponding to the immobile point in the target enhanced image as the pixel value of the immobile point;
aiming at each motion point in the motion point set, determining a corresponding pixel point of the motion point in the target enhanced image, and determining the pixel value of the motion point according to the pixel values of a plurality of pixel points in a preset area where the corresponding pixel point is located;
and obtaining an enhanced image of the video frame to be processed according to the pixel values of all the motionless points in the motionless point set and the pixel values of all the moving points in the moving point set.
Optionally, the determining module 603 determines a stationary point set and a moving point set of a pixel point of a video frame to be processed, which are relative to a target video frame, and includes:
acquiring an optical flow vector of a pixel point for each pixel point in a video frame to be processed, and determining the motion speed of the pixel point according to the optical flow vector of the pixel point;
determining pixel points with the motion speed larger than a preset value as motion points, and generating a motion point set according to all the motion points;
and determining the pixel points with the motion speed less than or equal to a preset value as motionless points, and generating a motionless point set according to all the motionless points.
Optionally, the determining module 603 determines the pixel value of the motion point according to the pixel values of a plurality of pixel points in the preset region where the corresponding pixel point is located, including:
determining a plurality of pixel points in a preset area where the corresponding pixel points are located;
determining an average value of pixel values of a plurality of pixel points;
the average value is taken as the pixel value of the moving point.
Fig. 10 is a schematic diagram of a video processing device according to an embodiment of the present application. As shown in fig. 10, the video processing device 70 of this embodiment includes: at least one processor 701, a memory 702, and a computer program stored in said memory 702 and executable on said processor 701. The video processing apparatus further comprises a communication section 703, wherein the processor 701, the memory 702, and the communication section 703 are connected by a bus 704.
The processor 701, when executing the computer program, implements the steps in the various video enhancement method embodiments described above, such as steps S10-S50 in the embodiment shown in fig. 1. Alternatively, the processor 701, when executing the computer program, implements the functions of each module/unit in each device embodiment described above, for example, the functions of the modules 601 to 605 shown in fig. 9.
Those skilled in the art will appreciate that fig. 10 is merely an example of a video processing device and is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or different components such as input output devices, network access devices, buses, etc.
The Processor 701 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 702 may be an internal storage unit of the video processing apparatus or an external storage device of the video processing apparatus, and the memory 702 is used for storing the computer program and other programs and data required by the video processing apparatus. The memory 702 may also be used to temporarily store data that has been output or is to be output.
The embodiments of the present application also provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application provide a computer program product, which when running on a mobile terminal, enables the mobile terminal to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.
Claims (13)
1. A method of video enhancement, comprising:
acquiring a target video frame in a video to be processed, and performing image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame;
determining an enhanced image corresponding to the video frame to be processed according to the target enhanced image; the video frame to be processed is a video frame to be subjected to image enhancement processing determined according to the target video frame;
taking the video frame to be processed as a target video frame, taking an enhanced image corresponding to the video frame to be processed as a target enhanced image, and returning to execute the step of determining the enhanced image corresponding to the video frame to be processed according to the target enhanced image until obtaining the enhanced images respectively corresponding to the preset number of video frames in the video to be processed;
and determining an enhanced video corresponding to the video to be processed according to the enhanced image corresponding to the video frame in the video to be processed.
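As a non-authoritative illustration of the frame-by-frame structure recited in claim 1, the following Python sketch enhances one target frame fully and then derives each subsequent frame's enhanced image from the previous result. The helpers `enhance_frame` and `propagate_to_frame` are hypothetical names passed in as callables (possible versions of them are sketched after claims 3 and 9 below), and `max_frames` stands in for the claimed preset number of video frames.

```python
import cv2

def enhance_video(path, max_frames, enhance_frame, propagate_to_frame):
    """Claim 1 sketch: enhance a target frame, then derive each following frame's
    enhanced image from the previous result instead of enhancing it from scratch."""
    cap = cv2.VideoCapture(path)
    ok, target = cap.read()
    if not ok:
        return []
    target_enh = enhance_frame(target)            # image enhancement of the target frame
    out = [target_enh]
    for _ in range(max_frames - 1):
        ok, frame = cap.read()                    # frame to be processed (adjacent frame, claim 2)
        if not ok:
            break
        frame_enh = propagate_to_frame(target, target_enh, frame)   # claims 9 to 11
        out.append(frame_enh)
        target, target_enh = frame, frame_enh     # the processed frame becomes the new target
    cap.release()
    return out                                    # frames of the enhanced video
```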
2. The video enhancement method of claim 1, wherein the video frame to be processed is a video frame adjacent to the target video frame.
3. The video enhancement method of claim 1, wherein the image enhancement processing on the target video frame to obtain a target enhanced image corresponding to the target video frame comprises:
acquiring a brightness channel of the target video frame;
determining a plurality of illumination components of the target video frame according to the brightness channel of the target video frame;
determining a fusion illumination component corresponding to the target video frame according to the plurality of illumination components of the target video frame;
and processing the target video frame according to the fusion illumination component to generate a target enhanced image corresponding to the target video frame.
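As one possible reading of claim 3, the sketch below treats the V channel of an HSV conversion as the brightness channel (an assumption; the claim does not fix a colour space). The helper names `estimate_illuminations`, `fuse_illuminations`, `enhance_brightness` and `rebuild_rgb` are illustrative and correspond to the sketches given after claims 4 to 7 below.

```python
import cv2
import numpy as np

def enhance_frame(frame_bgr):
    """Claim 3 sketch: enhance only the brightness channel, then rebuild the frame."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    v = v.astype(np.float32) / 255.0                        # brightness channel of the target frame
    illum_lp, illum_var = estimate_illuminations(v)         # claim 4: plural illumination components
    illum_fused = fuse_illuminations(illum_lp, illum_var)   # claim 5: fused illumination component
    enhanced_v = enhance_brightness(v, illum_fused)         # claim 6: enhanced brightness channel
    return rebuild_rgb(enhanced_v, h, s)                    # claim 7: target enhanced image
```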
4. The video enhancement method of claim 3, wherein said determining a plurality of illumination components of the target video frame based on the luminance channel of the target video frame comprises:
carrying out low-pass filtering processing on the brightness channel to obtain a first illumination component;
and performing a variational operation on the brightness channel to obtain a second illumination component.
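A minimal sketch of claim 4, assuming a Gaussian blur as the low-pass filter and a quadratic smoothing energy minimised by plain gradient descent as the variational operation; the sigma, lambda, step count and learning rate are illustrative values, not taken from the patent.

```python
import cv2
import numpy as np

def estimate_illuminations(v, sigma=15.0, lam=0.1, steps=50, lr=0.2):
    """Claim 4 sketch: a low-pass illumination estimate plus a variational one."""
    # first illumination component: low-pass filtering of the brightness channel
    illum_lowpass = cv2.GaussianBlur(v, (0, 0), sigma)

    # second illumination component: minimise 0.5*||L - v||^2 + 0.5*lam*||grad L||^2
    # by gradient descent (one simple variational formulation, assumed here)
    L = v.copy()
    for _ in range(steps):
        lap = cv2.Laplacian(L, cv2.CV_32F)
        L -= lr * ((L - v) - lam * lap)
    illum_variational = np.clip(L, 1e-3, 1.0)
    return illum_lowpass, illum_variational
```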
5. The video enhancement method of claim 4, wherein said determining a fused illumination component corresponding to the target video frame from the plurality of illumination components of the target video frame comprises:
determining a first gradient of the first illumination component and a second gradient of the second illumination component;
and performing fusion processing on the first illumination component and the second illumination component based on the first gradient and the second gradient to obtain the fusion illumination component.
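A sketch of claim 5. The claim only requires the fusion to be based on the two gradients; the specific rule below, which gives more weight to the component that is locally smoother, is an assumption of this sketch.

```python
import cv2
import numpy as np

def fuse_illuminations(illum_a, illum_b, eps=1e-6):
    """Claim 5 sketch: gradient-weighted fusion of two illumination components."""
    def grad_mag(img):
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
        return cv2.magnitude(gx, gy)

    ga = grad_mag(illum_a)                         # first gradient
    gb = grad_mag(illum_b)                         # second gradient
    w_a = gb / (ga + gb + eps)                     # smaller gradient -> larger weight
    return w_a * illum_a + (1.0 - w_a) * illum_b   # fused illumination component
```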
6. The video enhancement method of claim 3, wherein the processing the target video frame according to the fused illumination component to generate the target enhanced image corresponding to the target video frame comprises:
determining a reflection component of the brightness channel according to the fusion illumination component and the brightness channel;
correcting the reflection component to generate a corrected reflection component;
performing nonlinear stretching on the fusion illumination component to generate a stretched fusion illumination component;
obtaining an enhanced brightness channel according to the stretched fusion illumination component and the corrected reflection component;
and processing the target video frame according to the enhanced brightness channel to generate a target enhanced image corresponding to the target video frame.
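Claim 6 follows the usual Retinex decomposition. The sketch below divides the brightness channel by the fused illumination to obtain the reflection component, applies a gamma curve as the correction and another gamma curve as the non-linear stretch, and recombines the two; both exponents are illustrative defaults, not values from the patent.

```python
import numpy as np

def enhance_brightness(v, illum_fused, gamma_reflect=0.9, gamma_illum=0.5, eps=1e-3):
    """Claim 6 sketch: Retinex-style enhancement of the brightness channel."""
    reflect = v / np.maximum(illum_fused, eps)                       # reflection component
    reflect = np.clip(reflect, 0.0, 1.0) ** gamma_reflect            # corrected reflection component
    illum_stretched = np.maximum(illum_fused, eps) ** gamma_illum    # non-linearly stretched illumination
    return np.clip(illum_stretched * reflect, 0.0, 1.0)              # enhanced brightness channel
```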
7. The video enhancement method of claim 6, wherein the processing the target video frame according to the enhanced luminance channel to generate the target enhanced image corresponding to the target video frame comprises:
acquiring a saturation channel and a chrominance channel of the target video frame;
generating an RGB color image according to the enhanced brightness channel, the saturation channel and the chrominance channel;
and taking the RGB color image as a target enhanced image corresponding to the target video frame.
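A sketch of claim 7, assuming the hue channel plays the role of the chrominance channel in the HSV representation used above; only the brightness channel is replaced before converting back to an RGB colour image.

```python
import cv2
import numpy as np

def rebuild_rgb(enhanced_v, h, s):
    """Claim 7 sketch: merge the enhanced brightness channel with the original
    saturation and hue (chrominance) channels and convert to RGB."""
    v8 = np.clip(enhanced_v * 255.0, 0, 255).astype(np.uint8)
    hsv = cv2.merge([h, s, v8])
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)    # target enhanced image
```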
8. The video enhancement method of claim 7, wherein the generating an RGB color image according to the enhanced brightness channel, the saturation channel and the chrominance channel comprises:
dividing the enhanced luminance channel into a plurality of image blocks;
for each image block, performing contrast enhancement processing on the brightness of the image block to obtain an enhanced image block corresponding to the image block;
splicing adjacent enhanced image blocks to obtain an updated enhanced brightness channel;
and generating the RGB color image according to the updated enhanced brightness channel, the saturation channel and the chrominance channel.
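Claim 8 describes block-wise contrast enhancement with stitching of adjacent blocks. CLAHE is one readily available technique of that kind and is used below purely as an illustration; the patent does not name a specific per-block algorithm, and the tile size and clip limit are illustrative.

```python
import cv2

def blockwise_contrast(enhanced_v8, tiles=(8, 8), clip=2.0):
    """Claim 8 sketch: per-block contrast enhancement with blending across tiles."""
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return clahe.apply(enhanced_v8)                # updated enhanced brightness channel (8-bit)
```

In this variant of the pipeline, the step would be applied to the 8-bit enhanced brightness channel before it is merged back with the saturation and chrominance channels.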
9. The video enhancement method of any one of claims 1 to 8, wherein the determining, according to the target enhanced image, an enhanced image corresponding to a video frame to be processed comprises:
determining a stationary point set and a motion point set of the pixel points of the video frame to be processed relative to the target video frame;
for each stationary point in the stationary point set, taking the pixel value of the pixel point corresponding to the stationary point in the target enhanced image as the pixel value of the stationary point;
for each motion point in the motion point set, determining the corresponding pixel point of the motion point in the target enhanced image, and determining the pixel value of the motion point according to the pixel values of a plurality of pixel points in a preset area where the corresponding pixel point is located;
and obtaining the enhanced image of the video frame to be processed according to the pixel values of all the stationary points in the stationary point set and the pixel values of all the motion points in the motion point set.
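A sketch of claim 9, assuming dense optical flow maps each pixel of the frame to be processed to its corresponding position in the target frame; `split_motion_points` is the claim 10 sketch given below, and the window size is illustrative.

```python
import cv2
import numpy as np

def propagate_to_frame(target_frame, target_enhanced, frame_to_process, window=5):
    """Claim 9 sketch: stationary points copy the enhanced value at their own
    position; motion points take the mean of a small window around the pixel
    they map to in the target enhanced image (claim 11)."""
    t_gray = cv2.cvtColor(target_frame, cv2.COLOR_BGR2GRAY)
    p_gray = cv2.cvtColor(frame_to_process, cv2.COLOR_BGR2GRAY)
    motion_mask, _, flow = split_motion_points(t_gray, p_gray)      # claim 10
    h, w = motion_mask.shape
    means = cv2.blur(target_enhanced, (window, window))             # window means (claim 11)
    ys, xs = np.mgrid[0:h, 0:w]
    cx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)  # corresponding pixel (x)
    cy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)  # corresponding pixel (y)
    out = target_enhanced.copy()                                    # stationary points: direct copy
    out[motion_mask] = means[cy, cx][motion_mask]                   # motion points: neighbourhood mean
    return out
```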
10. The video enhancement method of claim 9, wherein the determining a stationary point set and a motion point set of the pixel points of the video frame to be processed relative to the target video frame comprises:
for each pixel point in the video frame to be processed, acquiring an optical flow vector of the pixel point, and determining the motion speed of the pixel point according to the optical flow vector of the pixel point;
determining pixel points with a motion speed greater than a preset value as motion points, and determining the motion point set according to all the motion points;
and determining pixel points with a motion speed less than or equal to the preset value as stationary points, and determining the stationary point set according to all the stationary points.
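A sketch of claim 10 using Farnebäck dense optical flow from OpenCV; the flow parameters and the speed threshold are illustrative defaults, and the inputs are assumed to be 8-bit grayscale frames.

```python
import cv2
import numpy as np

def split_motion_points(target_gray, frame_gray, speed_threshold=1.0):
    """Claim 10 sketch: per-pixel optical-flow vectors give a motion speed,
    and a threshold splits the pixels into motion and stationary point sets."""
    flow = cv2.calcOpticalFlowFarneback(frame_gray, target_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    speed = np.linalg.norm(flow, axis=2)          # motion speed of each pixel point
    motion_mask = speed > speed_threshold         # motion point set
    stationary_mask = ~motion_mask                # stationary point set
    return motion_mask, stationary_mask, flow
```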
11. The video enhancement method of claim 9, wherein the determining the pixel value of the motion point according to the pixel values of a plurality of pixel points in the preset area where the corresponding pixel point is located comprises:
determining a plurality of pixel points in a preset area where the corresponding pixel points are located;
determining an average value of pixel values of the plurality of pixel points;
and taking the average value as the pixel value of the motion point.
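Claim 11 reduces to a plain window mean. The per-point version below makes the step explicit (the vectorised claim 9 sketch above obtains the same values with cv2.blur); the window size is illustrative.

```python
import numpy as np

def window_mean(target_enhanced, cx, cy, k=5):
    """Claim 11 sketch: average the pixel values in a k x k preset area centred
    on the corresponding pixel point, and use it as the motion point's value."""
    h, w = target_enhanced.shape[:2]
    r = k // 2
    patch = target_enhanced[max(cy - r, 0):min(cy + r + 1, h),
                            max(cx - r, 0):min(cx + r + 1, w)]
    return patch.reshape(-1, patch.shape[-1]).mean(axis=0) if patch.ndim == 3 else patch.mean()
```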
12. A video processing device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 11 are implemented when the computer program is executed by the processor.
13. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010430940.XA CN113706393B (en) | 2020-05-20 | 2020-05-20 | Video enhancement method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113706393A (en) | 2021-11-26 |
CN113706393B (en) | 2024-08-23 |
Family
ID=78645682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010430940.XA Active CN113706393B (en) | 2020-05-20 | 2020-05-20 | Video enhancement method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113706393B (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110091127A1 (en) * | 2008-06-20 | 2011-04-21 | Pavel Kisilev | Method and system for efficient video processing |
US20100183071A1 (en) * | 2009-01-19 | 2010-07-22 | Segall Christopher A | Methods and Systems for Enhanced Dynamic Range Images and Video from Multiple Exposures |
CN103578084A (en) * | 2013-12-09 | 2014-02-12 | 西安电子科技大学 | Color image enhancement method based on bright channel filtering |
CN109961404A (en) * | 2017-12-25 | 2019-07-02 | 沈阳灵景智能科技有限公司 | A kind of high clear video image Enhancement Method based on GPU parallel computation |
CN109272464A (en) * | 2018-09-10 | 2019-01-25 | 厦门理工学院 | A kind of low-light (level) video real time enhancing method and apparatus based on exponent arithmetic |
CN109934776A (en) * | 2018-12-25 | 2019-06-25 | 北京奇艺世纪科技有限公司 | Model generating method, video enhancement method, device and computer readable storage medium |
CN111031346A (en) * | 2019-10-28 | 2020-04-17 | 网宿科技股份有限公司 | Method and device for enhancing video image quality |
CN110992287A (en) * | 2019-12-03 | 2020-04-10 | 中国电子科技集团公司信息科学研究院 | Method for clarifying non-uniform illumination video |
CN111062926A (en) * | 2019-12-18 | 2020-04-24 | 腾讯科技(深圳)有限公司 | Video data processing method and device and storage medium |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113902651A (en) * | 2021-12-09 | 2022-01-07 | 环球数科集团有限公司 | Video image quality enhancement system based on deep learning |
CN116051403A (en) * | 2022-12-26 | 2023-05-02 | 新奥特(南京)视频技术有限公司 | Video image processing method and device and video processing equipment |
CN116612060A (en) * | 2023-07-19 | 2023-08-18 | 腾讯科技(深圳)有限公司 | Video information processing method, device and storage medium |
CN116612060B (en) * | 2023-07-19 | 2023-09-22 | 腾讯科技(深圳)有限公司 | Video information processing method, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113706393B (en) | 2024-08-23 |
Similar Documents
Publication | Title |
---|---|
Gao et al. | Sand-dust image restoration based on reversing the blue channel prior | |
CN113706393B (en) | Video enhancement method, device, equipment and storage medium | |
CN112734650A (en) | Virtual multi-exposure fusion based uneven illumination image enhancement method | |
CN113129391B (en) | Multi-exposure fusion method based on multi-exposure image feature distribution weight | |
CN111260580A (en) | Image denoising method based on image pyramid, computer device and computer readable storage medium | |
US9014503B2 (en) | Noise-reduction method and apparatus | |
CN113781320A (en) | Image processing method and device, terminal equipment and storage medium | |
CN111489322A (en) | Method and device for adding sky filter to static picture | |
US11830173B2 (en) | Manufacturing method of learning data, learning method, learning data manufacturing apparatus, learning apparatus, and memory medium | |
WO2023273868A1 (en) | Image denoising method and apparatus, terminal, and storage medium | |
CN110211077A (en) | A kind of more exposure image fusion methods based on Higher-order Singular value decomposition | |
CN114240767A (en) | Image wide dynamic range processing method and device based on exposure fusion | |
CN110009574A (en) | A kind of method that brightness, color adaptively inversely generate high dynamic range images with details low dynamic range echograms abundant | |
CN118247181B (en) | Image restoration model training method, electronic device and image restoration method | |
KR20230146974A (en) | Method and Apparatus for Enhancing Brightness of Image | |
CN110807735A (en) | Image processing method, image processing device, terminal equipment and computer readable storage medium | |
CN113284062B (en) | Lens shading correction method, device, medium and terminal | |
Zhang et al. | Learning a single convolutional layer model for low light image enhancement | |
US20210125318A1 (en) | Image processing method and apparatus | |
CN114037641A (en) | Low-illumination image enhancement method, device, equipment and medium | |
CN111861899A (en) | Image enhancement method and system based on illumination nonuniformity | |
CN114663300A (en) | DCE-based low-illumination image enhancement method, system and related equipment | |
CN115937029A (en) | Underwater image enhancement method | |
CN113409196B (en) | High-speed global chromatic aberration correction method for real-time video splicing | |
Terai et al. | Color image contrast enhancement by retinex model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||