CN117939313A - HDR video generation method and device - Google Patents

Info

Publication number: CN117939313A
Authority: CN (China)
Prior art keywords: video frame, video, key, sequence, original
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202311828312.7A
Other languages: Chinese (zh)
Inventor: 邸宏伟
Current assignee: Xian Wingtech Information Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original assignee: Xian Wingtech Information Technology Co Ltd
Application filed by Xian Wingtech Information Technology Co Ltd
Priority: CN202311828312.7A
Publication: CN117939313A

Landscapes

  • Studio Devices (AREA)

Abstract

The application relates to the technical field of video processing, and provides an HDR video generation method and device. The method comprises the following steps: acquiring an original video frame sequence generated during current video recording; performing motion detection on the video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence; acquiring the next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on that next video frame to obtain a second key video frame; performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video. The application serves to reduce the use of computing resources when generating HDR video.

Description

HDR video generation method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for generating an HDR video.
Background
Currently, in the field of video generation, HDR video is video content recorded and presented using High Dynamic Range (HDR) technology. Compared with traditional Standard Dynamic Range (SDR) video, HDR video has a wider luminance range and richer color rendering capability. HDR video employs higher brightness and contrast to present more detail and more realistic imagery: it can display darker blacks and brighter whites, so that the image retains detail in both bright and dark regions, and it covers a wider color gamut. This enables a more accurate and vivid representation of the brightness and color of the real world.
In the prior art, when generating an HDR video, each frame in the original video frame sequence must first be underexposed: when the original video sequence contains 30 frames, underexposure processing must be performed on each of the 30 video frames to obtain a 30-frame video frame sequence before the HDR video can be synthesized. This consumes a large amount of computing resources and requires the device to bear a large data throughput, so that some devices with smaller running memory cannot complete HDR video generation at all, resulting in a poor user experience.
Disclosure of Invention
Based on this, the embodiments of the present application provide a method and apparatus for generating an HDR video, which are used to reduce the use of computing resources when generating the HDR video.
The embodiment of the application provides an HDR video generation method, which comprises the following steps:
acquiring an original video frame sequence generated when a video is recorded currently;
Performing motion detection on video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence; the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than a first preset threshold; the pixel change rate represents the degree of change in pixel values between a video frame and the previous video frame;
Acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame;
Performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame;
And generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
In one embodiment, after the acquiring the sequence of original video frames generated at the time of the current recorded video, the method further comprises:
Performing underexposure processing on the second frame video frame in the original video frame sequence to obtain a target second frame video frame; the second frame video frame is the next video frame after the first frame video frame in the original video frame sequence;
Performing fusion processing on the first frame video frame and the target second frame video frame in the original video frame sequence to obtain a first target key video frame;
And generating a target video frame sequence based on the first target key video frame and the target key video frame.
In one embodiment, after acquiring the original video frame sequence generated at the time of the current recorded video, the method further comprises:
Judging whether the scene when the video is recorded currently is a high dynamic range HDR scene or not;
If yes, performing motion detection on the video frames in the original video frame sequence, and acquiring a first key video frame in the original video frame sequence.
In one embodiment, the determining whether the scene at the time of recording the video is a high dynamic range HDR scene comprises:
Acquiring brightness values of all pixel points of a current video frame in the original video frame sequence, and counting the number of the pixel points with the brightness values larger than a second preset threshold value;
judging whether the number of the pixel points is larger than a preset number;
if yes, the scene when the video is recorded currently is a high dynamic range HDR scene;
If not, the scene when the video is recorded currently is not a high dynamic range HDR scene.
In one embodiment, the performing motion detection on the video frames in the original video frame sequence to obtain the first key video frame in the original video frame sequence includes:
And performing motion detection on video frames in the original video frame sequence based on an inter-frame difference method, and determining the first key video frame in the original video frame sequence.
In one embodiment, the performing underexposure processing on the next video frame to obtain the second key video frame includes:
acquiring an overexposed region in the next video frame, and acquiring a first brightness average value of the overexposed region;
acquiring a second brightness average value of a normal exposure area in the next video frame;
acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value;
and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
In one embodiment, after the fusing the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame, the method further includes:
And smoothing the transition region of the first key video frame and the second key video frame in the target key video frame.
The embodiment of the application provides a video generating device, which comprises:
the acquisition unit is used for acquiring an original video frame sequence generated when the video is recorded currently;
the detection unit is used for detecting the motion of the video frames in the original video frame sequence and obtaining a first key video frame in the original video frame sequence; the first key video frame is a video frame with a pixel change rate between any video frame and the previous video frame being greater than a first preset threshold in the original video frame sequence; the pixel change rate is used for representing the pixel value change degree between the video frame and the video frame of the previous frame;
The processing unit is used for acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame;
The fusion unit is used for carrying out fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame;
And the generating unit is used for generating a target video frame sequence based on the target key video frame and combining the target video sequence frames to synthesize the HDR video.
In one embodiment, the obtaining unit is further configured to perform underexposure processing on the second frame video frame in the original video frame sequence to obtain a target second frame video frame, the second frame video frame being the next video frame after the first frame video frame in the original video frame sequence; perform fusion processing on the first frame video frame and the target second frame video frame in the original video frame sequence to obtain a first target key video frame; and generate a target video frame sequence based on the first target key video frame and the target key video frame.
In one embodiment, the obtaining unit is further configured to determine whether the scene when the video is currently recorded is a high dynamic range HDR scene; if yes, performing motion detection on the video frames in the original video frame sequence, and acquiring a first key video frame in the original video frame sequence.
In one embodiment, the obtaining unit is specifically configured to obtain a luminance value of each pixel point of the current video frame in the original video frame sequence, and count the number of pixel points where the luminance value is greater than a second preset threshold; judging whether the number of the pixel points is larger than a preset number; if yes, the scene when the video is recorded currently is a high dynamic range HDR scene; if not, the scene when the video is recorded currently is not a high dynamic range HDR scene.
In one embodiment, the obtaining unit is specifically configured to obtain an overexposed area in the next video frame, and obtain a first luminance average value of the overexposed area; acquiring a second brightness average value of a normal exposure area in the next video frame; acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value; and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
In one embodiment, the fusion unit is further configured to smooth a transition region between the first key video frame and the second key video frame in the target key video frame.
In one embodiment, the detecting unit is specifically configured to perform motion detection on video frames in the original video frame sequence based on an inter-frame difference method, and determine the first key video frame in the original video frame sequence.
An embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the HDR video generation method provided in any embodiment of the present application.
An embodiment of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the HDR video generation method provided by any embodiment of the present application.
The HDR video generation method and device provided by the embodiments of the application acquire the original video frame sequence generated during current video recording; perform motion detection on the video frames in the original video frame sequence to obtain a first key video frame, the first key video frame being a video frame whose pixel change rate relative to the previous video frame is greater than a first preset threshold, where the pixel change rate represents the degree of change in pixel values between a video frame and the previous video frame; acquire the next video frame after the first key video frame in the original video frame sequence and perform underexposure processing on it to obtain a second key video frame; perform fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and generate a target video frame sequence based on the target key video frame and combine the frames of the target video frame sequence to synthesize the HDR video. By performing motion detection on the video frames of the original video frame sequence, and generating the target video frame sequence through underexposure processing and fusion of the video frame following each frame whose pixel change relative to the adjacent previous video frame exceeds the first preset threshold, the embodiments of the application avoid exposure adjustment of every frame in the original video frame sequence as far as possible, reduce the occupation of computing resources, enable HDR video synthesis on devices with smaller running memory, and improve the user experience.
Drawings
FIG. 1 is a flow diagram of a method for HDR video generation in one embodiment;
FIG. 2 is a flow chart of an HDR video generation method according to another embodiment;
FIG. 3 is a flow chart of an HDR video generation method according to another embodiment;
FIG. 4 is a block diagram of the HDR video generation apparatus in one embodiment;
Fig. 5 is an internal structural diagram of an electronic device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The HDR video generation method provided by the embodiments of the application can be applied to scenes in which an HDR video is generated during video recording. The scenario may be implemented on various mobile terminals, which may be devices that provide video recording and/or other business data connectivity to a user: handheld devices with video recording capabilities, computing devices or other processing devices connected to a wireless modem, vehicle-mounted devices, wearable devices, terminal devices in future 5G networks or in future evolving PLMN networks, etc. The mobile terminal may communicate with one or more core networks via a Radio Access Network (RAN). By way of example, mobile terminals may be mobile telephones (or "cellular" telephones) and computers with mobile terminals, such as portable, pocket, hand-held, computer-built-in or vehicle-mounted mobile devices that exchange voice and/or data with the radio access network, as well as Personal Communication Service (PCS) telephones, cordless telephones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations, Personal Digital Assistants (PDA) and the like, and may also be referred to as mobile devices, UE terminals, access terminals, wireless communication devices, terminal units, end stations, mobile stations, remote stations, remote terminals, subscriber units, subscriber stations, user agents, terminal devices and the like.
The execution body of the HDR video generation method provided in the embodiment of the present application may be the above-mentioned terminal device (including a mobile terminal and a non-mobile terminal), or may be an HDR video generation apparatus formed by functional modules and/or functional entities in the terminal device that can implement the display method, which may be specifically determined according to actual use requirements, and the embodiment of the present application is not limited.
It should be noted that, since the embodiments of the present application are mainly applied to scenes for generating an HDR video, each frame in the original video frame sequence is typically encoded by an encoder to generate successive groups of pictures (Group of Pictures, GOP); when playing, a decoder reads the groups of pictures segment by segment, decodes them, and renders the resulting pictures for display. A group of pictures is a group of consecutive pictures consisting of one I frame (Intra-coded frame) and a number of B frames (Bi-directional predicted-coded frames) and P frames (Predictive-coded frames); it is the basic unit of access for video encoders and decoders, and the sequence repeats until the end of the video. I frames are intra-coded frames (also called key frames), P frames are forward predicted frames (also called forward reference frames), and B frames are bi-directionally interpolated frames (also called bi-directional reference frames). An I frame is a complete picture, while P and B frames record changes relative to an I frame. Without an I frame, P and B frames cannot be decoded.
An I frame is a key frame in the video sequence and serves as the basis on which the other frames in the group of pictures are decoded. Each I frame encodes the entire image without reference to any previous frame. P frames achieve video compression through motion prediction using previously encoded frames: a P frame predicts the pixel values of the current frame using the motion information and pixel differences of an existing reference frame, and encodes only the prediction error. Compared with an I frame, a P frame can reduce the amount of data to some extent, since it does not need to encode the complete picture information, only the prediction error. B frames achieve video compression through bi-directional motion prediction using both earlier and later encoded frames: a B frame predicts the pixel values of the current frame using the motion information and pixel differences of the preceding and following reference frames, and encodes only the prediction error. B frames can reduce the amount of data even further than P frames, because they can use more reference frames for motion prediction.
When the original video frame sequence is acquired, and the group of pictures (i.e. the I frame, P frames and B frames) is obtained through the encoder, the original video frame sequence is usually encoded and compressed based on the H.264 compression standard to obtain the I frame, P frames and B frames.
H.264 is also called Advanced Video Coding (AVC), a widely used video codec standard jointly formulated by the International Telecommunication Union (ITU-T) and the International Organization for Standardization (ISO/IEC). H.264 made a great breakthrough in video compression: it can provide high-quality video images and achieve high compression rates at relatively low bit rates. This means that it can transmit or store video at a small file size, and can transmit high-quality video in real time in bandwidth-limited network environments. H.264 is suitable for various applications such as internet streaming media, video conferencing, wireless communication and digital television broadcasting.
The embodiments of the present application mainly address how to generate the I frames, i.e. key frames, in a group of pictures. In the prior art, when generating an HDR video, each frame in the original video frame sequence must first be underexposed: when the original video sequence contains 30 frames, underexposure processing must be performed on each of the 30 video frames to obtain a 30-frame video frame sequence before the HDR video can be synthesized. This consumes a large amount of computing resources and requires the device to bear a large data throughput, so that some devices with smaller running memory cannot complete HDR video generation at all, resulting in a poor user experience. Accordingly, in order to solve the above problems, the present application provides the following embodiments:
in one embodiment, there is provided an HDR video generation method, which includes the following steps S101 to S105, referring to fig. 1:
S101, acquiring an original video frame sequence generated when the video is recorded currently.
In some embodiments, the original video frame sequence generated by the currently recorded video may be obtained by recording a video of the user through a mobile terminal with video capability and then performing encoding processing to obtain the target video. Since the original video frame sequence has not undergone any encoding processing, it can be understood as a group of video frames that are all I frames. At present, a user can generally record a video through an application program with a video recording function configured on a mobile terminal: specifically, in response to an instruction issued by the user to start recording, the mobile terminal can start the camera to record the video, obtain the original video frame sequence, and process the original video frame sequence to generate the HDR video.
It should be noted that, since the embodiments of the present application are mainly applied to scenes for generating an HDR video, it is necessary to detect whether the scene of the current video recording belongs to a high dynamic range HDR scene. If so, the subsequent steps S102-S105 are executed; if not, the traditional video recording scheme is implemented to generate the target video.
HDR (High Dynamic Range) is a technology and image processing method that can expand the displayed brightness range and show more detail in both the bright and dark parts of an image, bringing richer color detail and thus a more lifelike picture. The aim of HDR technology is to make the picture shown by a display device closer to the real scene that human eyes normally see. On the basis of improving the dynamic range of shooting and display, it uses higher quantization precision to raise the latitude that the television system can express toward the level of the human eye.
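The HDR-scene check described in the disclosure (counting the pixels whose brightness exceeds a second preset threshold and comparing that count to a preset number) can be sketched as follows. This is a minimal illustration: the two threshold values and the function name are assumptions, not values fixed by the publication.

```python
# Sketch of the HDR-scene check: count the pixels whose luma (Y) exceeds a
# brightness threshold and compare that count against a preset number.
# Both constants below are illustrative assumptions.

BRIGHTNESS_THRESHOLD = 220   # the "second preset threshold" (assumed value)
PIXEL_COUNT_THRESHOLD = 4    # the "preset number" (assumed value)

def is_hdr_scene(y_plane):
    """y_plane: 2-D list of 8-bit luma values (0-255) for one video frame."""
    bright = sum(1 for row in y_plane for y in row if y > BRIGHTNESS_THRESHOLD)
    return bright > PIXEL_COUNT_THRESHOLD

# A mostly dark frame containing a large clipped highlight region:
frame = [[30] * 8 for _ in range(6)] + [[250] * 8 for _ in range(2)]
print(is_hdr_scene(frame))  # -> True: 16 pixels exceed the brightness threshold
```

In a real pipeline the check would run on the Y plane of the current YUV frame; a frame with no clipped highlights falls through to the conventional (non-HDR) recording path.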
S102, performing motion detection on video frames in the original video frame sequence to obtain first key video frames in the original video frame sequence.
The first key video frame is a video frame with a pixel change rate between any video frame and the previous video frame being greater than a first preset threshold in the original video frame sequence; the pixel change rate is used to represent the degree of change in pixel values between the video frame and a previous frame video frame.
In some embodiments, motion detection is performed on the video frames in the original video frame sequence to determine whether there is a substantial change in the background between video frames in the original video frame sequence. Motion detection (Motion Detection) is a computer vision technique for detecting moving objects in a video or image sequence; it determines whether there is motion by identifying pixel-level changes or object differences between successive frames. The target video frame may be defined as a video frame whose background varies greatly from the previous frame. Therefore, in the embodiments of the present application, a current frame whose background varies greatly from the previous frame in the original video frame sequence is used as the target video frame, in preparation for generating the corresponding I frame (key frame) in the group of pictures when the HDR video is subsequently synthesized.
Specifically, the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than the first preset threshold. The pixel change rate may be obtained by computing the difference between the pixel values of corresponding pixels in two adjacent frames, taking the absolute value of each difference, summing the absolute values over all corresponding pixels of the two frames, and taking the mean. If the pixel change rate is greater than the first preset threshold, the change between the two adjacent frames is large, and the current frame can be used as the first key video frame.
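The computation just described (mean absolute per-pixel difference between two adjacent frames, compared against the first preset threshold) can be sketched as follows; the threshold value here is an illustrative assumption.

```python
# Pixel change rate = mean of absolute per-pixel differences between two
# adjacent frames, as described above. The threshold is an assumed value.

FIRST_PRESET_THRESHOLD = 10.0  # the "first preset threshold" (assumed value)

def pixel_change_rate(prev, curr):
    """prev, curr: equally sized 2-D lists of pixel values."""
    total = 0
    count = 0
    for row_p, row_c in zip(prev, curr):
        for p, c in zip(row_p, row_c):
            total += abs(p - c)   # absolute difference of corresponding pixels
            count += 1
    return total / count          # mean over all pixels

def is_first_key_frame(prev, curr):
    return pixel_change_rate(prev, curr) > FIRST_PRESET_THRESHOLD

prev = [[100, 100], [100, 100]]
curr = [[180, 100], [100, 100]]        # one pixel jumped by 80
print(pixel_change_rate(prev, curr))   # -> 20.0
print(is_first_key_frame(prev, curr))  # -> True
```

This is the simplest form of the inter-frame difference method mentioned in the disclosure; production implementations typically also blur or threshold per-pixel differences to suppress sensor noise.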
S103, acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame.
In some embodiments, the next video frame after the first key video frame in the original video frame sequence is acquired, i.e. the video frame following the current frame whose background varies greatly in the original video frame sequence, and underexposure processing is performed on this next video frame so that the overexposed area of the next video frame is no longer overexposed. In this step, only the brightness of the overexposed area is adjusted; the next video frame may, however, also contain an underexposed area, i.e. an area with a relatively low brightness value. Because only the brightness of the overexposed area is adjusted, the resulting second key video frame may also be called an underexposed frame. This underexposed second key video frame is then fused with the first key video frame, in which the highlights are overexposed, to obtain a normally exposed frame, namely the target key video frame described below.
The specific implementation method can refer to the following steps one to four:
step one, acquiring an overexposed region in the next video frame, and acquiring a first brightness average value of the overexposed region.
In some embodiments, the brightness value of each pixel in the next video frame must first be calculated. Specifically, the luminance value of a pixel of the video frame may be obtained based on the attributes of the video frame, because the embodiments of the present application use video frames in YUV format: Y represents luminance (the gray value), while U and V represent chrominance, describing the color and saturation of the image and designating the color of a pixel. YUV is a pixel format in which the luminance parameter and the chrominance parameters are represented separately; Y, U and V are each represented by one byte (8 bits), with a value range of 0-255. The luminance values of all pixels of the video frame therefore lie between 0 and 255.
Whether a pixel is overexposed is judged from its brightness value; for example, the pixels with brightness values greater than 220 are screened out, thereby obtaining the overexposed area, and the mean of the brightness values of all pixels in the overexposed area is calculated. This is the first luminance mean, denoted OverExp.
And step two, obtaining a second brightness average value of the normal exposure area in the next video frame.
In some embodiments, the pixels with brightness values within a preset range are determined from the brightness values of the pixels in the next video frame; for example, the pixels with brightness values between 50 and 220 are screened out, and the mean of the brightness values of all pixels within this preset range is obtained. This is the second luminance mean, denoted AvgExp.
And thirdly, acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value.
In some embodiments, the exposure compensation parameter, denoted EV-, may be calculated from the first luminance mean OverExp and the second luminance mean AvgExp.
And step four, performing exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
In some embodiments, the brightness of the overexposed area in the next video frame may be adjusted according to the exposure compensation parameter to obtain a next video frame whose highlight area is no longer overexposed, namely the second key video frame.
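Steps one to four above can be sketched as follows. The exposure-compensation formula itself is not reproduced in this text, so the base-2 logarithmic formulation used here, like the luma thresholds 220 and 50-220 taken from the examples in steps one and two, is an assumption for illustration only, not the publication's exact formula.

```python
import math

OVEREXPOSED = 220                  # luma above this is overexposed (assumed)
NORMAL_LOW, NORMAL_HIGH = 50, 220  # normal-exposure luma band (assumed)

def mean(vals):
    vals = list(vals)
    return sum(vals) / len(vals) if vals else 0.0

def underexpose(y_plane):
    """Pull down the highlights of one luma plane (steps one to four)."""
    flat = [y for row in y_plane for y in row]
    over_exp = mean(y for y in flat if y > OVEREXPOSED)                 # step 1
    avg_exp = mean(y for y in flat if NORMAL_LOW <= y <= NORMAL_HIGH)   # step 2
    if over_exp == 0 or avg_exp == 0:
        return y_plane             # nothing overexposed (or no normal region)
    ev_minus = math.log2(over_exp / avg_exp)   # step 3: assumed formulation
    gain = 2.0 ** (-ev_minus)                  # attenuate highlights by EV-
    return [[min(255, round(y * gain)) if y > OVEREXPOSED else y for y in row]
            for row in y_plane]                # step 4: compensate highlights

print(underexpose([[250, 100], [100, 250]]))  # -> [[100, 100], [100, 100]]
```

With this assumed formulation the overexposed mean is pulled down to the normal-exposure mean while normally exposed pixels are untouched, which matches the stated goal of the step: a highlight area that is no longer overexposed.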
S104, fusing the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame.
In some embodiments, after the second key video frame, i.e. the underexposed next frame of the first key video frame, is obtained, the first key video frame and the second key video frame in the original video frame sequence need to be fused so as to reduce the brightness of the overexposed area in the first key video frame; in this way the originally overexposed highlight area of the first key video frame is no longer overexposed, and a normally exposed target key video frame is obtained.
Optionally, after the fusion processing is performed on the first key video frame and the second key video frame in the original video frame sequence, the embodiments of the present application further smooth the transition region between the first key video frame and the second key video frame in the target key video frame, so that the transition is more natural and smooth. This avoids brightness differences in the transition region, which would otherwise affect the color display and degrade the viewing experience when the HDR video is subsequently generated. For example, a bilateral filter may be employed to smooth the transition region.
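A minimal sketch of the fusion and transition smoothing, assuming a mask-based blend: highlight pixels are taken from the underexposed second key video frame, the rest from the first key video frame, and the blend mask is feathered so the transition is gradual. The publication suggests a bilateral filter for the smoothing; for self-containment this sketch substitutes a simple 3x3 box blur of the mask, and the threshold value is likewise an assumption.

```python
OVEREXPOSED = 220  # luma above this is taken from the underexposed frame (assumed)

def box_blur(mask):
    """3x3 box blur, used here to feather the 0/1 blend mask
    (a stand-in for the bilateral filter mentioned in the text)."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc, n = 0.0, 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += mask[y][x]
                        n += 1
            out[i][j] = acc / n
    return out

def fuse(first_key, second_key):
    """Blend the overexposed regions of the first key frame with the
    corresponding regions of the underexposed second key frame."""
    mask = [[1.0 if y > OVEREXPOSED else 0.0 for y in row] for row in first_key]
    mask = box_blur(mask)  # feathered transition region
    return [[round(m * s + (1 - m) * f)
             for m, s, f in zip(mrow, srow, frow)]
            for mrow, srow, frow in zip(mask, second_key, first_key)]
```

Because the mask is feathered before blending, the brightness change between the fused highlight region and its surroundings is gradual rather than a hard edge, which is the purpose of the smoothing step above.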
S105, generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
In some embodiments, after the video frames with large variation amplitude in the original video frame sequence are fused, the resulting target key video frame serves as an I frame (key frame) in H.264 coding, and the corresponding P frames and B frames are then generated from the target key video frame to synthesize the HDR video.
It should be noted that, during video recording, the original video frame sequence is generated in real time; the application likewise performs motion detection on the newly generated video frames in this sequence in real time, generates target key video frames on the fly, obtains the target video frame sequence based on the H.264 coding technique, and synthesizes the HDR video.
According to the HDR video generation method provided by the embodiment of the application, the original video frame sequence generated during current video recording is obtained; motion detection is performed on the video frames in the original video frame sequence to obtain a first key video frame, where the first key video frame is any video frame whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and the previous video frame; the next video frame after the first key video frame in the original video frame sequence is acquired and underexposed to obtain a second key video frame; the first key video frame and the second key video frame are fused to obtain a target key video frame; and a target video frame sequence is generated based on the target key video frame, whose frames are combined to synthesize the HDR video. By performing motion detection on the original video frame sequence and generating the target video frame sequence through underexposing and fusing only the frame following each frame whose pixel value change relative to the adjacent previous frame exceeds the first preset threshold, the embodiment avoids adjusting the exposure of every frame in the original video frame sequence, reduces the occupation of computing resources, enables HDR video synthesis on devices with limited memory, and improves the user experience.
In the above embodiment, after the original video frame sequence generated when the currently recorded video is acquired, referring to fig. 2, the method further includes the following steps S201 to S203:
S201, underexposure processing is carried out on the second frame video frame in the original video frame sequence, and a target second frame video frame is obtained.
In some embodiments, when generating the HDR video it must also be ensured that the first frame in the target video frame sequence is an I frame (key frame), that is, that the first frame is normally exposed so that it can contain all image information. However, in steps S101-S105 above, because the first frame has no adjacent previous frame, its exposure is not processed during motion detection. The embodiment of the present application therefore provides steps S201-S203 to adjust the exposure of the first frame, thereby obtaining a normally exposed first frame video frame, that is, the first target key video frame.
It should be noted that, in the embodiment of the present application, before motion detection is performed on the original video frame sequence, it is further necessary to determine whether the original video frame sequence contains the first frame video frame generated at the start of the current video recording. If so, the second frame video frame of the original video frame sequence can be obtained; this indicates that encoding compression of the original video frame sequence has not yet begun, and the exposure of its first frame must be adjusted to generate an I frame (key frame). Specifically, the frame following the first frame in the original video frame sequence, namely the second frame video frame, is underexposed to obtain the target second frame video frame. The method for underexposing the second frame video frame may refer to the description of step S103 and is not repeated here.
S202, fusion processing is carried out on the first frame video frame and the target second frame video frame in the original video frame sequence, and a first target key video frame is obtained.
Wherein the second frame video frame is the next video frame of the first frame video frame in the original video frame sequence.
In some embodiments, after the target second frame video frame (that is, the underexposed second frame) is obtained, the first frame video frame and the target second frame video frame in the original video frame sequence need to be fused so as to reduce the brightness of the overexposed area in the first frame video frame, making the originally overexposed highlight area no longer overexposed and yielding the normally exposed first target key video frame.
Optionally, after fusing the first frame video frame and the target second frame video frame in the original video frame sequence, the embodiment of the present application further smooths the transition region between the two frames in the first target key video frame, so that the transition is natural and gradual; this avoids a large brightness difference in the transition region that would otherwise affect color display and degrade the viewing experience when the HDR video is generated subsequently.
S203, generating a target video frame sequence based on the first target key video frame and the target key video frame.
Through the above steps, it can be ensured that the first I frame (key frame) in the target video frame sequence is a normally exposed video frame; the target video frame sequence is then generated based on the first target key video frame and the target key video frames, so that the HDR video can be successfully synthesized.
As an extension and refinement of the above embodiment, referring to fig. 3, the HDR video generation method further includes the following steps:
S301, acquiring an original video frame sequence generated when the video is recorded currently.
S302, judging whether the scene when the video is recorded currently is a high dynamic range HDR scene or not.
Specifically, whether the current video recording scene is a high dynamic range HDR scene may be determined with reference to the following steps:
step 1, obtaining brightness values of all pixel points of a current video frame in the original video frame sequence, and counting the number of the pixel points with the brightness values larger than a second preset threshold.
In some embodiments, after a user starts recording a video through a mobile terminal, the original video frame sequence generated by the current recording may be acquired and the current video frame taken from it. Because each frame in the original video frame sequence is generally in YUV format, in which each of the Y, U and V components is represented by one byte (8 bits) with a value range of 0-255, the brightness (Y) values of all pixel points of the video frame lie between 0 and 255.
The number of pixel points in the current frame whose brightness value is greater than a second preset threshold can then be obtained. Specifically, the second preset threshold may be set to 220; when the Y (luma) value of a pixel point in the current video frame is greater than 220, the pixel point is bright and likely overexposed. The Y values of all pixel points in the current video frame are therefore obtained and compared with the second preset threshold, and the number of pixel points whose Y value exceeds the threshold is counted.
And step 2, judging whether the number of the pixel points is larger than a preset number.
And step 3, if so, the scene when the video is recorded currently is a high dynamic range HDR scene.
In some embodiments, the total number of pixels in the current video frame is obtained, and if the number of pixels with the luminance value greater than the second preset threshold is greater than 10% of the total number of pixels, the current video recording scene is considered to be an HDR scene.
And 4, if not, the scene when the video is recorded currently is not a high dynamic range HDR scene.
In some embodiments, if the number of pixels having the luminance value greater than the second preset threshold is not greater than 10% of the total number of pixels, the current video recording scene is considered not to be an HDR scene.
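Steps 1 to 4 of the HDR-scene judgment can be sketched as follows, using the example values given above (a 220 luma threshold and a 10% pixel ratio); the function name is illustrative:

```python
import numpy as np

def is_hdr_scene(frame_y, bright_thresh=220, ratio=0.10):
    """Decide whether the current frame indicates an HDR scene: count the
    luma (Y) samples above the second preset threshold (220 here) and
    compare the count against 10% of the total number of pixels."""
    bright = int((frame_y > bright_thresh).sum())
    return bright > ratio * frame_y.size
```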
And S303, if yes, performing motion detection on the video frames in the original video frame sequence to acquire a first key video frame in the original video frame sequence.
S304, performing motion detection on the video frames in the original video frame sequence based on an inter-frame difference method, and determining the first key video frames in the original video frame sequence.
In some embodiments, motion detection on the video frames in the original video frame sequence may be implemented with an inter-frame difference algorithm (temporal difference): a difference operation is performed on two or three temporally consecutive frames, the corresponding pixel points of the different frames are subtracted, and the absolute value of the gray-level difference is taken; when this value exceeds a certain threshold, a moving target can be determined, thereby implementing moving-target detection.
Specifically, in the present application, the motion detection is performed on the video frames in the original video frame sequence based on the inter-frame difference method, and the specific step of determining the first key video frame in the original video frame sequence may refer to the following steps 1 to 3:
And step 1, acquiring absolute values of gray differences of all corresponding pixel points between any two adjacent video frames in the original video frame sequence based on the inter-frame difference method.
And step 2, acquiring an average value of the gray level differences based on the absolute values of the gray level differences of all the corresponding pixel points.
And step 3, if the average value of the gray level differences is greater than a third preset threshold, the second of the two adjacent video frames is taken as the first key video frame.
For example, suppose the third preset threshold is set to 100 and each video frame in the original video frame sequence has 200 pixel points. The absolute value of the gray-level difference of each corresponding pixel point between two adjacent frames is obtained, yielding 200 absolute values, which are summed and divided by 200 to obtain the average gray-level difference. This average is compared with the third preset threshold of 100; if the average gray-level difference between the two adjacent frames is greater than 100, the change between them is large, that is, the background picture of the current video frame differs substantially from that of the previous frame, which can be understood as a large change in the background environment. The current frame then needs to serve as the I frame of a group of pictures to guide the synthesis of the subsequent HDR video.
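Steps 1 to 3 can be sketched as follows, with the third preset threshold of 100 from the example (the function name and input layout are illustrative assumptions):

```python
import numpy as np

def find_key_frames(gray_frames, diff_thresh=100):
    """Inter-frame difference detection: a frame whose mean absolute
    gray-level difference from its predecessor exceeds the third preset
    threshold (100 in the example) is taken as a first key video frame."""
    keys = []
    for i in range(1, len(gray_frames)):
        prev = gray_frames[i - 1].astype(np.int16)  # widen to avoid uint8 wraparound
        curr = gray_frames[i].astype(np.int16)
        if np.abs(curr - prev).mean() > diff_thresh:
            keys.append(i)  # index of the second of the adjacent pair
    return keys
```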
S305, acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame.
S306, fusing the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame.
S307, generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
It should be noted that, in the embodiment of the present application, an instruction from the user to end video recording may be responded to at any time, whereupon the motion detection of the original video frame sequence in the above steps is ended.
Step one, judging whether the original video frame sequence contains the first frame video frame generated when the current video recording starts.
And step two, if yes, acquiring the second frame video frame of the original video frame sequence.
It should be understood that, although the steps in the flowcharts of fig. 1 to 3 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited in their order of execution and may be executed in other orders. Moreover, at least some of the steps in figs. 1 to 3 may include multiple sub-steps or phases that are not necessarily performed at the same time but may be performed at different times; these sub-steps or phases are not necessarily executed sequentially, and may be performed in turn or alternately with at least a portion of the other steps, or of the sub-steps or phases of other steps.
In one embodiment, as shown in fig. 4, there is provided an HDR video generating apparatus, comprising: an acquisition unit 401, a detection unit 402, a processing unit 403, a fusion unit 404, a generation unit 405, wherein:
an obtaining unit 401, configured to obtain an original video frame sequence generated when a video is currently recorded;
A detecting unit 402, configured to perform motion detection on video frames in the original video frame sequence, and obtain a first key video frame in the original video frame sequence; the first key video frame is a video frame with a pixel change rate between any video frame and the previous video frame being greater than a first preset threshold in the original video frame sequence; the pixel change rate is used for representing the pixel value change degree between the video frame and the video frame of the previous frame;
a processing unit 403, configured to obtain a next video frame after the first key video frame in the original video frame sequence, and perform underexposure processing on the next video frame, to obtain a second key video frame;
A fusion unit 404, configured to perform fusion processing on the first key video frame and the second key video frame in the original video frame sequence, so as to obtain a target key video frame;
a generating unit 405, configured to generate a target video frame sequence based on the target key video frame, and combine the target video frame sequence to synthesize an HDR video.
In one embodiment, the obtaining unit 401 is further configured to perform underexposure processing on a second frame video frame in the original video frame sequence to obtain a target second frame video frame; performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a first target key video frame; the second frame video frame is the next video frame of the first frame video frame in the original video frame sequence; and generating a target video frame sequence based on the first target key video frame and the target key video frame.
In one embodiment, the obtaining unit 401 is further configured to determine whether the scene when the video is currently recorded is a high dynamic range HDR scene; if yes, performing motion detection on the video frames in the original video frame sequence, and acquiring a first key video frame in the original video frame sequence.
In one embodiment, the obtaining unit 401 is specifically configured to obtain a luminance value of each pixel point of the current video frame in the original video frame sequence, and count the number of pixel points where the luminance value is greater than a second preset threshold; judging whether the number of the pixel points is larger than a preset number; if yes, the scene when the video is recorded currently is a high dynamic range HDR scene; if not, the scene when the video is recorded currently is not a high dynamic range HDR scene.
In one embodiment, the detecting unit 402 is specifically configured to perform motion detection on video frames in the original video frame sequence based on an inter-frame difference method, and determine the first key video frame in the original video frame sequence.
In one embodiment, the processing unit 403 is specifically configured to obtain an overexposed area in the next video frame, and obtain a first luminance average value of the overexposed area; acquiring a second brightness average value of a normal exposure area in the next video frame; acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value; and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
In one embodiment, the fusing unit 404 is further configured to smooth a transition region between the first key video frame and the second key video frame in the target key video frame.
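Under heavy simplification, the five units of fig. 4 might be sketched as one pipeline class. Every method body here is an illustrative assumption (for example, halving brightness as the underexposure and a hard mask as the fusion), not the patented implementation:

```python
import numpy as np

class HdrVideoGenerator:
    """Illustrative skeleton mirroring the units of fig. 4; the thresholds
    and per-unit logic are simplified assumptions for demonstration."""

    def __init__(self, diff_thresh=100, over_thresh=220):
        self.diff_thresh = diff_thresh  # third preset threshold (motion)
        self.over_thresh = over_thresh  # overexposure luma threshold

    def detect(self, prev_y, curr_y):                     # detection unit
        diff = np.abs(curr_y.astype(np.int16) - prev_y.astype(np.int16))
        return diff.mean() > self.diff_thresh

    def underexpose(self, frame_y):                       # processing unit
        return (frame_y.astype(np.float32) * 0.5).astype(np.uint8)

    def fuse(self, first_y, second_y):                    # fusion unit
        out = first_y.copy()
        mask = first_y > self.over_thresh
        out[mask] = second_y[mask]
        return out

    def generate(self, frames):                           # generation unit
        out = [frames[0]]
        for prev, curr in zip(frames, frames[1:]):
            if self.detect(prev, curr):
                # stand-in for underexposing the *next* frame and fusing
                out.append(self.fuse(curr, self.underexpose(curr)))
            else:
                out.append(curr)
        return out
```

The acquisition unit is represented simply by the `frames` list passed to `generate`; a real implementation would pull frames from the camera pipeline and hand the result to an H.264 encoder.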
The HDR video generation device provided by the embodiment of the application obtains the original video frame sequence generated during current video recording; performs motion detection on the video frames in the original video frame sequence to obtain a first key video frame, where the first key video frame is any video frame whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and the previous video frame; acquires the next video frame after the first key video frame in the original video frame sequence and underexposes it to obtain a second key video frame; fuses the first key video frame and the second key video frame to obtain a target key video frame; and generates a target video frame sequence based on the target key video frame, whose frames are combined to synthesize the HDR video. By performing motion detection on the original video frame sequence and generating the target video frame sequence through underexposing and fusing only the frame following each frame whose pixel value change relative to the adjacent previous frame exceeds the first preset threshold, the embodiment avoids adjusting the exposure of every frame in the original video frame sequence, reduces the occupation of computing resources, enables HDR video synthesis on devices with limited memory, and improves the user experience.
For specific limitations of the HDR video generation apparatus, reference may be made to the above limitations of the HDR video generation method, and no further description is given here. The various modules in the HDR video generation apparatus described above may be implemented in whole or in part in software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, an electronic device is provided, which may be a terminal device; its internal structure may be as shown in fig. 5. The electronic device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for their operation. The communication interface of the electronic device is used for wired or wireless communication with an external terminal; the wireless communication can be achieved through WIFI, an operator network, Near Field Communication (NFC) or other technologies. The computer program is executed by the processor to implement an HDR video generation method. The display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic device can be a touch layer covering the display screen, keys, a track ball or a touch pad arranged on the shell of the electronic device, or an external keyboard, touch pad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 5 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the electronic device to which the present inventive arrangements are applied, and that a particular electronic device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, the HDR video generation apparatus provided by the present application may be implemented in the form of a computer program that is executable on an electronic device such as that shown in fig. 5. The memory of the electronic device may store the various program modules constituting the HDR video generation apparatus, such as the acquisition unit, the detection unit, the processing unit, the fusion unit, and the generation unit shown in fig. 4. The computer program constituted by these program modules causes the processor to execute the steps in the HDR video generation method of the embodiments of the present application described in this specification.
For example, the electronic device shown in fig. 5 may acquire the original video frame sequence generated during current video recording by executing the steps of the acquisition unit in the HDR video generation apparatus shown in fig. 4. The electronic device may perform motion detection on the video frames in the original video frame sequence by executing the steps of the detection unit, obtaining a first key video frame in the original video frame sequence; the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than a first preset threshold, the pixel change rate representing the degree of pixel value change between the video frame and the previous video frame. The electronic device may, by executing the steps of the processing unit, acquire the next video frame after the first key video frame in the original video frame sequence and underexpose it to obtain a second key video frame. The electronic device may, by executing the steps of the fusion unit, fuse the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame. The electronic device may, by executing the steps of the generation unit, generate a target video frame sequence based on the target key video frame and combine its frames to synthesize the HDR video.
In one embodiment, an electronic device is provided comprising a memory storing a computer program and a processor that, when executing the computer program, performs the steps of: acquiring an original video frame sequence generated during current video recording; performing motion detection on video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence, the first key video frame being a video frame in the original video frame sequence for which the average of the absolute pixel value changes of all pixel points relative to the adjacent previous video frame is greater than a first preset threshold; acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame; performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
In one embodiment, the processor when executing the computer program further performs the steps of: underexposure treatment is carried out on the second frame video frames in the original video frame sequence, and target second frame video frames are obtained; performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a first target key video frame; the second frame video frame is the next video frame of the first frame video frame in the original video frame sequence; and generating a target video frame sequence based on the first target key video frame and the target key video frame.
In one embodiment, the processor when executing the computer program further performs the steps of: judging whether the scene when the video is recorded currently is a high dynamic range HDR scene or not; if yes, performing motion detection on the video frames in the original video frame sequence, and acquiring a first key video frame in the original video frame sequence.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring brightness values of all pixel points of a current video frame in the original video frame sequence, and counting the number of the pixel points with the brightness values larger than a second preset threshold value; judging whether the number of the pixel points is larger than a preset number; if yes, the scene when the video is recorded currently is a high dynamic range HDR scene; if not, the scene when the video is recorded currently is not a high dynamic range HDR scene.
In one embodiment, the processor when executing the computer program further performs the steps of: and performing motion detection on video frames in the original video frame sequence based on an inter-frame difference method, and determining the first key video frame in the original video frame sequence.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring an overexposed region in the next video frame, and acquiring a first brightness average value of the overexposed region; acquiring a second brightness average value of a normal exposure area in the next video frame; acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value; and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
In one embodiment, the processor when executing the computer program further performs the steps of: and smoothing the transition region of the first key video frame and the second key video frame in the target key video frame.
According to the electronic device provided by the embodiment of the application, the original video frame sequence generated during current video recording is obtained; motion detection is performed on the video frames in the original video frame sequence to obtain a first key video frame, where the first key video frame is any video frame whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and the previous video frame; the next video frame after the first key video frame in the original video frame sequence is acquired and underexposed to obtain a second key video frame; the first key video frame and the second key video frame are fused to obtain a target key video frame; and a target video frame sequence is generated based on the target key video frame, whose frames are combined to synthesize the HDR video. By performing motion detection on the original video frame sequence and generating the target video frame sequence through underexposing and fusing only the frame following each frame whose pixel value change relative to the adjacent previous frame exceeds the first preset threshold, the embodiment avoids adjusting the exposure of every frame in the original video frame sequence, reduces the occupation of computing resources, enables HDR video synthesis on devices with limited memory, and improves the user experience.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon which, when executed by a processor, performs the steps of: acquiring an original video frame sequence generated during current video recording; performing motion detection on video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence, the first key video frame being a video frame in the original video frame sequence for which the average of the absolute pixel value changes of all pixel points relative to the adjacent previous video frame is greater than a first preset threshold; acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame; performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
In one embodiment, the computer program when executed by the processor further performs the steps of: performing underexposure processing on the second frame video frame in the original video frame sequence to obtain a target second frame video frame, where the second frame video frame is the next video frame after the first frame video frame in the original video frame sequence; performing fusion processing on the first frame video frame and the target second frame video frame to obtain a first target key video frame; and generating a target video frame sequence based on the first target key video frame and the target key video frame.
In one embodiment, the computer program when executed by the processor further performs the steps of: judging whether the scene of the currently recorded video is a high dynamic range (HDR) scene; and if so, performing motion detection on the video frames in the original video frame sequence to acquire the first key video frame in the original video frame sequence.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring the brightness values of all pixel points of the current video frame in the original video frame sequence, and counting the number of pixel points whose brightness value is greater than a second preset threshold; judging whether this number of pixel points is greater than a preset number; if so, the scene of the currently recorded video is a high dynamic range HDR scene; if not, the scene of the currently recorded video is not a high dynamic range HDR scene.
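The HDR-scene check described above can be sketched directly: count the pixel points brighter than the second preset threshold and compare the count against the preset number. The threshold values below are illustrative assumptions, not values taken from the patent.

```python
def is_hdr_scene(frame, second_threshold=220, preset_number=4):
    """Sketch of the HDR-scene decision: the scene counts as HDR when the
    number of pixel points whose brightness exceeds second_threshold is
    greater than preset_number. frame is a grayscale image as nested
    lists of 0-255 brightness values; thresholds are assumed defaults."""
    bright_count = sum(
        1 for row in frame for luma in row if luma > second_threshold
    )
    return bright_count > preset_number
```

In practice the preset number would be chosen relative to the frame resolution (e.g. a percentage of total pixels) rather than an absolute count.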
In one embodiment, the computer program when executed by the processor further performs the steps of: performing motion detection on the video frames in the original video frame sequence based on an inter-frame difference method, and determining the first key video frame in the original video frame sequence.
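The inter-frame difference method named above pairs naturally with the pixel change rate defined earlier (the mean absolute pixel value change between a frame and its predecessor). A minimal sketch, using grayscale frames as nested lists and a hypothetical threshold:

```python
def pixel_change_rate(prev_frame, curr_frame):
    """Mean absolute pixel-value difference between two grayscale frames,
    matching the definition of the pixel change rate in the description."""
    diffs = [
        abs(c - p)
        for prev_row, curr_row in zip(prev_frame, curr_frame)
        for p, c in zip(prev_row, curr_row)
    ]
    return sum(diffs) / len(diffs)

def find_first_key_frames(frames, first_threshold):
    """Indices of frames whose change rate relative to the previous frame
    exceeds the first preset threshold (candidate first key video frames)."""
    return [
        i for i in range(1, len(frames))
        if pixel_change_rate(frames[i - 1], frames[i]) > first_threshold
    ]
```

The choice of `first_threshold` trades sensitivity against cost: a lower value marks more frames as key frames and therefore triggers more underexposure-and-fusion work downstream.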
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring an overexposed region in the next video frame, and acquiring a first brightness average value of the overexposed region; acquiring a second brightness average value of a normal exposure area in the next video frame; acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value; and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
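The publication says only that the exposure compensation parameter is obtained "based on" the two brightness averages. One plausible realization — a ratio-based gain, which is an assumption rather than the patented formula — scales the frame down so the overexposed region approaches the normal region's brightness:

```python
def region_mean(frame, overexposed_mask, in_region):
    """Mean brightness of the pixels selected (or excluded) by the mask."""
    values = [
        px
        for frame_row, mask_row in zip(frame, overexposed_mask)
        for px, m in zip(frame_row, mask_row)
        if m == in_region
    ]
    return sum(values) / len(values)

def underexpose(frame, overexposed_mask):
    """Hypothetical exposure-compensation step: the first brightness
    average comes from the overexposed region, the second from the
    normally exposed region, and their ratio serves as a global gain.
    The ratio-based gain is an assumed mapping, not taken from the patent."""
    first_avg = region_mean(frame, overexposed_mask, True)    # overexposed
    second_avg = region_mean(frame, overexposed_mask, False)  # normal
    gain = second_avg / first_avg  # < 1 when the region is overexposed
    return [[min(255, int(px * gain)) for px in row] for row in frame]
```

On a real camera pipeline the compensation parameter would more likely drive the sensor's exposure time or ISO for the next capture rather than a post-hoc digital gain; the sketch only illustrates how the two averages can yield a single parameter.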
In one embodiment, the computer program when executed by the processor further performs the steps of: smoothing the transition region between the first key video frame and the second key video frame in the target key video frame.
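The smoothing step is described only at this level of generality; one plausible realization — assumed here, not specified by the patent — is a small box filter run across the pixels of the transition band between the two fused source regions:

```python
def smooth_row(row, radius=1):
    """Box filter over one row of transition-band pixels: each output
    value is the rounded mean of the values within `radius` positions.
    An assumed smoothing choice; the patent does not name a filter."""
    out = []
    for i in range(len(row)):
        lo = max(0, i - radius)
        hi = min(len(row), i + radius + 1)
        window = row[lo:hi]
        out.append(round(sum(window) / len(window)))
    return out
```

Applied across a hard seam such as `[0, 0, 100, 100]`, the filter replaces the abrupt step with a gradual ramp, which is the visible effect the smoothing step is after.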
The computer-readable storage medium provided by the embodiment of the application acquires an original video frame sequence generated during the current video recording; performs motion detection on the video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence, where the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and its previous video frame; acquires the next video frame after the first key video frame in the original video frame sequence, and performs underexposure processing on it to acquire a second key video frame; performs fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and generates a target video frame sequence based on the target key video frame, and combines the frames of the target video frame sequence to synthesize the HDR video. By performing motion detection on the original video frame sequence and only underexposing and fusing the frame that follows each frame whose pixel change relative to the adjacent previous frame exceeds the first preset threshold, the embodiment of the application avoids, as far as possible, adjusting the exposure of every frame in the original video frame sequence, reduces the occupation of computing resources, enables HDR video synthesis on devices with limited memory, and improves the user experience.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (SRAM), dynamic random access memory (DRAM), and the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this description.
The above examples illustrate only a few embodiments of the application, which are described in detail but are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. A method for HDR video generation, comprising:
acquiring an original video frame sequence generated when a video is recorded currently;
performing motion detection on video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence, wherein the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and its previous video frame;
acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame;
performing fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame; and
generating a target video frame sequence based on the target key video frame, and combining the frames of the target video frame sequence to synthesize the HDR video.
2. The method of claim 1, wherein after the obtaining the sequence of original video frames generated at the time of the current recorded video, the method further comprises:
performing underexposure processing on a second frame video frame in the original video frame sequence to obtain a target second frame video frame, wherein the second frame video frame is the next video frame after a first frame video frame in the original video frame sequence;
performing fusion processing on the first frame video frame and the target second frame video frame to obtain a first target key video frame; and
generating a target video frame sequence based on the first target key video frame and the target key video frame.
3. The method of claim 1, wherein after acquiring the original video frame sequence generated at the time of the current recorded video, the method further comprises:
judging whether the scene of the currently recorded video is a high dynamic range (HDR) scene; and
if so, performing motion detection on the video frames in the original video frame sequence to acquire the first key video frame in the original video frame sequence.
4. The method of claim 3, wherein said determining whether the scene at the time of the current recording of video is a high dynamic range HDR scene comprises:
acquiring brightness values of all pixel points of a current video frame in the original video frame sequence, and counting the number of pixel points whose brightness value is greater than a second preset threshold;
judging whether the number of such pixel points is greater than a preset number;
if so, the scene of the currently recorded video is a high dynamic range HDR scene;
if not, the scene of the currently recorded video is not a high dynamic range HDR scene.
5. The method of claim 1, wherein the motion detecting the video frames in the original video frame sequence to obtain the first key video frame in the original video frame sequence comprises:
performing motion detection on the video frames in the original video frame sequence based on an inter-frame difference method, and determining the first key video frame in the original video frame sequence.
6. The method of claim 1, wherein the acquiring a next video frame in the original video frame sequence after the first key video frame and underexposing the next video frame to acquire a second key video frame comprises:
acquiring an overexposed region in the next video frame, and acquiring a first brightness average value of the overexposed region;
acquiring a second brightness average value of a normal exposure area in the next video frame;
acquiring exposure compensation parameters corresponding to the next video frame based on the first brightness average value and the second brightness average value;
and carrying out exposure compensation processing on the next video frame according to the exposure compensation parameters to obtain the second key video frame.
7. The method of claim 1, wherein after the fusing of the first and second key video frames in the original sequence of video frames to obtain a target key video frame, the method further comprises:
smoothing a transition region between the first key video frame and the second key video frame in the target key video frame.
8. An HDR video generation apparatus, comprising:
the acquisition unit is used for acquiring an original video frame sequence generated when the video is recorded currently;
the detection unit is used for performing motion detection on the video frames in the original video frame sequence to obtain a first key video frame in the original video frame sequence, wherein the first key video frame is a video frame in the original video frame sequence whose pixel change rate relative to the previous video frame is greater than a first preset threshold, and the pixel change rate represents the degree of pixel value change between a video frame and its previous video frame;
The processing unit is used for acquiring a next video frame after the first key video frame in the original video frame sequence, and performing underexposure processing on the next video frame to acquire a second key video frame;
The fusion unit is used for carrying out fusion processing on the first key video frame and the second key video frame in the original video frame sequence to obtain a target key video frame;
and the generating unit is used for generating a target video frame sequence based on the target key video frame and combining the frames of the target video frame sequence to synthesize the HDR video.
9. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202311828312.7A 2023-12-27 2023-12-27 HDR video generation method and device Pending CN117939313A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311828312.7A CN117939313A (en) 2023-12-27 2023-12-27 HDR video generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311828312.7A CN117939313A (en) 2023-12-27 2023-12-27 HDR video generation method and device

Publications (1)

Publication Number Publication Date
CN117939313A true CN117939313A (en) 2024-04-26

Family

ID=90762224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311828312.7A Pending CN117939313A (en) 2023-12-27 2023-12-27 HDR video generation method and device

Country Status (1)

Country Link
CN (1) CN117939313A (en)


Legal Events

Date Code Title Description
PB01 Publication