WO2021036991A1

WO2021036991A1 - High dynamic range video generation method and device

Info

Publication number: WO2021036991A1
Application number: PCT/CN2020/110825
Authority: WO
Inventors: 许亦然; 张俪耀; 马靖
Original assignee: 华为技术有限公司
Priority date: 2019-08-30
Filing date: 2020-08-24
Publication date: 2021-03-04
Also published as: CN112449120A; CN112449120B

Abstract

The present application provides a high dynamic range (HDR) video generation method and device, relating to the technical field of artificial intelligence (AI). The method comprises: a terminal apparatus receiving a processing request inputted by a user and requesting generation of a HDR video; dividing each first image frame comprised in an original video to at least two types of regions, wherein first image features of different types of regions are different; processing the different types of regions by using corresponding first enhancement processing methods to acquire enhanced regions; combining multiple enhanced regions of each first image frame to acquire a second image frame respectively corresponding to each first image frame; and acquiring a HDR video on the basis of these second image frames. In the process, acquiring images of different exposure times does not require each first image frame to be captured, and therefore a long exposure process is not necessary. The terminal apparatus divides an image frame of an original video into different regions, which are further divided into at least two types of regions. An enhancement scheme of each type of region can be determined according to a first image feature of the type of region, such that different types of regions use different enhancement schemes.

Description

Method and device for generating high dynamic range video

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on August 30, 2019, the application number is 201910817877.2, and the application name is "High Dynamic Range Video Generation Method and Apparatus", the entire content of which is incorporated into this application by reference in.

Technical field

This application relates to the technical field of artificial intelligence (AI), and in particular to a method and device for generating a high dynamic range video.

Background technique

In the field of image processing, dynamic range refers to the ratio of the brightness of the brightest light to the brightness of the darkest light on an image, and the dynamic range is one of the important dimensions of image quality evaluation. According to the dynamic range, images can be divided into high dynamic range (HDR) images, low dynamic range (LDR) images, and so on.

Generally, in order to obtain an HDR image, the same object can be photographed multiple times in a short period of time to obtain an LDR image of the object with different exposure durations, and then these LDR images are synthesized into an HDR image. For example, in a long exposure process, a short exposure image, a medium exposure image, and a long exposure image of the object are acquired, and then the three images are merged to obtain an HDR image.

With the development of HDR technology and the gradual popularization of HDR displays, users’ demand for HDR videos is gradually increasing. At this time, terminal devices are required to have the ability to shoot HDR videos or to convert LDR videos into HDR videos. However, the video is composed of individual image frames. If each image frame is composed of images with different exposure time, the terminal device needs to shoot multiple images with different exposure time for each image frame in a short time, and then These images are combined into image frames, and then an HDR video is obtained from the multiple image frames. Obviously, limited by the exposure time, the dynamic range of the video captured by the terminal device is limited.

Summary of the invention

The embodiments of the present application provide a method and device for generating a high dynamic range video. The HDR video is obtained by dividing an image frame into different regions and using different enhancement schemes for different regions.

In the first aspect, an embodiment of the present application provides a method for generating a high dynamic range HDR video. The method is described from the perspective of a terminal device. The method includes: after the terminal device receives a user input request to generate an HDR video processing request , Divide each first image frame contained in the original video into at least two types of areas. Different types of areas have different first image features. Use the corresponding first enhancement processing method for the different types of areas to obtain the enhanced area, and then , Splicing multiple regions after the enhancement processing of each first image frame to obtain a second image frame corresponding to each first image frame, and obtain an HDR video based on the second image frames. In this process, there is no need to take multiple images with different exposure time for each first image frame. Therefore, it is not limited to the long exposure process. The terminal device divides the image frame into different areas by dividing the original video. The area of is divided into at least two types of areas, and the enhancement scheme for this type of area can be determined according to the first image feature of each type of area, so that different types of areas use different enhancement schemes. In addition, since the embodiment of the present application processes each image frame to obtain HDR video, it is not necessary to shoot images with a long exposure time, for example, there is no need to shoot images with medium exposure time and long exposure time, that is to say, the implementation of this application For example, there is no need for terminal equipment to obtain high-dynamic video by lengthening the exposure. Therefore, the obtained HDR video will not have ghost images.

In a feasible design, the terminal device uses the corresponding first enhancement processing method to process each of the multiple regions, and after obtaining the multiple regions after the enhancement processing, it also performs semantic segmentation on the first region to separate The first area is divided into a plurality of sub-areas, and the plurality of sub-areas contains at least two types of sub-areas. The second image features of the different types of sub-areas are different. The first area is any one of the multiple areas. The first image feature and the second image feature The second image feature is an image feature of different dimensions. The second enhancement processing method of each sub-region is determined, and the corresponding second enhancement processing method is used for each sub-region of the multiple sub-regions, and the multiple sub-regions after the enhancement processing are stitched together. Area, get the first area. With this solution, the terminal device can further divide the area into multiple sub-regions according to the second image feature, determine the second enhancement processing mode for each sub-region, and adopt the second enhancement processing mode to perform the Perform processing to achieve the purpose of further enhancing the processing of each sub-region.

In a feasible design, when the terminal device divides the first image frame contained in the original video into multiple regions according to the processing request, each pixel of the first image frame is extracted from the YUV color space data of the first image frame According to the Y data of each pixel of the first image frame, the gradient of each pixel is determined, and the first image frame is divided into multiple regions according to the gradient of each pixel. With this solution, the purpose of dividing the first image frame into multiple regions according to brightness is achieved.

In a feasible design, the terminal device converts the first image frame into a YUV image before acquiring the YUV color space data of the first image frame. With this solution, the purpose of dividing the first image frame into multiple regions according to brightness is achieved.

In a feasible design, the terminal device uses the corresponding first enhancement processing method to process each of the multiple regions respectively, and obtains the histogram of each of the multiple regions before obtaining the multiple regions after the enhancement processing. In the figure, the dynamic range value of the corresponding area is determined according to each histogram, and the first enhancement processing mode corresponding to each area is determined according to each dynamic range value. By adopting this solution, the purpose of obtaining the first enhanced processing mode of the area by the terminal device according to the dynamic range value of the area is realized.

In a feasible design, each area contains the target area, and the dynamic range value includes the high-exposure dynamic range value and the low-light dynamic range value. When the terminal device determines the dynamic range value of the corresponding area according to each histogram, it extracts the peak value of the target area, Average value and area ratio. This area ratio is used to indicate the ratio of high exposure area and low exposure area. According to the peak value, average value and area ratio, determine the high exposure dynamic range value and low brightness dynamic range value of the target area. Using this solution, the terminal device determines the high-exposure dynamic range and low-brightness dynamic range values of the target area, and then determines the first enhancement processing method according to the dynamic distribution, so that the terminal device uses different enhancement processing solutions for different types of areas. deal with.

In a feasible design, the original video is the video currently being shot by the electronic device; or, the original video is the local LDR video of the electronic device. With this solution, the terminal device can record HDR video in real time or convert a pre-recorded video into HDR video.

In the second aspect, an embodiment of the present application provides a high dynamic range HDR video generation device, including:

The transceiver unit is configured to receive a processing request input by a user, and the processing request is used to request the generation of an HDR video;

The processing unit, the user divides the first image frame contained in the original video into a plurality of regions according to the processing request, and the plurality of regions include at least two types of regions. Each area in each area is processed by the corresponding first enhancement processing method to obtain multiple areas after the enhancement processing, and the multiple areas after the enhancement processing are spliced to obtain a second image frame, which is generated according to the second image frame HDR video.

In a feasible design, the processing unit is configured to perform semantic segmentation on the first area to divide the first area into a plurality of sub-areas, and the plurality of sub-areas includes at least two types of sub-areas, and different types of sub-areas. The second image feature of the region is different, the first region is any one of the multiple regions, the first image feature and the second image feature are image features of different dimensions, and the difference of each subregion is determined In the second enhancement processing manner, each of the multiple sub-regions is processed by a corresponding second enhancement processing manner, and the multiple sub-regions after the enhancement processing are spliced to obtain the first region.

In a feasible design, the processing unit is configured to extract the Y data of each pixel of the first image frame from the YUV color space data of the first image frame, and according to the first image frame The Y data of each pixel of, determine the gradient of each pixel, and divide the first image frame into a plurality of regions according to the gradient of each pixel.

In a feasible design, the processing unit is further configured to convert the first image frame into a YUV image before acquiring the YUV color space data of the first image frame.

In a feasible design, the processing unit is used to process each of the multiple regions by using the corresponding first enhancement processing mode to obtain the multiple regions after the enhancement processing. According to the histogram of each of the multiple areas, the dynamic range value of the corresponding area is determined according to each of the histograms, and the first enhancement processing mode corresponding to each area is respectively determined according to each of the dynamic range values.

In a feasible design, each region includes a target region, the dynamic range value includes a high-exposure dynamic range value and a low-brightness dynamic range value, and the processing unit is configured to extract the peak value and average value of the target region And the area proportion, the area proportion is used to indicate the ratio of the high-exposure area and the low-exposure area, and the high-exposure dynamics of the target area is determined according to the peak value, the average value, and the area proportion Range value and the low-light dynamic range value.

In a feasible design, the original video is a video currently being shot by the terminal device, or the original video is a local LDR video of the terminal device.

In a third aspect, an embodiment of the present application provides a terminal device, including: a processor, a memory, and a computer program stored on the memory and capable of running on the processor, and when the processor executes the program Perform the method in the first aspect or various possible implementation manners of the first aspect above.

In a fourth aspect, the embodiments of the present application provide a computer program product containing instructions, which when run on a terminal device, cause the terminal device computer to execute the method in the first aspect or various possible implementation manners of the first aspect. .

In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium that stores instructions in the computer-readable storage medium, which when run on a terminal device, causes the terminal device to execute the first aspect or the first aspect described above. Of the various possible implementation methods.

In a sixth aspect, an embodiment of the present application provides a chip system. The chip system includes a processor and may also include a memory, configured to implement the functions of the terminal device in the foregoing method. The chip system can be composed of chips, or it can include chips and other discrete devices.

In the high dynamic range HDR video generation method and device provided by the embodiments of the present application, after receiving a processing request input by a user requesting to generate an HDR video, the terminal device divides each first image frame contained in the original video into at least two types of regions, different types The first image features of the regions are different, and the different types of regions are processed by the corresponding first enhancement processing method to obtain the enhanced region. After that, the enhanced regions of each first image frame are spliced to obtain each first image frame. A second image frame corresponding to each image frame is used to obtain an HDR video based on the second image frames. In this process, there is no need to take multiple images with different exposure time for each first image frame. Therefore, it is not limited to the long exposure process. The terminal device divides the image frame into different areas by dividing the original video. The area of is divided into at least two types of areas, and the enhancement scheme for this type of area can be determined according to the first image feature of each type of area, so that different types of areas use different enhancement schemes. In addition, since the embodiment of the present application processes each image frame to obtain HDR video, it is not necessary to shoot images with a long exposure time, for example, there is no need to shoot images with medium exposure time and long exposure time, that is to say, the implementation of this application For example, there is no need for terminal equipment to obtain high-dynamic video by lengthening the exposure. Therefore, the obtained HDR video will not have ghost images.

Description of the drawings

Figure 1 is a schematic diagram of the process of generating HDR images;

Figure 2 is a flowchart of an HDR video generation method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of an input processing request in the HDR video generation method provided by an embodiment of the present application;

FIG. 4 is another schematic diagram of the input processing request in the HDR video generation method provided by the embodiment of the present application;

FIG. 5 is a schematic diagram of each area of the first image frame in the HDR video generation method provided by the embodiment of the present application;

6 is another schematic diagram of each area of the first image frame in the HDR video generation method provided by the embodiment of the present application;

FIG. 7 is a schematic diagram of the process of distinguishing segmentation and sub-region segmentation in the HDR video generation method provided by an embodiment of the present application;

FIG. 8 is a flowchart of dividing a first image frame in the HDR video generation method provided by an embodiment of the present application;

FIG. 9 is a flowchart of determining the first enhancement processing mode in the HDR video generation method provided by the embodiment of the present application;

FIG. 10 is a schematic diagram of regions and histograms in the HDR video generation method provided by an embodiment of the present application;

FIG. 11 is a schematic diagram of a process of an HDR video generation method provided by an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a high dynamic range HDR video generation device provided by an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of a terminal device provided by an embodiment of the present application;

FIG. 14 is a schematic diagram of the hardware structure of a terminal device provided by an embodiment of the application.

detailed description

At present, in order to obtain HDR images, it is necessary to obtain multiple LDR images with different exposure durations, and then synthesize these LDR images into HDR images. Figure 1 is a schematic diagram of the process of generating HDR images. Please refer to Figure 1, a long exposure process T includes three stages: a short exposure stage, a medium exposure stage and a long exposure stage. The exposure durations corresponding to the three stages are t1, t2, and t3, respectively, where t1<t2<t3 , T1+t2+t3≤T. The terminal device obtains three images with different exposure durations, which are a short exposure duration image, a medium exposure duration image, and a long exposure duration image; after that, the three images are fused to obtain an HDR image.

According to the above: in Figure 1, three images with different exposure durations in a long exposure process are merged. Some or all of the images are likely to appear ghosts due to the long exposure time and the user's hand shaking when shooting. For example, ghost images may appear in medium-exposure duration images and long-exposure duration images, resulting in HDR images synthesized from these images. When this method is applied to generate HDR video, it is necessary to generate images with different exposure durations for each image frame in the video, and many objects in the video are motion, resulting in ghost images in images with different exposure durations; in addition, a The long exposure process T is limited. Obviously, the terminal device cannot shoot multiple image frames in the limited long exposure process T, and shoot images with different exposure durations for each image frame. Therefore, the above-mentioned method of synthesizing HDR images from multiple LDR images with different exposure durations is not suitable for HDR video generation.

Therefore, the embodiments of the present application provide a method and device for generating a high dynamic range video, which divide the original video into different regions and use different enhancement schemes to obtain HDR videos.

In the embodiment of the present application, the terminal device may directly shoot the HDR video, or the terminal device may also process the local original video to obtain the HDR video. Among them, the original video may be an LDR video, a standard dynamic range (Standard Dynamic Range, SDR) video, etc., pre-shot by the terminal device or downloaded from the server.

The terminal device involved in the embodiment of the present application may be a terminal device capable of video broadcasting and video recording. In the embodiments of this application, the terminal device may communicate with one or more core networks or the Internet via a radio access network (e.g., radio access network, RAN), and may be a mobile terminal device, such as a mobile phone (or called a "cellular "Phones, mobile phones), computers, and data cards, for example, can be portable, pocket-sized, handheld, computer-built or vehicle-mounted mobile devices, and they exchange language and/or data with the wireless access network. For example, personal communication service (PCS) phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDAs), and tablets Computers (Pad), computers with wireless transceiver functions and other equipment. Wireless terminal equipment can also be called system, subscriber unit, subscriber station, mobile station, mobile station (MS), remote station (remote station), access point ( access point (AP), remote terminal equipment (remote terminal), access terminal equipment (access terminal), user terminal equipment (user terminal), user agent (user agent), subscriber station (SS), user terminal equipment (customer premises equipment, CPE), terminal (terminal), user equipment (user equipment, UE), mobile terminal (mobile terminal, MT), etc. Wireless terminal devices can also be wearable devices and next-generation communication systems, for example, terminal devices in 5G networks or terminal devices in public land mobile network (PLMN) networks that will evolve in the future, and in NR communication systems. Terminal equipment, etc.

In the following, taking HDR video shooting by a terminal device as an example, the HDR video generation method described in the embodiment of the present application will be described in detail.

Fig. 2 is a flowchart of an HDR video generation method provided by an embodiment of the present application. This embodiment is described from the perspective of a terminal device, and this embodiment includes:

101. Receive a processing request input by a user, where the processing request is used to request generation of an HDR video.

Exemplarily, the user may input a processing request to the terminal device through a voice method, a touch method, or the like. Exemplary, please refer to Fig. 3 and Fig. 4.

Fig. 3 is a schematic diagram of an input processing request in the HDR video generation method provided by an embodiment of the present application. Referring to Figure 3, in this embodiment, an HDR video generation application (application, APP) is installed on the terminal device. The APP can be an APP that comes with the operating system of the terminal device, such as a camera, or it can be downloaded and installed by the user. To the third-party APP on the terminal device. Taking the APP as a third-party APP as an example, when the HDR video needs to be generated, the user clicks on the HDR video on the desktop to generate the APP to start the APP. In addition to the shooting button on the APP's interface, there are also options for users to choose, such as time-lapse shooting, slow motion, video, photo, panorama, etc. After the user selects the video (as shown by the dashed box in the figure), the terminal device appears to turn on the two options of flash and HDR for the user to choose. If the user selects the HDR option, the terminal device will shoot the HDR video in real time.

FIG. 4 is another schematic diagram of an input processing request in the HDR video generation method provided by the embodiment of the present application. Please refer to FIG. 4, the difference between this embodiment and FIG. 3 is that in this embodiment, after the user starts the APP, in addition to the shooting button, there are also options for the user to select, such as time-lapse shooting, Slow motion, normal video, HDR video, photo, panorama and other options, users can select HDR video, then you can shoot HDR video in real time.

It should be noted that although the foregoing Figures 3 and 4 illustrate the embodiments of the present application by taking real-time HDR video shooting as an example, the embodiments of the present application are not limited. In other feasible implementation manners, it may also be a user After starting the APP, a selection button is also displayed on the APP interface. When the user clicks the selection button, the terminal device pops up a selection page. The selection page displays a series of videos that have been downloaded or recorded by the terminal device. After the user selects the original video , The terminal device generates an HDR video based on the selected original video.

102. According to the processing request, divide the first image frame included in the original video into a plurality of regions, the plurality of regions include at least two types of regions, and the first image features of the different types of regions are different.

Exemplarily, when the terminal device directly shoots the HDR video, the original video is a video captured in real time in the viewing frame of the terminal device, and the first image frame is an image frame in the original video. When the terminal device converts the local video to HDR video, the original video is the video stored locally on the terminal device, but it is pre-recorded or downloaded from the server by the terminal device, and the first image frame is any image frame in the original video. For each first image frame, the terminal device divides the first image frame into a plurality of areas. These areas can be divided into at least two types of areas. Each type of area includes at least one area. When at least one area includes two Or when there are more than two regions, the two or more regions are not continuous in the first image frame, and the first image features of regions of different types are different. Among them, the image characteristics include brightness, contained content, color, texture, shape, and so on. In the following, how to divide the image frame into multiple regions will be described by taking the image feature as the brightness and the content contained in the image as an example. Exemplary, please refer to Fig. 5 and Fig. 6.

FIG. 5 is a schematic diagram of each area of the first image frame in the HDR video generation method provided by an embodiment of the present application. In the embodiment of this application, the image frame is a YUV image, and the YUV color space data of the first image frame can be obtained according to the first image frame, where the Y data represents brightness (luminance or luma), that is, brightness; and " U" and "V" represent chrominance (chrominance or chroma), which are used to describe the color and saturation of an image. Y data is also called brightness information, and U data and V data are also called color information. Assuming that all the brightness has a brightness of 0 to 255, a total of 255 gray scales, the brightness of the 255 gray scale is divided into three types in advance: 0 to 193 gray scale interval is low brightness interval, 194 to 223 gray scale interval is middle brightness interval, 224 The ～255 gradation interval is the high brightness interval. Referring to FIG. 5, the terminal device extracts the brightness of each pixel of the first image frame, and determines the interval range to which the brightness of each pixel belongs, so as to divide the first image frame into three brightness regions according to the brightness, each of which is high brightness Area, medium-brightness area and low-brightness area, the same brightness area can be discontinuous. As shown in Figure 5, the high-brightness area includes area A and area B, the medium-brightness area includes area C and area D, and the low-brightness area includes area E.

FIG. 6 is another schematic diagram of each area of the first image frame in the HDR video generation method provided by the embodiment of the present application. Referring to Figure 6, the terminal device uses image detection algorithms, such as inter-frame difference method, background modeling method, point detection method, image segmentation method, cluster analysis method, and motion vector field method, to detect whether the first image frame contains Specific content, such as characters, sky, buildings, trees, etc., may not be continuous in areas containing the same specific content. The terminal device divides the first image frame into four types of areas according to the content, which are a person area, a sky area, a building area, and a tree area.

103. Use a corresponding first enhancement processing manner to process each of the multiple regions, respectively, to obtain multiple regions after enhancement processing.

Exemplarily, the terminal device determines the corresponding first enhancement processing mode for each type of area, and the first enhancement processing mode is different for different types of areas. After that, for each area, the terminal device adopts the first enhancement processing manner corresponding to the area to perform enhancement processing on the area. For example, referring to Figure 5, the terminal device performs color-brightening processing on high-brightness areas, such as area A and area B; for medium-brightness areas, such as area C and area D, it performs processing to increase the contrast; for low-brightness areas, such as area E Perform processing to increase the exposure intensity. For another example, please refer to Figure 6, the terminal device performs skin tone enhancement processing on the character area; color brightening processing on the sky area; blurring processing on the building area; texture enhancement processing on the tree area to ensure that the first image frame The quality of each area of the video is improved, thereby ensuring the improvement of video quality.

104. Join the multiple regions after the enhancement processing to obtain a second image frame.

Exemplarily, in the above step 103, multiple regions after the enhancement processing of each first image frame can be obtained. In this step, for each first image frame, through alpha-blending (alpha-blending, alpha-blending) , Laplacian blending (laplacian blending), etc., splicing multiple regions after enhancement processing to obtain a second image frame corresponding to each first image frame.

105. Generate the HDR video according to the second image frame.

Exemplarily, the original video includes multiple first image frames, and different first image frames correspond to different second image frames. After obtaining the second image frame from the first image frame, the terminal device fuses the second image frames to obtain the HDR video.

In the HDR video generation method provided by the embodiments of the present application, after receiving a user input request to generate an HDR video processing request, the terminal device divides each first image frame contained in the original video into at least two types of regions, and the first image of the different types of regions With different characteristics, the different types of regions are processed by the corresponding first enhancement processing method to obtain the enhanced region. After that, the enhanced regions of each first image frame are spliced to obtain the corresponding first image frame. Based on the second image frames of, the HDR video is obtained. In this process, there is no need to take multiple images with different exposure time for each first image frame. Therefore, it is not limited to the long exposure process. The terminal device divides the image frame into different areas by dividing the original video. The area of is divided into at least two types of areas, and the enhancement scheme for this type of area can be determined according to the first image feature of each type of area, so that different types of areas use different enhancement schemes. In addition, since the embodiment of the present application processes each image frame to obtain HDR video, it is not necessary to shoot images with a long exposure time, for example, there is no need to shoot images with medium exposure time and long exposure time, that is to say, the implementation of this application For example, there is no need for terminal equipment to obtain high-dynamic video by lengthening the exposure. Therefore, the obtained HDR video will not have ghost images.

It should be noted that in the above implementation, it is not necessary to divide the region, but to use a certain enhancement processing method for the first image frame to perform enhancement processing. The specific enhancement processing method used can be based on the first image frame. The overall condition is determined. For example, if the overall brightness of the first image frame is relatively good, the overall color of the first image frame is brightened. This method has limitations and cannot improve the effect of the dynamic range of the entire scene. Therefore, by customizing the enhancement processing scheme by region, the limitations brought by a single enhancement processing scheme can be improved.

In the foregoing embodiment, each type of area corresponds to a first enhanced processing mode, and the terminal device uses the corresponding first enhanced processing mode for processing for each area. Further, for each area, the terminal device may further divide the area into a plurality of sub-areas according to the second image characteristics, determine the second enhancement processing mode for each sub-areas, and adopt the second enhancement processing mode, Each sub-area is processed. The following describes in detail how the terminal device divides sub-areas and so on.

In a feasible implementation manner, the terminal device first divides the first image frame into a plurality of regions, these regions include at least two types of regions, the first image features of the same type of regions are the same, and the first image features of different types of regions are different. . After that, for any one of the multiple areas included in a type of area, hereinafter referred to as the first area, the terminal device performs semantic segmentation on the first area to divide the first area into multiple sub-areas. The sub-region includes at least two types of sub-regions. The second image features of the same category of sub-regions are the same, and the second image features of different categories of sub-regions are different. The first image feature and the second image feature are images of different dimensions. Feature, determine the second enhancement processing method for each sub-region, and process each of the multiple sub-regions by using the corresponding second enhancement processing method respectively, and splice the multiple sub-regions after the enhancement processing to obtain the first One area.

Exemplarily, for any one area of the multiple areas of the first image frame, which is referred to as the first area hereinafter, the terminal device divides the first area into multiple sub-areas according to the second image feature. FIG. 7 is a schematic diagram of the process of distinguishing segmentation and sub-region segmentation in the HDR video generation method provided by an embodiment of the present application. Referring to Figure 7, assuming that the first image feature is brightness and the second image feature is the content contained in the image, the terminal device divides the first image frame into a high-brightness area, a low-brightness area, and a medium-brightness area. Take a high-brightness area as an example. The terminal device further performs semantic segmentation on the high-brightness area, and divides the high-brightness area into a person sub-areas, a sky sub-areas, a grass sub-areas, etc. (not shown in the figure). Then, determine the second enhancement processing method corresponding to each sub-region, and use the second enhancement processing method to enhance the corresponding sub-region; then, stitch each sub-region to obtain the corresponding region, and then stitch each region, Get the second image frame.

It should be noted that although in the above embodiment, after the first image frame is divided into at least two types of areas, each area in each type of area is processed by the first enhancement processing method corresponding to the area type, and then After dividing any one of the types of areas (hereinafter referred to as the first area) to obtain at least two types of sub-areas, the second enhancement processing method corresponding to the sub-area category is used for each sub-area for processing. However, the embodiments of the present application are not limited. In other feasible implementation manners, after the first image frame is divided into multiple regions, these regions are not processed, but each region is further divided into subregions. After that, each sub-region is processed. Or, after sub-regions are obtained by dividing each region, different enhancement processing methods are adopted according to the image content in the sub-regions. For example, referring to Figure 6 again, taking the first area as the person area as an example, the terminal device performs semantic segmentation on the first area to obtain two sub-areas, where the person in one sub-area is male, and the person in the other sub-area For females, different enhancement processing methods are used to process the two character sub-regions.

In this embodiment, the terminal device may further divide the area into a plurality of sub-areas according to the second image feature, determine the second enhancement processing method for each sub-areas, and adopt the second enhancement processing method for each sub-area. Perform processing to achieve the purpose of further enhancing the processing of each sub-region.

In the following, taking the first image feature as brightness as an example, how the terminal device divides the first image frame included in the original video into multiple regions according to the processing request will be described in detail. Exemplarily, refer to FIG. 8. FIG. 8 is a flowchart of dividing the first image frame in the HDR video generation method provided in an embodiment of the present application. This embodiment includes:

201. Acquire a first image frame.

Exemplarily, the terminal device sequentially inputs any image frame of the original video, that is, the first image frame, into the HDR video generating device. For example, when the original video is a recorded video, the terminal device performs framing processing on it to obtain an image frame sequence, and according to the order of each image frame in the image frame sequence, the image frames are sequentially input to the HDR video generating device. For another example, when the original video is the video currently being shot by the terminal device, the terminal device captures each image frame frame by frame and sequentially inputs it into the HDR video generating device. Among them, the first image frame is a YUV image. If the first image frame is not a YUV image, such as red green blue (RGB), etc., the first image frame needs to be converted into a YUV image, and the hue, brightness, and saturation ( hue Luminance Saturation, HLS) and other images that facilitate the extraction of brightness information.

202. Extract Y data of each pixel of the first image frame from the YUV color space data of the first image frame.

Exemplarily, the Y data in the YUV color space data represents brightness (luminance or luma), that is, brightness; while "U" and "V" represent chrominance (chrominance or chroma), which are used to describe the color of an image And saturation. The terminal device extracts the Y data of each pixel of the first image frame, thereby obtaining multiple Y data.

203. The terminal device filters the sum data to obtain smooth Y data.

Exemplarily, the terminal device uses Gaussian filtering or the like to perform smoothing processing on the Y data, so as to obtain smooth Y data.

204. Determine the gradient of each pixel point according to the Y data of each pixel point of the first image frame.

Exemplarily, the terminal device uses Robert edge detection operator, Sobel edge detection operator, etc., to determine the gradient G of each pixel.

Among them, G _x is the gradient in the horizontal axis direction of a pixel, and G _y is the gradient in the vertical axis direction of the corresponding pixel. Assuming that in the horizontal axis direction, pixel point a and pixel point b are two adjacent pixels, and in the vertical axis direction, pixel point a and pixel point c are two adjacent pixels, then for pixel point a , G _x is equal to the absolute value of the difference between the Y data of pixel a and the Y data of pixel b, and G _y is equal to the absolute value of the difference between the Y data of pixel a and the Y data of pixel c.

During the segmentation process, a first threshold and a second threshold are preset, where the first threshold is expressed as a high threshold (Highthreshold). If a pixel is a high-brightness pixel, the gradient of the pixel cannot be lower than the first threshold; The second threshold is expressed as a low threshold. If a pixel is a low-brightness pixel, the gradient of the pixel cannot be higher than the second threshold. For any pixel in the first image frame, mark the gradient of the pixel as G _p . If G _{p is} greater than or equal to the first threshold, it means that the pixel is a high-brightness pixel; if G _{p is} less than or equal to the second threshold , It means that the pixel is a low-brightness pixel; if G _{p is} between the first threshold and the second threshold, it means that the pixel is a medium-brightness pixel.

205. Divide the first image frame into multiple regions according to the gradient of each pixel.

Exemplarily, the terminal device uses the pixels in the first image frame that exceed the first threshold and are grouped together as one area. Since the pixels with gradients exceeding the first threshold are likely to be distributed in different places in the first image frame, a threshold can be set, such as 100. Continuous 100 or more pixels with gradients exceeding the first threshold form a high Brightness area. In this way, there is likely to be more than one high-brightness area in the first image frame. In the same way, the low-brightness area and the medium-brightness area are obtained in the same way, so that the first image frame is divided into multiple areas.

In this embodiment, the purpose of dividing the first image frame into multiple regions according to brightness is achieved.

Hereinafter, how to determine the first enhancement processing mode of each category of region in the above-mentioned embodiment will be described in detail. Exemplarily, refer to FIG. 9. FIG. 9 is a flowchart of determining the first enhancement processing mode in the HDR video generation method provided in an embodiment of the present application. This embodiment includes:

301. Obtain a histogram of each of the multiple regions.

Exemplarily, for each of the multiple areas, the terminal device may make a histogram of each area based on color characteristics and the like. FIG. 10 is a schematic diagram of regions and histograms in the HDR video generation method provided by an embodiment of the present application. Please refer to FIG. 10, for each of the multiple regions, the histogram is f(x), x=0,1,2,...255.

302. Determine the dynamic range value of the corresponding area according to each of the histograms.

Exemplarily, each area includes a target area, and the dynamic range value includes a high-exposure dynamic range value and a low-brightness dynamic range value. For the target area, the terminal device extracts the peak value and the average value based on the histogram corresponding to the target area. The dynamic range (DR) value of the target area is determined based on parameters such as the value, the proportion of high-exposure/low-bright areas, and so on.

For example, please refer to Figure 10 again, assuming that the peak value p={x|where max(f(x))};

average value

Proportion of high-exposure/low-bright areas

Or, the proportion of high-exposure/low-bright areas

High exposure

Low exposure

Among them, a, b, and c are positive numbers, and the lower the DR value, the better the dynamic range. The high-exposure DR is used to reflect the high-exposure degree of the target area, and the low-exposure DR is used to reflect the low-exposure degree of the target area. If the high-exposure DR to and the low-exposure DR value are both low, the dynamic range of the target area is better.

303. Determine the first enhancement processing mode corresponding to each region according to each of the dynamic range values.

Exemplarily, different dynamic range values correspond to different first enhancement processing methods. After determining the dynamic range value of an area, the terminal device can obtain the first enhancement processing method of the area by looking up a table or the like. For example, for the target area, the terminal device determines the first enhancement processing mode of the target area according to the exposure DR value and the low-brightness DR value.

In this embodiment, the purpose of obtaining the first enhanced processing mode of the area by the terminal device according to the dynamic range value of the area is achieved.

In the following, an example is used to describe the above-mentioned HDR video generation method in detail. Exemplarily, refer to FIG. 11, which is a schematic diagram of a process of an HDR video generation method provided in an embodiment of the present application.

Referring to FIG. 11, the terminal device is currently recording a video V0, which is the original video described above. In the process of recording V0, the terminal device captures image frames frame by frame. The first captured image frame contains the bridge hole and the scenery. The bridge hole is darker, and the scenery outside the bridge hole is brighter. The terminal device divides the first image frame into area 1 and area 2 according to brightness, where area 1 is a bridge hole portion, which is a low-brightness area, area 2 is a scenery portion, and this portion is a high-brightness area. The first enhancement processing method corresponding to area 1 is high-exposure recovery processing, and the first enhancement processing method corresponding to area 2 is dark area brightening processing, and the terminal device performs high-exposure recovery processing on area 1, so that the bridge holes in area 1 are exposed When the intensity is enhanced, the terminal device performs dark area brightening processing on area 2, so that the brightness of the scenery in area 2 is enhanced. After that, the terminal device uses an edge fusion algorithm to splice the strongly processed area 1 and the enhanced processed area 2 to obtain a second image frame. In the process of shooting video, the terminal device performs the above-mentioned processing on each captured first image frame and synthesizes the second image frame to obtain the second image frame corresponding to each image frame, and finally, synthesize each second image frame. Image frame, get HDR video.

It should be noted that, in the foregoing embodiment, when the terminal device divides the first image frame into regions, it divides the first image frame according to the first image feature. However, the embodiment of the present application is not limited. In other feasible implementation manners, the terminal device may also combine the first image feature and the second image feature to partition the first image frame. For example, if the first image feature is brightness and the second image feature is a person, when the terminal device divides the first image frame containing the person, it can divide the person in the first image frame and use it as a region. The remaining area is subdivided into high-brightness areas and low-brightness areas.

The following are device embodiments of the present invention, which can be used to implement the method embodiments of the present invention. For details that are not disclosed in the device embodiment of the present invention, please refer to the method embodiment of the present invention.

FIG. 12 is a schematic structural diagram of a high dynamic range HDR video generation device provided by an embodiment of the present invention. The HDR video generating apparatus 100 may be implemented in software and/or hardware. As shown in FIG. 12, the HDR video generation device 100 includes:

The transceiver unit 11 is configured to receive a processing request input by a user, and the processing request is used to request the generation of an HDR video;

The processing unit 12, the user divides the first image frame contained in the original video into a plurality of regions according to the processing request, the plurality of regions include at least two types of regions, and the first image features of the different types of regions are different. Each of the multiple regions is processed by the corresponding first enhancement processing method to obtain multiple regions after the enhancement processing, and the multiple regions after the enhancement processing are spliced to obtain a second image frame, according to the second image frame Generate HDR video.

In a feasible implementation manner, the processing unit 12 is configured to perform semantic segmentation on the first area to divide the first area into a plurality of sub-areas, and the plurality of sub-areas includes at least two types of sub-areas. The second image features of the category subregions are different, the first region is any one of the multiple regions, the first image feature and the second image feature are image features of different dimensions, and each subregion is determined The second enhancement processing method of the region, for each of the plurality of subregions, respectively uses the corresponding second enhancement processing method to process, and the multiple subregions after the enhancement processing are spliced to obtain the first region.

In a feasible implementation manner, the processing unit 12 is configured to extract the Y data of each pixel of the first image frame from the YUV color space data of the first image frame, and according to the first image frame The Y data of each pixel of the image frame determines the gradient of each pixel, and the first image frame is divided into a plurality of regions according to the gradient of each pixel.

In a feasible implementation manner, the processing unit 12 is further configured to convert the first image frame into a YUV image before acquiring the YUV color space data of the first image frame.

In a feasible implementation manner, the processing unit 12 is used to process each of the multiple regions by using the corresponding first enhancement processing mode to obtain the multiple regions after the enhancement processing. Obtain a histogram of each of the multiple areas, determine the dynamic range value of the corresponding area according to each of the histograms, and determine the first enhancement processing mode corresponding to each area according to each of the dynamic range values.

In a feasible implementation manner, each area includes a target area, the dynamic range value includes a high-exposure dynamic range value and a low-brightness dynamic range value, and the processing unit 12 is configured to extract the peak value, The average value and the area ratio, the area ratio is used to indicate the ratio of the high exposure area and the low exposure area, and the high value of the target area is determined according to the peak value, the average value, and the area ratio. Exposure dynamic range value and the low light dynamic range value.

In a feasible implementation manner, the original video is a video currently being shot by the terminal device, or the original video is a local LDR video of the terminal device.

The HDR video generation device provided in the embodiment of the present invention can perform the actions of the terminal device in the foregoing embodiment, and its implementation principles and technical effects are similar, and will not be repeated here.

It should be noted that it should be understood that the above transceiving unit may be a transceiver when actually implemented, and the processing unit may be implemented in a form of software calling through a processing element; it may also be implemented in a form of hardware. For example, the processing unit may be a separate processing element, or it may be integrated in a chip of the above-mentioned device for implementation. In addition, it may also be stored in the memory of the above-mentioned device in the form of program code, and a certain processing element of the above-mentioned device Call and execute the functions of the above processing unit. In addition, all or part of these units can be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capabilities. In the implementation process, each step of the above method or each of the above units may be completed by an integrated logic circuit of hardware in the processor element or instructions in the form of software.

For example, the above units may be one or more integrated circuits configured to implement the above methods, such as: one or more application specific integrated circuits (ASIC), or one or more microprocessors (digital signal processor, DSP), or, one or more field programmable gate arrays (FPGA), etc. For another example, when one of the above units is implemented in the form of processing element scheduling program code, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or other processors that can call program codes. For another example, these units can be integrated together and implemented in the form of a system-on-a-chip (SOC).

FIG. 13 is a schematic structural diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 13, the terminal device 200 includes:

Processor 21 and memory 22;

The memory 22 stores computer execution instructions;

The processor 21 executes the computer-executable instructions stored in the memory 22, so that the processor 21 executes the HDR video generation method corresponding to the above terminal device.

For the specific implementation process of the processor 21, refer to the foregoing method embodiments, and the implementation principles and technical effects are similar, and will not be repeated here in this embodiment.

Optionally, the terminal device 200 further includes a communication interface 23. Among them, the processor 21, the memory 22, and the communication interface 23 may be connected through a bus 24.

The embodiment of the present invention also provides a storage medium, and the storage medium stores computer-executable instructions, and when the computer-executed instructions are executed by a processor, they are used to implement the HDR video generation method executed by the above terminal device.

The embodiment of the present invention also provides a computer program product, which is used to implement the HDR video generation method executed by the terminal device when the computer program product runs on the terminal device.

FIG. 14 is a schematic diagram of the hardware structure of a terminal device provided by an embodiment of the application. As shown in FIG. 14, the terminal device 1000 includes but is not limited to: a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, Processor 110, power supply 111 and other components. Those skilled in the art can understand that the structure of the terminal device shown in FIG. 14 does not constitute a limitation on the terminal device, and the terminal device 1000 may include more or fewer components than shown in the figure, or a combination of certain components, or different components. Component arrangement. In the embodiments of the present application, terminal devices include, but are not limited to, mobile phones, tablet computers, palmtop computers, and so on.

Among them, the user input unit 107 is used to receive user input; the display unit 106 is used to respond to the input received by the user input unit 107 and display content according to the input.

It should be understood that, in the embodiment of the present application, the radio frequency unit 101 can be used for receiving and sending signals in the process of sending and receiving information or talking. Specifically, after receiving downlink data from a master base station or a secondary base station, it is processed by the processor 110; In addition, the uplink data is sent to the primary base station or the secondary base station. Generally, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with the network and other devices through a wireless communication system.

The terminal device 1000 provides users with wireless broadband Internet access through the network module 102, such as helping users to send and receive emails, browse web pages, and access streaming media.

The audio output unit 103 can convert the audio data received by the radio frequency unit 101 or the network module 102 or stored in the memory 109 into an audio signal and output it as sound. Moreover, the audio output unit 103 may also provide audio output related to a specific function performed by the terminal device 1000 (for example, call signal reception sound, message reception sound, etc.). The audio output unit 103 includes a speaker, a buzzer, a receiver, and the like.

The input unit 104 is used to receive audio or video signals. The input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 is used to process image data of pictures or videos captured by a camera or the like. The processed image frame can be displayed on the display unit 106. The image frame processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or sent via the radio frequency unit 101 or the network module 102. The microphone 1042 can receive sound, and can process such sound into audio data. The processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 101 for output in the case of a telephone call mode.

The terminal device 1000 further includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light. The proximity sensor can close the display panel 1061 and the display panel 1061 when the terminal device 1000 is moved to the ear. / Or backlight. As a kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in various directions (usually three-axis), and can detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of the terminal device (such as horizontal and vertical screen switching, related games , Magnetometer posture calibration), vibration recognition related functions (such as pedometer, tap), etc.; sensor 105 can also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, Infrared sensors, etc., will not be repeated here.

The display unit 106 is used to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.

The user input unit 107 can be used to receive inputted numeric or character information, and generate key signal input related to user settings and function control of the terminal device. Specifically, the user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071, also called a touch screen, can collect user touch operations on or near it (for example, the user uses any suitable objects or accessories such as fingers, stylus, etc.) on the touch panel 1071 or near the touch panel 1071. operating). The touch panel 1071 may include two parts: a touch detection device and a touch controller. Among them, the touch detection device detects the user's touch position, and detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and then sends it To the processor 110, the command sent by the processor 110 is received and executed. In addition, the touch panel 1071 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may also include other input devices 1072. Specifically, other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick, which will not be repeated here.

Further, the touch panel 1071 can be overlaid on the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transmits it to the processor 110 to determine the type of the touch event, and then the processor 110 determines the type of the touch event according to the touch. The type of event provides corresponding visual output on the display panel 1061. Although in FIG. 14, the touch panel 1071 and the display panel 1061 are used as two independent components to realize the input and output functions of the terminal device, but in some embodiments, the touch panel 1071 and the display panel 1061 can be integrated The implementation of the input and output functions of the terminal device is not specifically limited here.

The interface unit 108 is an interface for connecting an external device with the terminal device 1000. For example, the external device may include a wired or wireless headset port, an external power source (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, audio input/output (I/O) port, video I/O port, headphone port, etc. The interface unit 108 may be used to receive input (for example, data information, power, etc.) from an external device and transmit the received input to one or more elements in the terminal device 1000 or may be used to connect to the terminal device 1000 and an external device. Transfer data between devices.

The memory 109 can be used to store software programs and various data. The memory 109 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), etc.; Data created by the use of mobile phones (such as audio data, phone book, etc.), etc. In addition, the memory 109 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.

The processor 110 is the control center of the terminal device. It uses various interfaces and lines to connect the various parts of the entire terminal device, runs or executes software programs and/or modules stored in the memory 109, and calls data stored in the memory 109. , Perform various functions of the terminal equipment and process data, so as to monitor the terminal equipment as a whole. The processor 110 may include one or more processing units; optionally, the processor 110 may integrate an application processor and a modem processor, where the application processor mainly processes the operating system, user interface, and application programs, etc. The adjustment processor mainly deals with wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 110.

Referring to FIG. 14, in this embodiment of the present application, a computer program is stored in the memory 109, wherein the processor 110 runs the computer program so that the terminal device executes the above-mentioned HDR video generation method

In the embodiments of the present application, the processor may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, which may implement or Perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.

In the embodiment of the present application, the memory may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), etc., or a volatile memory (volatile memory), for example Random-access memory (random-access memory, RAM). The memory is any other medium that can be used to carry or store desired program codes in the form of instructions or data structures and that can be accessed by a computer, but is not limited to this. The memory in the embodiments of the present application may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.

The methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, it can be implemented in the form of a computer program product in whole or in part. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the procedures or functions according to the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or other programmable devices. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center. Transmission to another website, computer, server, or data center via wired (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, SSD).

Obviously, those skilled in the art can make various changes and modifications to the application without departing from the scope of the application. In this way, if these modifications and variations of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to include these modifications and variations.

Claims

A high dynamic range HDR video generation method, which is characterized in that it includes:

The terminal device receives a processing request input by the user, and the processing request is used to request the generation of an HDR video;

According to the processing request, the terminal device divides the first image frame included in the original video into a plurality of regions, the plurality of regions include at least two types of regions, and the first image features of the different types of regions are different;

The terminal device respectively uses a corresponding first enhancement processing method to process each of the multiple areas to obtain multiple areas after enhancement processing;

Splicing the multiple regions after the enhancement processing by the terminal device to obtain a second image frame;

The terminal device generates an HDR video according to the second image frame.
The method according to claim 1, wherein the terminal device uses a corresponding first enhancement processing method to process each of the multiple areas respectively, and after obtaining the multiple areas after the enhancement processing, the include:

The terminal device performs semantic segmentation on the first area to divide the first area into a plurality of sub-areas, and the plurality of sub-areas includes at least two types of sub-areas, and the second image features of the different types of sub-areas are different. The first area is any one of the plurality of areas, and the first image feature and the second image feature are image features of different dimensions;

The terminal device determines the second enhancement processing mode of each sub-region;

The terminal device respectively uses a corresponding second enhanced processing mode to process each of the multiple sub-areas;

The terminal device splices the multiple sub-regions after the enhancement processing to obtain the first region.
The method according to claim 1 or 2, wherein the terminal device divides the first image frame contained in the original video into multiple regions according to the processing request, comprising:

The terminal device extracts the Y data of each pixel of the first image frame from the YUV color space data of the first image frame;

The terminal device determines the gradient of each pixel according to the Y data of each pixel of the first image frame;

The terminal device divides the first image frame into multiple regions according to the gradient of each pixel.
The method according to claim 3, wherein before the terminal device obtains the YUV color space data of the first image frame, the method further comprises:

The terminal device converts the first image frame into a YUV image.
The method according to any one of claims 1 to 4, wherein the terminal device uses a corresponding first enhancement processing method to process each of the multiple regions, respectively, to obtain the enhanced multiple Before this area, it also includes:

Acquiring, by the terminal device, a histogram of each of the multiple areas;

The terminal device determines the dynamic range value of the corresponding area according to each of the histograms;

The terminal device respectively determines the first enhancement processing mode corresponding to each area according to each of the dynamic range values.
The method according to claim 5, wherein each of the areas includes a target area, the dynamic range value includes a high-exposure dynamic range value and a low-brightness dynamic range value, and the corresponding area is determined according to each of the histograms The dynamic range value of includes:

The terminal device extracts the peak value, the average value, and the area proportion of the target area, and the area proportion is used to indicate the ratio of the high-exposure area and the low-exposure area;

The terminal device determines the high-exposure dynamic range value and the low-brightness dynamic range value of the target area according to the peak value, the average value, and the area proportion.
The method according to any one of claims 1 to 6, characterized in that:

The original video is a video currently being shot by the terminal device;

or,

The original video is a local LDR video of the terminal device.
A terminal device, characterized by comprising: a processor, a memory, and a computer program stored on the memory and running on the processor, wherein the processor executes the program when the program is executed. The following steps:

Receiving a processing request input by a user, where the processing request is used to request the generation of an HDR video;

According to the processing request, dividing the first image frame included in the original video into a plurality of regions, the plurality of regions include at least two types of regions, and the first image features of the different types of regions are different;

Each region of the multiple regions is processed by a corresponding first enhancement processing mode, respectively, to obtain multiple regions after the enhancement processing;

Stitching multiple regions after the enhancement processing to obtain a second image frame;

The HDR video is generated according to the second image frame.
8. The terminal device according to claim 8, wherein said processing each of said multiple areas using a corresponding first enhancement processing mode respectively, and after obtaining multiple areas after enhancement processing, further comprising: :

Perform semantic segmentation on the first region to divide the first region into a plurality of subregions, the plurality of subregions include at least two types of subregions, and the second image features of the different types of subregions are different, and the first region is In any one of the plurality of regions, the first image feature and the second image feature are image features of different dimensions;

Determine the second enhancement processing mode for each sub-region;

For each of the multiple sub-regions, a corresponding second enhancement processing method is used for processing;

The multiple sub-regions after the enhancement processing are spliced to obtain the first region.
The terminal device according to claim 8 or 9, wherein the dividing the first image frame contained in the original video into multiple regions according to the processing request comprises:

The terminal device extracts the Y data of each pixel of the first image frame from the YUV color space data of the first image frame;

The terminal device determines the gradient of each pixel according to the Y data of each pixel of the first image frame;

The terminal device divides the first image frame into multiple regions according to the gradient of each pixel.
The terminal device according to claim 10, wherein before the acquiring YUV color space data of the first image frame, the method further comprises:

The first image frame is converted into a YUV image.
The terminal device according to any one of claims 8 to 11, wherein the corresponding first enhancement processing method is used to process each of the plurality of regions to obtain a plurality of enhanced processing methods. Before the area, it also includes:

Acquiring a histogram of each of the multiple areas;

Determine the dynamic range value of the corresponding area according to each of the histograms;

According to each of the dynamic range values, the first enhancement processing mode corresponding to each region is determined respectively.
The terminal device according to claim 12, wherein each area includes a target area, the dynamic range value includes a high-exposure dynamic range value and a low-brightness dynamic range value, and the corresponding value is determined according to each of the histograms. The dynamic range value of the area, including:

Extracting the peak value, average value, and area proportion of the target area, where the area proportion is used to indicate the ratio of the high-exposure area and the low-exposure area;

The high-exposure dynamic range value and the low-brightness dynamic range value of the target area are determined according to the peak value, the average value, and the area proportion.
The terminal device according to any one of claims 8-13, wherein:

The original video is a video currently being shot by the terminal device;

or,

The original video is a local LDR video of the terminal device.
A computer storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions run on a terminal device, the terminal device executes the method according to any one of claims 1 to 7 .
A computer program product containing instructions, characterized in that, when the instructions run on a terminal device, the terminal device executes the method according to any one of claims 1 to 7.
A program product, characterized in that the program product includes a computer program, the computer program is stored in a readable storage medium, and at least one processor of a communication device can read the computer program from the readable storage medium The execution of the computer program by the at least one processor causes the communication device to implement the method according to any one of claims 1-7.