CN112449120B - High dynamic range video generation method and device - Google Patents

High dynamic range video generation method and device

Info

Publication number
CN112449120B
CN112449120B (application CN201910817877.2A)
Authority
CN
China
Prior art keywords
regions
image frame
region
image
terminal device
Prior art date
Legal status
Active
Application number
CN201910817877.2A
Other languages
Chinese (zh)
Other versions
CN112449120A (en)
Inventor
许亦然
张俪耀
马靖
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority: CN201910817877.2A
Priority: PCT/CN2020/110825 (published as WO2021036991A1)
Publication of CN112449120A
Application granted
Publication of CN112449120B

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 5/00: Image enhancement or restoration
                    • G06T 5/90: Dynamic range modification of images or parts thereof
                    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
                • G06T 2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T 2207/10: Image acquisition modality
                        • G06T 2207/10016: Video; Image sequence
                    • G06T 2207/20: Special algorithmic details
                        • G06T 2207/20172: Image enhancement details
                            • G06T 2207/20208: High dynamic range [HDR] image processing
                        • G06T 2207/20212: Image combination
                            • G06T 2207/20221: Image fusion; Image merging
    • H: ELECTRICITY
        • H04: ELECTRIC COMMUNICATION TECHNIQUE
            • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
                    • H04N 23/70: Circuitry for compensating brightness variation in the scene
                        • H04N 23/741: Circuitry for compensating brightness variation in the scene by increasing the dynamic range of the image compared to the dynamic range of the electronic image sensors
                • H04N 25/00: Circuitry of solid-state image sensors [SSIS]; Control thereof
                    • H04N 25/50: Control of the SSIS exposure
                        • H04N 25/57: Control of the dynamic range

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

The application provides a method and an apparatus for generating an HDR video, relating to the technical field of artificial intelligence (AI). After a terminal device receives a user-input processing request to generate an HDR video, it divides each first image frame contained in an original video into at least two types of regions, where different types of regions have different first image features. Each type of region is processed with a corresponding first enhancement processing mode to obtain enhanced regions; the enhanced regions of a first image frame are then spliced to obtain a second image frame, and the HDR video is obtained based on the second image frames. In this process, multiple images with different exposure durations do not need to be captured for each first image frame, so the method is not constrained by a long-exposure process: because the terminal device divides each image frame of the original video into at least two types of regions, the enhancement scheme for each region can be determined from that region's first image features, so that different types of regions use different enhancement schemes.

Description

High dynamic range video generation method and device
Technical Field
The present application relates to the technical field of Artificial Intelligence (AI), and in particular, to a method and an apparatus for generating a high dynamic range video.
Background
In the field of image processing, dynamic range refers to the ratio of the brightest illumination intensity to the darkest illumination intensity in an image, and is one of the important dimensions of image quality evaluation. Images may be classified by dynamic range into high dynamic range (HDR) images, low dynamic range (LDR) images, and so on.
Generally, to obtain an HDR image, the same object is shot several times within a short period to obtain LDR images of the object with different exposure durations, and the LDR images are then combined into an HDR image. For example, within one long-exposure process, a short-exposure image, a medium-exposure image, and a long-exposure image of the object are acquired, and the three images are fused to obtain an HDR image.
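As a rough illustration of this multi-exposure approach, the sketch below blends three LDR frames with simple per-pixel well-exposedness weights. The Gaussian weighting and the [0, 1] input scaling are our assumptions: a toy stand-in for the fusion step, not the patent's method.

```python
import numpy as np

def fuse_exposures(short, medium, long_, sigma=0.2):
    """Toy exposure fusion: weight each LDR frame by how close its pixels
    are to mid-gray, then blend. Inputs are float arrays scaled to [0, 1]."""
    frames = [short, medium, long_]
    # Well-exposedness weight: Gaussian centered on mid-gray (0.5).
    weights = [np.exp(-((f - 0.5) ** 2) / (2 * sigma ** 2)) for f in frames]
    total = np.sum(weights, axis=0) + 1e-8        # avoid division by zero
    fused = sum(w * f for w, f in zip(weights, frames)) / total
    return fused
```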
With the development of HDR technology and the increasing popularity of HDR displays, user demand for HDR video is growing, which requires terminal devices to be capable of capturing HDR video or of converting LDR video into HDR video. However, a video is composed of image frames; if each frame were synthesized from images with different exposure durations, the terminal device would need to capture several differently exposed images for every frame within a short time and compose them into one frame, before obtaining the HDR video from the resulting frames. Clearly, the exposure duration limits the dynamic range of video shot by the terminal device in this way.
Disclosure of Invention
The embodiments of the present application provide a method and an apparatus for generating a high dynamic range video, in which an HDR video is obtained by dividing each image frame into different regions and applying different enhancement schemes to the different regions.
In a first aspect, an embodiment of the present application provides a method for generating a high dynamic range (HDR) video, described from the perspective of a terminal device. The method includes: after receiving a user-input request to generate an HDR video, the terminal device divides each first image frame contained in an original video into at least two types of regions, where different types of regions have different first image features; each type of region is processed with a corresponding first enhancement processing mode to obtain enhanced regions; the enhanced regions of each first image frame are then spliced to obtain the second image frame corresponding to that first image frame, and the HDR video is obtained based on the second image frames. In this process, multiple images with different exposure durations do not need to be captured for each first image frame, so the method is not constrained by a long-exposure process: the terminal device divides each image frame of the original video into at least two types of regions, and the enhancement scheme for each region can be determined from that region's first image features, so that different types of regions use different enhancement schemes. In addition, because the HDR video is obtained by processing each image frame, no long-exposure capture is needed; for example, no medium- or long-exposure images need to be shot. That is, the embodiments of the present application do not require the terminal device to obtain high-dynamic video by extending the exposure, so the resulting HDR video is free of ghosting.
In a feasible design, after the terminal device processes each of the multiple regions with its corresponding first enhancement processing mode to obtain the enhanced regions, it further performs semantic segmentation on a first region to divide it into multiple sub-regions, where the sub-regions include at least two types and different types of sub-regions have different second image features; the first region is any one of the multiple regions, and the first image feature and the second image feature are image features of different dimensions. The terminal device determines a second enhancement processing mode for each sub-region, processes each sub-region with its corresponding second enhancement processing mode, and splices the enhanced sub-regions to obtain the first region. With this scheme, the terminal device can further divide a region into sub-regions according to the second image feature, determine a second enhancement processing mode for each sub-region, and process each sub-region accordingly, thereby further enhancing each sub-region.
In a feasible design, when dividing a first image frame contained in the original video into a plurality of regions according to the processing request, the terminal device extracts the Y data of each pixel of the first image frame from the YUV color space data of the first image frame, determines the gradient of each pixel according to its Y data, and divides the first image frame into a plurality of regions according to the gradients. With this arrangement, the first image frame is divided into a plurality of regions according to brightness.
In one possible design, the terminal device converts the first image frame into a YUV image before acquiring the YUV color space data of the first image frame. With this arrangement, a first image frame that is not originally in YUV format can also be divided into a plurality of regions according to brightness.
In a feasible design, before processing each of the plurality of regions with its corresponding first enhancement processing mode to obtain the enhanced regions, the terminal device further obtains a histogram of each region, determines the dynamic range value of each region from its histogram, and determines the first enhancement processing mode corresponding to each region from its dynamic range value. With this scheme, the terminal device obtains a region's first enhancement processing mode according to the region's dynamic range value.
In a feasible design, the regions include a target region, and the dynamic range values include a high-exposure dynamic range value and a low-brightness dynamic range value. When determining the dynamic range value of a region from its histogram, the terminal device extracts the peak value, the average value, and the area ratio of the target region, where the area ratio indicates the ratio of the high-exposure area to the low-exposure area, and determines the high-exposure and low-brightness dynamic range values of the target region from the peak value, the average value, and the area ratio. With this scheme, the terminal device determines the high-exposure and low-brightness dynamic range values of the target region and then determines the first enhancement processing mode from them, so that different types of regions are processed with different enhancement schemes.
In one possible design, the original video is the video currently being captured by the terminal device, or an LDR video stored locally on the terminal device. With this scheme, the terminal device can record an HDR video in real time or convert a prerecorded video into an HDR video.
In a second aspect, an embodiment of the present application provides a high dynamic range HDR video generating apparatus, including:
a transceiving unit, configured to receive a processing request input by a user, where the processing request is used to request generation of an HDR video;
and a processing unit, configured to divide a first image frame included in the original video into a plurality of regions according to the processing request, where the plurality of regions include at least two types of regions, and first image features of different types of regions are different, process each of the plurality of regions in a corresponding first enhancement processing manner to obtain a plurality of regions after enhancement processing, splice the plurality of regions after enhancement processing to obtain a second image frame, and generate an HDR video according to the second image frame.
In a feasible design, the processing unit is configured to perform semantic segmentation on a first region to segment the first region into a plurality of sub-regions, where the plurality of sub-regions include at least two types of sub-regions, second image features of different types of sub-regions are different, the first region is any one of the plurality of regions, the first image feature and the second image feature are image features of different dimensions, determine a second enhancement processing manner of each sub-region, perform processing on each sub-region in the plurality of sub-regions in a corresponding second enhancement processing manner, and splice the plurality of sub-regions after enhancement processing to obtain the first region.
In a feasible design, the processing unit is configured to extract Y data of each pixel point of the first image frame from YUV color space data of the first image frame, determine a gradient of each pixel point according to the Y data of each pixel point of the first image frame, and divide the first image frame into a plurality of regions according to the gradient of each pixel point.
In one possible design, the processing unit is further configured to convert the first image frame into a YUV image before acquiring the YUV color space data of the first image frame.
In a feasible design, before processing each of the plurality of regions by using the corresponding first enhancement processing manner to obtain the plurality of regions after enhancement processing, the processing unit is further configured to obtain a histogram of each of the plurality of regions, determine a dynamic range value of the corresponding region according to each of the histograms, and determine the first enhancement processing manner corresponding to each of the regions according to each of the dynamic range values.
In a feasible design, each of the regions includes a target region, the dynamic range values include a high exposure dynamic range value and a low brightness dynamic range value, the processing unit is configured to extract a peak value, an average value, and a region proportion of the target region, the region proportion is used to indicate a ratio of a high exposure region to a low exposure region, and the high exposure dynamic range value and the low brightness dynamic range value of the target region are determined according to the peak value, the average value, and the region proportion.
In one possible design, the original video is a video currently being captured by the terminal device, or an LDR video local to the terminal device.
In a third aspect, an embodiment of the present application provides a terminal device, including: a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor executes the program to perform the method of the first aspect or each possible implementation of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer program product containing instructions which, when run on a terminal device, cause the terminal device to perform the method of the first aspect or each possible implementation of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium, which stores instructions that, when executed on a terminal device, cause the terminal device to perform the method of the first aspect or each possible implementation manner of the first aspect.
In a sixth aspect, an embodiment of the present application provides a chip system, where the chip system includes a processor and may further include a memory, and is configured to implement the function of the terminal device in the foregoing method. The chip system may be formed by a chip, and may also include a chip and other discrete devices.
According to the method and apparatus for generating a high dynamic range (HDR) video provided by the embodiments of the present application, after the terminal device receives a processing request to generate an HDR video, it divides each first image frame contained in the original video into at least two types of regions, where different types of regions have different first image features; each type of region is processed with a corresponding first enhancement processing mode to obtain enhanced regions; the enhanced regions of each first image frame are then spliced to obtain the corresponding second image frames, and the HDR video is obtained based on the second image frames. In this process, multiple images with different exposure durations do not need to be captured for each first image frame, so the method is not constrained by a long-exposure process, and the enhancement scheme for each region can be determined from that region's first image features, so that different types of regions use different enhancement schemes. In addition, because the HDR video is obtained by processing each image frame, no long-exposure capture is needed; the embodiments do not require the terminal device to obtain high-dynamic video by lengthening the exposure, so no ghosting appears in the resulting HDR video.
Drawings
fig. 1 is a schematic diagram of a process for generating an HDR image;
fig. 2 is a flowchart of an HDR video generation method provided by an embodiment of the present application;
fig. 3 is a schematic diagram of an input processing request in an HDR video generating method according to an embodiment of the present application;
fig. 4 is another schematic diagram of an input processing request in an HDR video generation method provided by an embodiment of the present application;
fig. 5 is a schematic diagram of regions of a first image frame in an HDR video generation method provided by an embodiment of the present application;
fig. 6 is another schematic diagram of regions of a first image frame in an HDR video generation method provided by an embodiment of the present application;
fig. 7 is a schematic process diagram of region segmentation and sub-region segmentation in the HDR video generation method provided by the embodiment of the present application;
fig. 8 is a flowchart of segmenting a first image frame in an HDR video generation method provided by an embodiment of the present application;
fig. 9 is a flowchart of determining a first enhancement processing manner in the HDR video generation method provided by the embodiment of the present application;
fig. 10 is a schematic diagram of regions and histograms in an HDR video generation method provided by an embodiment of the present application;
fig. 11 is a process schematic diagram of an HDR video generation method provided by an embodiment of the present application;
fig. 12 is a schematic structural diagram of an HDR video generating apparatus according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 14 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application.
Detailed Description
Currently, to obtain an HDR image, a plurality of LDR images with different exposure durations are obtained and then combined into the HDR image. Fig. 1 is a schematic diagram of a process for generating an HDR image. Referring to fig. 1, a long-exposure process T includes three stages: a short-exposure stage, a medium-exposure stage, and a long-exposure stage, with exposure durations T1, T2, and T3 respectively, where T1 < T2 < T3 and T1 + T2 + T3 ≤ T. The terminal device obtains three images with different exposure durations, namely a short-exposure image, a medium-exposure image, and a long-exposure image; the three images are then fused to obtain an HDR image.
From the above it can be seen that, in fig. 1, three images captured with different exposure durations within one long-exposure process are fused, so some or all of the images are likely to contain ghosts due to long exposure durations, hand shake during shooting, and so on; for example, the medium- and long-exposure images may contain ghosts, and the HDR image synthesized from them may then contain ghosts as well. When this method is applied to HDR video generation, differently exposed images must be generated for every image frame of the video; since many objects in a video are moving, the differently exposed images will exhibit ghosting. In addition, a single long-exposure process T is limited: the terminal device clearly cannot capture multiple image frames, each requiring several differently exposed images, within the limited process T. Therefore, the above method of synthesizing an HDR image from multiple LDR images with different exposure durations is not suitable for HDR video generation.
Therefore, the embodiments of the present application provide a method and an apparatus for generating a high dynamic range video, which divide an image frame of an original video into different regions, and use different enhancement schemes for the different regions, thereby obtaining an HDR video.
In the embodiments of the present application, the terminal device may directly capture an HDR video, or it may process a local original video to obtain an HDR video. The original video may be an LDR video, a standard dynamic range (SDR) video, or the like, previously recorded by the terminal device or downloaded from a server.
The terminal device referred to in the embodiments of the present application may be any terminal device capable of video playback and video recording. In the embodiments of the present application, the terminal device may communicate with one or more core networks or the Internet via a radio access network (RAN), and may be a mobile terminal device, such as a mobile phone, a computer, or a data card; for example, a portable, pocket, handheld, computer-embedded, or vehicle-mounted mobile device that exchanges voice and/or data with the RAN. Examples of such devices include personal communication service (PCS) phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDAs), tablet computers (Pads), and computers with wireless transceiving functions. A wireless terminal device may also be referred to as a system, a subscriber unit, a subscriber station, a mobile station (MS), a remote station, an access point (AP), a remote terminal device, an access terminal device, a user terminal device, a user agent, a subscriber station (SS), customer premises equipment (CPE), a terminal, user equipment (UE), a mobile terminal (MT), and so on. The wireless terminal device may also be a wearable device or a terminal device in a next-generation communication system, for example, a terminal device in a 5G network, a terminal device in a future evolved public land mobile network (PLMN), or a terminal device in an NR communication system.
Next, taking an example in which a terminal device captures an HDR video, a method for generating an HDR video according to an embodiment of the present application is described in detail.
Fig. 2 is a flowchart of an HDR video generation method provided in an embodiment of the present application, which is described from the perspective of a terminal device, and the embodiment includes:
101. a processing request input by a user is received, the processing request requesting generation of an HDR video.
Illustratively, the user may input the processing request to the terminal device by voice, touch, or the like; see, for example, fig. 3 and fig. 4.
Fig. 3 is a schematic diagram of an input processing request in an HDR video generation method according to an embodiment of the present application. Referring to fig. 3, in this embodiment an HDR video generation application (APP) is installed on the terminal device. The APP may be one provided by the operating system of the terminal device, such as a camera, or a third-party APP downloaded and installed by the user. Taking a third-party APP as an example: when an HDR video needs to be generated, the user taps the HDR video generation APP on the desktop to start it. Besides a shooting button, the APP interface offers options for the user to select, such as time-lapse, slow motion, video, photo, and panorama. After the user selects video (shown by the dotted-line box in the figure), the terminal device enables two options, flash and HDR, for the user to select; when the user selects the HDR option, the terminal device shoots the HDR video in real time.
Fig. 4 is another schematic diagram of an input processing request in an HDR video generating method according to an embodiment of the present application. Referring to fig. 4, a difference between this embodiment and fig. 3 is that, in this embodiment, after a user starts an APP, an interface of the APP has options for the user to select, such as options of delayed shooting, slow motion, normal video, HDR video, photo, panorama, and the like, and when the user selects the HDR video, the HDR video can be shot in real time.
It should be noted that although fig. 3 and fig. 4 describe the embodiments of the present application using real-time HDR video capture as an example, the embodiments are not limited thereto. In other feasible implementations, after the user starts the APP, a selection button is also displayed on the APP interface; when the user taps it, the terminal device pops up a selection page listing the videos downloaded or recorded by the terminal device, and after the user selects an original video, the terminal device generates the HDR video based on the selected original video.
102. According to the processing request, a first image frame contained in the original video is divided into a plurality of regions, where the plurality of regions include at least two types of regions and different types of regions have different first image features.
Illustratively, when the terminal device directly captures the HDR video, the original video is the video captured in real time in the viewfinder of the terminal device, and the first image frame is an image frame of that video. When the terminal device converts a local video into an HDR video, the original video is a video stored locally on the terminal device, which may have been recorded in advance or downloaded from a server, and the first image frame is any image frame of the original video. For each first image frame, the terminal device divides the frame into a plurality of regions, which can be grouped into at least two types; each type includes at least one region, and when a type includes two or more regions, those regions are discontinuous in the first image frame. Different types of regions have different first image features. Image features include brightness, content, color, texture, shape, and so on. Next, how an image frame is divided into a plurality of regions is described, taking brightness and image content as the image features; see fig. 5 and fig. 6.
Fig. 5 is a schematic diagram of regions of a first image frame in an HDR video generation method according to an embodiment of the present application. In the embodiment of the present application, the image frame is a YUV image, and the YUV color space data of the first image frame can be obtained from the frame, where the Y data represents luminance (luma), that is, brightness, and U and V represent chrominance (chroma), which describes the color and saturation of the image; the Y data is also referred to as luminance information, and the U and V data are also referred to as color information. Assuming luminance values range from 0 to 255, the 256 gray levels are divided into three intervals in advance: the gray-scale interval 0-193 is the low-brightness interval, 194-223 is the medium-brightness interval, and 224-255 is the high-brightness interval. Referring to fig. 5, the terminal device extracts the brightness of each pixel of the first image frame and determines the interval to which it belongs, thereby dividing the first image frame by brightness into three types of regions: high-brightness, medium-brightness, and low-brightness. Regions of the same type may be discontinuous; as shown in fig. 5, the high-brightness regions include region A and region B, the medium-brightness regions include region C and region D, and the low-brightness region is region E.
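A minimal sketch of this interval-based classification, using the gray-level intervals given above (the function and label names are ours):

```python
import numpy as np

def split_by_brightness(y):
    """Classify each pixel of a uint8 Y (luma) plane into the three
    gray-level intervals described above."""
    labels = np.full(y.shape, 1, dtype=np.uint8)   # 1 = medium (194-223)
    labels[y <= 193] = 0                           # 0 = low brightness (0-193)
    labels[y >= 224] = 2                           # 2 = high brightness (224-255)
    return labels

# Example: boolean mask of all high-brightness pixels
# high_mask = split_by_brightness(y_plane) == 2
```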
Fig. 6 is another schematic diagram of regions of a first image frame in an HDR video generation method according to an embodiment of the present application. Referring to fig. 6, the terminal device detects whether the first image frame contains specific content, such as people, sky, buildings, or trees, using a picture detection algorithm such as the inter-frame difference method, background modeling, point detection, image segmentation, cluster analysis, or the motion vector field method; regions containing the same specific content may be discontinuous. The terminal device divides the first image frame by content into four types of regions: person regions, sky regions, building regions, and tree regions.
103. And processing each of the plurality of regions by adopting a corresponding first enhancement processing mode to obtain a plurality of regions after enhancement processing.
Illustratively, the terminal device determines a corresponding first enhancement processing mode for each type of region, and different types of regions have different first enhancement processing modes. Then, for each region, the terminal device enhances the region with the first enhancement processing mode corresponding to its type. For example, referring to fig. 5, the terminal device applies color brightening to the high-brightness regions, such as region A and region B; contrast enhancement to the medium-brightness regions, such as region C and region D; and exposure boosting to the low-brightness region, such as region E. As another example, referring to fig. 6, the terminal device applies skin-tone enhancement to the person regions, color brightening to the sky regions, blurring to the building regions, and texture enhancement to the tree regions, thereby improving the quality of each region of the first image frame and, in turn, the quality of the video.
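The per-type dispatch could be sketched as follows; the three enhancement curves (gamma lift, contrast stretch, mild gain) are illustrative placeholders, not the patent's actual enhancement algorithms:

```python
import numpy as np

def boost_exposure(region):     # low-brightness regions: simple gamma lift
    return 255.0 * (region / 255.0) ** 0.7

def raise_contrast(region):     # medium-brightness regions: stretch around mean
    mean = region.mean()
    return np.clip((region - mean) * 1.2 + mean, 0, 255)

def brighten_colors(region):    # high-brightness regions: mild gain
    return np.clip(region * 1.05, 0, 255)

# Map each region class to its first enhancement processing mode.
FIRST_ENHANCEMENT = {0: boost_exposure, 1: raise_contrast, 2: brighten_colors}

def enhance_region(region_pixels, region_class):
    return FIRST_ENHANCEMENT[region_class](region_pixels.astype(np.float32))
```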
104. And splicing the plurality of regions after the enhancement processing to obtain a second image frame.
For example, step 103 yields the enhanced regions of each first image frame. In this step, for each first image frame, the terminal device splices the enhanced regions by alpha blending, Laplacian pyramid blending, or another method to obtain the second image frame corresponding to that first image frame.
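An alpha-blending splice might look like the following sketch; feathering the region masks with a Gaussian blur is our choice for smoothing the seams (the patent names alpha blending and Laplacian blending but does not detail them):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def splice_regions(regions, masks):
    """Blend enhanced regions into one frame through feathered (soft) masks.
    `regions` are full-size float arrays holding each region's enhanced
    pixels; `masks` are the matching boolean region masks."""
    out = np.zeros_like(regions[0], dtype=np.float32)
    total = np.zeros_like(out)
    for region, mask in zip(regions, masks):
        alpha = gaussian_filter(mask.astype(np.float32), sigma=3.0)  # feather
        out += alpha * region
        total += alpha
    return np.clip(out / np.maximum(total, 1e-8), 0, 255).astype(np.uint8)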
105. Generating the HDR video from the second image frame.
Illustratively, the original video contains a plurality of first image frames, and different first image frames correspond to different second image frames. After obtaining the second image frames from the first image frames, the terminal device combines the second image frames in sequence to obtain the HDR video.
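Assembling the second image frames back into a video could be as simple as the following OpenCV sketch; the codec, container, and frame rate are illustrative assumptions:

```python
import cv2

def frames_to_video(second_frames, path="hdr_out.mp4", fps=30.0):
    """Write the processed second image frames out as a video file."""
    h, w = second_frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in second_frames:       # keep the original playback order
        writer.write(frame)           # expects BGR uint8 frames of size (h, w)
    writer.release()
```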
In the HDR video generation method provided by the embodiments of the present application, after receiving a user-input processing request to generate an HDR video, the terminal device divides each first image frame contained in the original video into at least two types of regions, where different types of regions have different first image features; it processes the different types of regions with their corresponding first enhancement processing modes to obtain enhanced regions, then splices the enhanced regions of each first image frame to obtain the corresponding second image frames, and obtains the HDR video based on the second image frames. In this process, multiple images with different exposure durations do not need to be captured for each first image frame, so the method is not constrained by a long-exposure process: the terminal device divides each image frame of the original video into at least two types of regions, and the enhancement scheme for each region can be determined from that region's first image features, so that different types of regions use different enhancement schemes. In addition, because the HDR video is obtained by processing each image frame, no long-exposure capture is needed; for example, no medium- or long-exposure images need to be shot. The embodiments therefore do not require the terminal device to obtain high-dynamic video by extending the exposure, and the resulting HDR video is free of ghosting.
Alternatively, instead of dividing the frame into regions, the first image frame could be enhanced as a whole with a single enhancement processing mode determined from the overall characteristics of the frame. That approach is limited and cannot improve the dynamic range of the full scene; customizing the enhancement scheme per region overcomes the limitation of a single enhancement scheme.
In the above embodiment, each type of area corresponds to one first enhancement processing mode, and the terminal device performs processing by using the corresponding first enhancement processing mode for each area. Further, for each region, the terminal device may further divide the region into a plurality of sub-regions according to the second image feature, determine a second enhancement processing mode for each sub-region, and process each sub-region by using the second enhancement processing mode. Next, how the terminal device divides the sub-areas and the like will be described in detail.
In a possible implementation, the terminal device first divides the first image frame into a plurality of regions, which include at least two types of regions; regions of the same type have the same first image features, and regions of different types have different first image features. Then, for any one of the regions, hereinafter referred to as the first region, the terminal device performs semantic segmentation to divide the first region into a plurality of sub-regions, which include at least two types of sub-regions; sub-regions of the same type have the same second image features, and sub-regions of different types have different second image features, where the first and second image features are image features of different dimensions. The terminal device determines a second enhancement processing mode for each sub-region, processes each sub-region with its corresponding second enhancement processing mode, and splices the enhanced sub-regions to obtain the first region.
Illustratively, the terminal device divides any one of the regions of the first image frame, hereinafter referred to as the first region, into a plurality of sub-regions according to the second image feature. Fig. 7 is a schematic process diagram of region segmentation and sub-region segmentation in the HDR video generation method provided by the embodiment of the present application. Referring to fig. 7, assume the first image feature is brightness and the second image feature is image content. After the terminal device divides the first image frame into a high-brightness region, a low-brightness region, and a medium-brightness region, it further performs semantic segmentation on each region; taking the high-brightness region as an example, it is divided into a person sub-region, a sky sub-region, a grass sub-region, and so on (not shown in the figure). A second enhancement processing mode is then determined for each sub-region and applied to it; the enhanced sub-regions are spliced to obtain their regions, and the regions are spliced to obtain the second image frame.
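Assuming a per-pixel semantic label map is available (producing one is outside the scope of this sketch), splitting a brightness region into semantic sub-regions reduces to intersecting masks:

```python
import numpy as np

def split_into_subregions(region_mask, semantic_labels):
    """Intersect one region's mask with a per-pixel semantic label map to
    obtain one sub-region mask per semantic class present in the region."""
    subregions = {}
    for cls in np.unique(semantic_labels[region_mask]):
        subregions[cls] = region_mask & (semantic_labels == cls)
    return subregions   # e.g. {PERSON: mask, SKY: mask, GRASS: mask}
```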
In the above embodiment, after the first image frame is divided into at least two types of regions, each region is processed with the first enhancement processing mode corresponding to its type; any one region (hereinafter the first region) is then divided into at least two types of sub-regions, and each sub-region is processed with the second enhancement processing mode corresponding to its type. The embodiments of the present application are not limited to this, however. In other possible implementations, after the first image frame is divided into regions, the regions themselves are not processed; instead, each region is further divided into sub-regions and each sub-region is processed. Alternatively, after each region is divided into sub-regions, different enhancement processing modes are applied according to the image content of the sub-regions. For example, referring again to fig. 6 and taking the first region to be a person region, the terminal device performs semantic segmentation on the first region to obtain two sub-regions, where the person in one sub-region is male and the person in the other is female, and the two person sub-regions are then processed with different enhancement processing modes.
In this embodiment, the terminal device may further divide the region into a plurality of sub-regions according to the second image feature, determine a second enhancement processing mode for each sub-region, and process each sub-region by using the second enhancement processing mode, so as to achieve the purpose of further enhancing each sub-region.
Next, taking brightness as the first image feature, how the terminal device divides the first image frame contained in the original video into a plurality of regions according to the processing request is described in detail. Referring to fig. 8, which is a flowchart of segmenting a first image frame in an HDR video generation method provided in an embodiment of the present application, the embodiment includes:
201. a first image frame is acquired.
Illustratively, the terminal device sequentially inputs each image frame of the original video, i.e., each first image frame, into the HDR video generating apparatus. For example, when the original video is a recorded video, the terminal device performs framing on the original video to obtain an image frame sequence and inputs the frames to the HDR video generating apparatus in sequence order. As another example, when the original video is the video the terminal device is currently shooting, the terminal device captures each image frame by frame and inputs them in turn. The first image frame should be a YUV image; if it is not, for example if it is a red-green-blue (RGB) image, it needs to be converted into a format that facilitates extraction of luminance information, such as a YUV or hue-luminance-saturation (HLS) image.
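For instance, converting an RGB frame to YUV might be sketched as follows; using BT.601-style luma weights is an assumption here, since the patent does not specify a conversion matrix:

```python
import cv2
import numpy as np

def to_yuv(frame_rgb):
    """Convert an RGB frame to YUV so the Y (luma) plane can be extracted."""
    return cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2YUV)

# The luma plane alone, using the common BT.601 weights:
def luma(frame_rgb):
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    return (0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)
```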
202. And extracting Y data of each pixel point of the first image frame from the YUV color space data of the first image frame.
For example, in the YUV color space data, the Y data represents luminance (luma), that is, brightness, and U and V represent chrominance (chroma), which describes the image's color and saturation. The terminal device extracts the Y data of each pixel of the first image frame, thereby obtaining a plurality of Y data values.
203. The terminal device filters the Y data to obtain smoothed Y data.
Illustratively, the terminal device smooths the Y data using Gaussian filtering or the like, thereby obtaining smoothed Y data.
204. And determining the gradient of each pixel point according to the Y data of each pixel point of the first image frame.
Illustratively, the terminal device determines the gradient G of each pixel using the Roberts edge detection operator, the Sobel edge detection operator, or the like:
G = √(Gx² + Gy²)

where Gx is the gradient of a pixel along the horizontal axis and Gy is the gradient of the pixel along the vertical axis. Assuming pixel a and pixel b are adjacent along the horizontal axis, and pixel a and pixel c are adjacent along the vertical axis, then for pixel a, Gx equals the absolute value of the difference between the Y data of pixel a and pixel b, and Gy equals the absolute value of the difference between the Y data of pixel a and pixel c.
In the segmentation process, a first threshold and a second threshold are preset. The first threshold is a high threshold: if a pixel is a high-brightness pixel, its gradient is not lower than the first threshold. The second threshold is a low threshold: if a pixel is a low-brightness pixel, its gradient is not higher than the second threshold. For any pixel in the first image frame, denote its gradient by Gp: if Gp is greater than or equal to the first threshold, the pixel is a high-brightness pixel; if Gp is less than or equal to the second threshold, the pixel is a low-brightness pixel; and if Gp lies between the two thresholds, the pixel is a medium-brightness pixel.
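Steps 203 and 204 might be sketched as follows; the smoothing sigma and the two threshold values are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def classify_by_gradient(y, high_thresh=40.0, low_thresh=10.0):
    """Smooth the Y plane (step 203), take absolute neighbor differences as
    Gx and Gy, combine them into G (step 204), and classify each pixel
    against the two preset thresholds."""
    y = gaussian_filter(y.astype(np.float32), sigma=1.5)   # step 203
    gx = np.abs(np.diff(y, axis=1, append=y[:, -1:]))      # |Y(a) - Y(b)|
    gy = np.abs(np.diff(y, axis=0, append=y[-1:, :]))      # |Y(a) - Y(c)|
    g = np.sqrt(gx ** 2 + gy ** 2)                         # gradient G
    labels = np.full(y.shape, 1, dtype=np.uint8)           # medium by default
    labels[g >= high_thresh] = 2                           # high-brightness
    labels[g <= low_thresh] = 0                            # low-brightness
    return labels
```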
205. And dividing the first image frame into a plurality of regions according to the gradient of each pixel point.
Illustratively, the terminal device takes clustered pixels of the first image frame whose gradients exceed the first threshold as one region. Since pixels whose gradients exceed the first threshold are very likely distributed in different places of the first image frame, a size threshold such as 100 may be set, so that 100 or more contiguous such pixels form one high-brightness region; there may therefore be more than one high-brightness region in the first image frame. The low-brightness and medium-brightness regions are obtained in the same way, and the first image frame is thus divided into a plurality of regions.
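Grouping the classified pixels into regions, discarding clusters smaller than the 100-pixel floor mentioned above, could use connected-component labeling, as in this sketch:

```python
import numpy as np
from scipy.ndimage import label

def masks_to_regions(mask, min_pixels=100):
    """Group contiguous pixels of one brightness class into regions,
    keeping only clusters of at least min_pixels."""
    labeled, n = label(mask)                  # connected-component labeling
    regions = []
    for i in range(1, n + 1):
        component = labeled == i
        if component.sum() >= min_pixels:
            regions.append(component)         # one boolean mask per region
    return regions
```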
In the present embodiment, the object of dividing the first image frame into a plurality of regions according to the brightness is achieved.
Next, how the first enhancement processing mode of each type of region is determined is described in detail. Referring to fig. 9, which is a flowchart of determining the first enhancement processing mode in the HDR video generation method provided by the embodiment of the present application, the embodiment includes:
301. a histogram of each of the plurality of regions is obtained.
For example, for each of the plurality of regions, the terminal device may compute a histogram of the region based on color features or the like. Fig. 10 is a schematic diagram of regions and histograms in an HDR video generation method provided by an embodiment of the present application. Referring to fig. 10, for each region the histogram is f(x), where x = 0, 1, 2, …, 255.
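The per-region histogram f(x) over the 256 gray levels can be computed directly from the region's pixels, as in this minimal sketch:

```python
import numpy as np

def region_histogram(y, region_mask):
    """Histogram f(x), x = 0..255, of the Y values inside one region."""
    f, _ = np.histogram(y[region_mask], bins=256, range=(0, 256))
    return f
```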
302. And determining the dynamic range value of the corresponding area according to each histogram.
Illustratively, the regions include a target region, and the dynamic range values include a high-exposure dynamic range value and a low-brightness dynamic range value. For the target region, the terminal device extracts parameters such as the peak value, the average value, and the high-exposure/low-brightness area ratio from the histogram corresponding to the target region, and determines the dynamic range (DR) values of the target region from these parameters.

For example, referring again to fig. 10: the peak value is p = {x | f(x) = max f(x)}, i.e., the gray level at which f attains its maximum; the mean value m is the average brightness of the region; and the area ratio r indicates the ratio of the high-exposure area to the low-brightness area. The high-exposure DR value and the low-exposure DR value are then computed from p, m, and r together with positive coefficients a, b, and c; the lower a DR value, the better the dynamic range. The high-exposure DR value reflects the degree of high exposure of the target region, and the low-exposure DR value reflects its degree of low exposure; if both values are low, the dynamic range of the target region is good.
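A sketch of the statistics extraction over the histogram f follows; the interval bounds reuse the 0-193 / 224-255 split from above, and the linear DR combination is purely an assumption standing in for the patent's formulas:

```python
import numpy as np

def region_statistics(f):
    """Extract peak, mean, and high-exposure/low-brightness area ratio
    from a region histogram f (length 256)."""
    x = np.arange(256)
    peak = int(np.argmax(f))                        # p: gray level where f peaks
    mean = float((x * f).sum() / max(f.sum(), 1))   # m: average brightness
    high = int(f[224:].sum())                       # pixels in the 224-255 interval
    low = int(f[:194].sum())                        # pixels in the 0-193 interval
    ratio = high / max(low, 1)                      # r: area ratio
    return peak, mean, ratio

def dr_values(f, a=1.0, b=1.0, c=1.0):
    # Hypothetical combination of p, m, r with positive coefficients a, b, c;
    # the patent's exact formulas are not reproduced here. Lower is better.
    p, m, r = region_statistics(f)
    high_dr = a * p + b * m + c * r
    low_dr = a * (255 - p) + b * (255 - m) + c / max(r, 1e-8)
    return high_dr, low_dr
```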
303. And respectively determining a first enhancement processing mode corresponding to each region according to each dynamic range value.
Illustratively, different dynamic range values correspond to different first enhancement processing modes. After determining the dynamic range value of a region, the terminal device obtains the region's first enhancement processing mode by table lookup or a similar method. For example, for the target region, the terminal device determines its first enhancement processing mode from the high-exposure DR value and the low-brightness DR value.
In this embodiment, the terminal device obtains the first enhancement processing mode of the region according to the dynamic range value of the region.
The HDR video generation method described above is illustrated below with an example. Referring to fig. 11, which is a schematic process diagram of an HDR video generation method provided by an embodiment of the present application:
Referring to fig. 11, the terminal device is currently recording a video V0, which is the original video. While recording V0, the terminal device captures image frames frame by frame. A captured first image frame contains a bridge opening and scenery: the bridge opening is darker, and the scenery beyond the bridge opening is brighter. The terminal device divides the first image frame by brightness into region 1 and region 2, where region 1 is the bridge-opening part, a low-brightness region, and region 2 is the scenery part, a high-brightness region. The first enhancement processing mode corresponding to region 1 is dark-area brightening, and the first enhancement processing mode corresponding to region 2 is high-exposure recovery: the terminal device brightens region 1 to raise the exposure of the bridge opening, and performs high-exposure recovery on region 2 to recover detail in the bright scenery. The terminal device then splices the enhanced region 1 and the enhanced region 2 with an edge fusion algorithm to obtain a second image frame. During video shooting, the terminal device performs the above processing on every captured first image frame to obtain the corresponding second image frames, and finally combines the second image frames to obtain the HDR video.
It should be noted that in the foregoing embodiments the terminal device divides the first image frame into regions according to the first image feature. The embodiments of the present application are not limited to this, however; in other possible implementations, the terminal device may divide the first image frame by combining the first image feature and the second image feature. For example, if the first image feature is brightness and the second image feature is a person, then when dividing a first image frame containing a person, the terminal device may segment out the person as one region and further divide the remaining area into a high-brightness region and a low-brightness region.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Fig. 12 is a schematic structural diagram of a high dynamic range HDR video generating apparatus according to an embodiment of the present invention. The HDR video generating apparatus 100 may be implemented by software and/or hardware. As shown in fig. 12, the HDR video generation apparatus 100 includes:
A transceiving unit 11, configured to receive a processing request input by a user, where the processing request is used to request generation of an HDR video;
a processing unit 12, configured to divide, according to the processing request, a first image frame contained in the original video into a plurality of regions, where the plurality of regions include at least two types of regions and different types of regions have different first image features; process each of the plurality of regions with its corresponding first enhancement processing mode to obtain a plurality of enhanced regions; splice the plurality of enhanced regions to obtain a second image frame; and generate an HDR video according to the second image frame.
In a feasible implementation manner, the processing unit 12 is configured to perform semantic segmentation on a first region to segment the first region into a plurality of sub-regions, where the plurality of sub-regions include at least two types of sub-regions, second image features of different types of sub-regions are different, the first region is any one of the plurality of regions, the first image feature and the second image feature are image features of different dimensions, determine a second enhancement processing manner of each sub-region, perform processing on each sub-region in the plurality of sub-regions by using a corresponding second enhancement processing manner, and splice the plurality of sub-regions after enhancement processing to obtain the first region.
In a feasible implementation manner, the processing unit 12 is configured to extract Y data of each pixel point of the first image frame from YUV color space data of the first image frame, determine a gradient of each pixel point according to the Y data of each pixel point of the first image frame, and divide the first image frame into a plurality of regions according to the gradient of each pixel point.
In a possible implementation manner, the processing unit 12 is further configured to convert the first image frame into a YUV image before acquiring the YUV color space data of the first image frame.
In a possible implementation manner, before processing each of the plurality of regions by using the corresponding first enhancement processing manner to obtain the plurality of regions after enhancement processing, the processing unit 12 is further configured to: obtain a histogram of each of the plurality of regions; determine a dynamic range value of the corresponding region according to each histogram; and determine the first enhancement processing manner corresponding to each region according to each dynamic range value.
In a possible implementation manner, each of the regions includes a target region, and the dynamic range values include a high exposure dynamic range value and a low brightness dynamic range value. The processing unit 12 is configured to: extract a peak value, an average value, and a region proportion of the target region, where the region proportion indicates the ratio of the high-exposure area to the low-exposure area; and determine the high exposure dynamic range value and the low brightness dynamic range value of the target region according to the peak value, the average value, and the region proportion.
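A sketch of these statistics and of one hypothetical way to combine them (the cut-off levels and the two combining formulas below are assumptions; the embodiments only name the peak value, the average value, and the high/low-exposure region proportion as inputs):

```python
import numpy as np

def dynamic_range_values(region_y: np.ndarray,
                         high_cut: int = 200, low_cut: int = 55):
    """Compute the peak, average, and region proportion of one region's Y
    values, then derive its two dynamic range values."""
    hist, _ = np.histogram(region_y, bins=256, range=(0, 256))
    peak = int(hist.argmax())                    # most frequent brightness level
    average = float(region_y.mean())             # average brightness
    high_area = int((region_y >= high_cut).sum())
    low_area = max(int((region_y <= low_cut).sum()), 1)  # avoid division by zero
    proportion = high_area / low_area            # high- vs. low-exposure ratio

    # Hypothetical combination into the two dynamic range values.
    high_exposure_dr = (255 - peak) * proportion
    low_brightness_dr = average / (1.0 + proportion)
    return high_exposure_dr, low_brightness_dr

def choose_first_enhancement(high_dr: float, low_dr: float) -> str:
    """Hypothetical mapping from the dynamic range values to a first
    enhancement manner."""
    return "compress-highlights" if high_dr > low_dr else "lift-shadows"
```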
In a possible implementation, the original video is a video currently being captured by the terminal device, or an LDR video local to the terminal device.
The HDR video generation apparatus provided in the embodiment of the present invention may perform the actions of the terminal device in the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the transceiving unit may, in actual implementation, be a transceiver, and the processing unit may be implemented in the form of software invoked by a processing element, or in hardware. For example, the processing unit may be a separately disposed processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, with a processing element of the apparatus invoking and executing the functions of the processing unit. In addition, all or some of the units may be integrated together or implemented independently. The processing element described herein may be an integrated circuit having signal processing capability. In implementation, the steps of the foregoing method or the foregoing units may be completed by hardware integrated logic circuits in a processor element or by instructions in the form of software.
For example, the above units may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). For another example, when one of the above units is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of invoking program code. For still another example, these units may be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 13 is a schematic structural diagram of a terminal device according to an embodiment of the present application, and as shown in fig. 13, the terminal device 200 includes:
a processor 21 and a memory 22;
the memory 22 stores computer-executable instructions;
the processor 21 executes the computer-executable instructions stored in the memory 22, so that the processor 21 performs the HDR video generation method performed by the terminal device described above.
For the specific implementation process of the processor 21, reference may be made to the foregoing method embodiments; the implementation principles and technical effects are similar and are not described herein again.
Optionally, the terminal device 200 further comprises a communication interface 23. The processor 21, the memory 22, and the communication interface 23 may be connected by a bus 24.
An embodiment of the present invention further provides a storage medium, where the storage medium stores computer-executable instructions that, when executed by a processor, implement the HDR video generation method performed by the terminal device.
The embodiment of the present invention further provides a computer program product; when the computer program product runs on a terminal device, the terminal device performs the HDR video generation method described above.
Fig. 14 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown in fig. 14, the terminal device 1000 includes, but is not limited to: a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, a power supply 111, and the like. Those skilled in the art will appreciate that the terminal device structure shown in fig. 14 does not constitute a limitation of the terminal device; the terminal device 1000 may include more or fewer components than shown, combine some components, or arrange the components differently. In the embodiment of the present application, the terminal device includes, but is not limited to, a mobile phone, a tablet computer, a palmtop computer, and the like.
A user input unit 107, configured to receive user input; and a display unit 106, configured to display content in response to the input received by the user input unit 107.
It should be understood that, in the embodiment of the present application, the radio frequency unit 101 may be used for receiving and transmitting signals during a message transmission or call process; specifically, it receives downlink data from a primary base station or a secondary base station and delivers the received downlink data to the processor 110 for processing, and it transmits uplink data to the primary base station or the secondary base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 may also communicate with a network and other devices through a wireless communication system.
The terminal device 1000 provides wireless broadband internet access to the user through the network module 102, for example, helping the user send and receive e-mails, browse web pages, and access streaming media.
The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the network module 102, or stored in the memory 109, into an audio signal and output it as sound. Moreover, the audio output unit 103 may provide audio output related to a specific function performed by the terminal device 1000 (for example, a call signal reception sound or a message reception sound). The audio output unit 103 includes a speaker, a buzzer, a receiver, and the like.
The input unit 104 is used to receive an audio or video signal. The input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The graphics processing unit 1041 processes image data of a picture or video captured by a camera or the like; the processed image frames may be displayed on the display unit 106, stored in the memory 109 (or another storage medium), or transmitted via the radio frequency unit 101 or the network module 102. The microphone 1042 may receive sound and process it into audio data; in a phone call mode, the processed audio data may be converted into a format transmittable to a mobile communication base station and output via the radio frequency unit 101.
The terminal device 1000 may also include at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor, which can adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor, which can turn off the display panel 1061 and/or the backlight when the terminal device 1000 moves to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used to identify the posture of the terminal device (such as landscape/portrait switching, related games, and magnetometer posture calibration) and to support vibration-recognition-related functions (such as a pedometer and tapping). The sensors 105 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and the like, which are not described in detail herein.
The display unit 106 is used to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The user input unit 107 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the terminal device. Specifically, the user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect touch operations by a user on or near it (for example, operations performed with a finger, a stylus, or any other suitable object or accessory). The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 110, and receives and executes commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1071, the user input unit 107 may include other input devices 1072, which may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick; these are not described in detail herein.
Further, the touch panel 1071 may be overlaid on the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, the touch operation is transmitted to the processor 110 to determine the type of the touch event, and the processor 110 then provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although in fig. 14 the touch panel 1071 and the display panel 1061 are shown as two independent components that implement the input and output functions of the terminal device, in some embodiments the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the terminal device; this is not limited herein.
The interface unit 108 is an interface for connecting an external device to the terminal device 1000. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (for example, data information or power) from an external device and transmit the received input to one or more elements within the terminal device 1000, or may be used to transmit data between the terminal device 1000 and an external device.
The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function and an image playing function), and the like, and the data storage area may store data created according to the use of the terminal device (such as audio data and a phonebook). Further, the memory 109 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 110 is the control center of the terminal device: it connects the various parts of the entire terminal device by using various interfaces and lines, and performs the various functions of the terminal device and processes data by running or executing the software programs and/or modules stored in the memory 109 and calling the data stored in the memory 109, thereby monitoring the terminal device as a whole. The processor 110 may include one or more processing units. Optionally, the processor 110 may integrate an application processor, which mainly handles the operating system, user interfaces, and application programs, and a modem processor, which mainly handles wireless communication. It can be appreciated that the modem processor may alternatively not be integrated into the processor 110.
Referring to fig. 14, in the embodiment of the present application, the memory 109 stores a computer program, and the processor 110 runs the computer program to enable the terminal device to perform the HDR video generation method described above.
In the embodiments of the present application, the processor may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.
In the embodiment of the present application, the memory may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, such as a random-access memory (RAM). The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory in the embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
The methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the methods may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network appliance, user equipment, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, over coaxial cable, optical fiber, or a digital subscriber line (DSL)) or in a wireless manner (for example, over infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD), among others.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (13)

1. A method for generating a High Dynamic Range (HDR) video, comprising:
the terminal equipment receives a processing request input by a user, wherein the processing request is used for requesting to generate an HDR video;
the terminal equipment divides a first image frame contained in an original video into a plurality of areas according to the processing request, wherein the plurality of areas contain at least two types of areas, and the first image characteristics of the areas in different types are different;
the terminal equipment respectively processes each area in the plurality of areas in a corresponding first enhancement processing mode to obtain a plurality of areas after enhancement processing;
splicing the multiple regions subjected to the enhancement processing by the terminal equipment to obtain a second image frame;
the terminal equipment generates HDR video according to the second image frame;
wherein after the terminal device processes each of the plurality of regions by using the corresponding first enhancement processing mode to obtain the plurality of regions after enhancement processing, the method further comprises:
The terminal equipment carries out semantic segmentation on a first region so as to segment the first region into a plurality of sub-regions, wherein the plurality of sub-regions comprise at least two types of sub-regions, second image features of different types of sub-regions are different, the first region is any one of the plurality of regions, and the first image feature and the second image feature are image features with different dimensions;
the terminal equipment determines a second enhancement processing mode of each sub-area;
the terminal equipment respectively adopts a corresponding second enhancement processing mode to process each subarea in the plurality of subareas;
and splicing the plurality of sub-areas subjected to the enhancement processing by the terminal equipment to obtain the first area.
2. The method according to claim 1, wherein the dividing, by the terminal device, a first image frame contained in an original video into a plurality of regions according to the processing request comprises:
the terminal equipment extracts Y data of each pixel point of the first image frame from the YUV color space data of the first image frame;
the terminal equipment determines the gradient of each pixel point according to the Y data of each pixel point of the first image frame;
And the terminal equipment divides the first image frame into a plurality of areas according to the gradient of each pixel point.
3. The method according to claim 2, wherein before the terminal device extracts Y data of each pixel point of the first image frame from the YUV color space data of the first image frame, the method further comprises:
and the terminal equipment converts the first image frame into a YUV image.
4. The method according to any one of claims 1 to 3, wherein before the terminal device processes each of the plurality of regions in a corresponding first enhancement processing manner to obtain the plurality of regions after enhancement processing, the method further comprises:
the terminal equipment acquires a histogram of each area in the plurality of areas;
the terminal equipment determines a dynamic range value of a corresponding area according to each histogram;
and the terminal equipment respectively determines a first enhancement processing mode corresponding to each region according to each dynamic range value.
5. The method of claim 4, wherein each region contains a target region, the dynamic range values include a high exposure dynamic range value and a low brightness dynamic range value, and the determining the dynamic range value for the corresponding region from each histogram comprises:
The terminal equipment extracts the peak value, the average value and the area ratio of the target area, wherein the area ratio is used for indicating the ratio of a high exposure area to a low exposure area;
and the terminal equipment determines the high exposure dynamic range value and the low brightness dynamic range value of the target area according to the peak value, the average value and the area ratio.
6. The method according to any one of claims 1 to 3,
the original video is a video currently shot by the terminal equipment;
or,
the original video is an LDR video local to the terminal equipment.
7. A terminal device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to perform the steps of:
receiving a processing request input by a user, wherein the processing request is used for requesting to generate HDR video;
according to the processing request, a first image frame contained in an original video is divided into a plurality of areas, the plurality of areas contain at least two types of areas, and the first image characteristics of the areas in different types are different;
Processing each of the plurality of regions by adopting a corresponding first enhancement processing mode respectively to obtain a plurality of regions after enhancement processing;
splicing the multiple regions after the enhancement processing to obtain a second image frame;
generating an HDR video from the second image frame;
wherein after the processing of each of the plurality of regions by using the corresponding first enhancement processing mode to obtain the plurality of regions after enhancement processing, the steps further comprise:
performing semantic segmentation on a first region to segment the first region into a plurality of sub-regions, wherein the plurality of sub-regions comprise at least two types of sub-regions, second image features of different types of sub-regions are different, the first region is any one of the plurality of regions, and the first image feature and the second image feature are image features with different dimensions;
determining a second enhancement processing mode of each sub-area;
processing each sub-area in the plurality of sub-areas by adopting a corresponding second enhancement processing mode respectively;
and splicing the plurality of sub-regions after the enhancement treatment to obtain the first region.
8. The terminal device according to claim 7, wherein the dividing a first image frame contained in an original video into a plurality of regions according to the processing request comprises:
extracting Y data of each pixel point of the first image frame from the YUV color space data of the first image frame;
determining a gradient of each pixel point according to the Y data of each pixel point of the first image frame;
and dividing the first image frame into the plurality of regions according to the gradient of each pixel point.
9. The terminal device according to claim 8, wherein before the Y data of each pixel point of the first image frame is extracted from the YUV color space data of the first image frame, the steps further comprise:
converting the first image frame into a YUV image.
10. The terminal device according to any one of claims 7 to 9, wherein before the processing of each of the plurality of regions by using the corresponding first enhancement processing mode, the steps further comprise:
acquiring a histogram of each of the plurality of regions;
determining a dynamic range value of a corresponding area according to each histogram;
and respectively determining a first enhancement processing mode corresponding to each region according to each dynamic range value.
11. The terminal device of claim 10, wherein each region includes a target region, the dynamic range values include a high exposure dynamic range value and a low brightness dynamic range value, and the determining the dynamic range value of the corresponding region according to each histogram comprises:
extracting a peak value, an average value and an area ratio of the target area, wherein the area ratio is used for indicating a ratio of a high exposure area to a low exposure area;
and determining the high exposure dynamic range value and the low brightness dynamic range value of the target area according to the peak value, the average value and the area ratio.
12. The terminal device according to any of claims 7-9,
the original video is a video currently shot by the terminal equipment;
or,
the original video is an LDR video local to the terminal equipment.
13. A computer storage medium having stored therein instructions that, when run on a terminal device, cause the terminal device to perform the method of any of claims 1-6.
CN201910817877.2A 2019-08-30 2019-08-30 High dynamic range video generation method and device Active CN112449120B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910817877.2A CN112449120B (en) 2019-08-30 2019-08-30 High dynamic range video generation method and device
PCT/CN2020/110825 WO2021036991A1 (en) 2019-08-30 2020-08-24 High dynamic range video generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910817877.2A CN112449120B (en) 2019-08-30 2019-08-30 High dynamic range video generation method and device

Publications (2)

Publication Number Publication Date
CN112449120A (en) 2021-03-05
CN112449120B (en) 2022-06-10

Family

ID=74683752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817877.2A Active CN112449120B (en) 2019-08-30 2019-08-30 High dynamic range video generation method and device

Country Status (2)

Country Link
CN (1) CN112449120B (en)
WO (1) WO2021036991A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114219744B (en) * 2021-11-25 2023-01-06 北京百度网讯科技有限公司 Image generation method, device, equipment and storage medium
CN116437222B (en) * 2021-12-29 2024-04-19 荣耀终端有限公司 Image processing method and electronic equipment
CN117241145A (en) * 2022-06-15 2023-12-15 荣耀终端有限公司 Terminal device and method for creating/displaying HDR image
CN115118850B (en) * 2022-06-22 2024-04-05 海信视像科技股份有限公司 Image processing method and display device
CN114897745B (en) * 2022-07-14 2022-12-20 荣耀终端有限公司 Method for expanding dynamic range of image and electronic equipment
CN117651221A (en) * 2022-08-09 2024-03-05 荣耀终端有限公司 Video processing method and electronic equipment
CN115334260B (en) * 2022-08-17 2024-02-27 深圳市元视芯智能科技有限公司 Image sensor and pixel-level exposure control method
CN115242983B (en) * 2022-09-26 2023-04-07 荣耀终端有限公司 Photographing method, electronic device and readable storage medium
CN115293994B (en) * 2022-09-30 2022-12-16 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888689B (en) * 2014-03-13 2017-10-31 北京智谷睿拓技术服务有限公司 Image-pickup method and image collecting device
CN105335123A (en) * 2015-10-30 2016-02-17 天津大学 Method for displaying rich-layer HDR (High Dynamic Range) based on LCD (Liquid Crystal Display)
CN107292829B (en) * 2016-03-31 2020-12-15 阿里巴巴集团控股有限公司 Image processing method and device
WO2018113975A1 (en) * 2016-12-22 2018-06-28 Huawei Technologies Co., Ltd. Generation of ghost-free high dynamic range images
CN107465882B (en) * 2017-09-22 2019-11-05 维沃移动通信有限公司 A kind of image capturing method and mobile terminal
CN108307109B (en) * 2018-01-16 2020-04-17 维沃移动通信有限公司 High dynamic range image preview method and terminal equipment
TW201947536A (en) * 2018-05-08 2019-12-16 華晶科技股份有限公司 Image processing method and image processing device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013191957A (en) * 2012-03-13 2013-09-26 Jvc Kenwood Corp Video signal processing device, video signal processing method, and video signal processing program
CN107862671A (en) * 2017-12-11 2018-03-30 上海顺久电子科技有限公司 A kind of processing method of image, device and television set
CN108074220A (en) * 2017-12-11 2018-05-25 上海顺久电子科技有限公司 A kind of processing method of image, device and television set
CN108495030A (en) * 2018-03-16 2018-09-04 维沃移动通信有限公司 A kind of image processing method and mobile terminal
CN108805883A (en) * 2018-06-08 2018-11-13 Oppo广东移动通信有限公司 A kind of image partition method, image segmentation device and electronic equipment
CN109544463A (en) * 2018-10-17 2019-03-29 天津大学 The inverse tone mapping (ITM) method of image content-based

Also Published As

Publication number Publication date
CN112449120A (en) 2021-03-05
WO2021036991A1 (en) 2021-03-04

Similar Documents

Publication Publication Date Title
CN112449120B (en) High dynamic range video generation method and device
WO2021052232A1 (en) Time-lapse photography method and device
CN107613191B (en) Photographing method, photographing equipment and computer readable storage medium
CN112150399B (en) Image enhancement method based on wide dynamic range and electronic equipment
US9451173B2 (en) Electronic device and control method of the same
CN107507160B (en) Image fusion method, terminal and computer readable storage medium
US20220319077A1 (en) Image-text fusion method and apparatus, and electronic device
CN109688322B (en) Method and device for generating high dynamic range image and mobile terminal
WO2019052329A1 (en) Facial recognition method and related product
CN110225237B (en) Image acquisition method and device and mobile terminal
CN108200421B (en) White balance processing method, terminal and computer readable storage medium
EP3893495A1 (en) Method for selecting images based on continuous shooting and electronic device
CN113810603B (en) Point light source image detection method and electronic equipment
CN111563466B (en) Face detection method and related product
CN109104578B (en) Image processing method and mobile terminal
US20220245778A1 (en) Image bloom processing method and apparatus, and storage medium
CN116320771B (en) Shooting method and electronic equipment
CN108616687B (en) Photographing method and device and mobile terminal
CN115526787A (en) Video processing method and device
CN110944163A (en) Image processing method and electronic equipment
CN114463191A (en) Image processing method and electronic equipment
CN116055894A (en) Image stroboscopic removing method and device based on neural network
CN115631250B (en) Image processing method and electronic equipment
CN114092366A (en) Image processing method, mobile terminal and storage medium
CN112188102A (en) Photographing method, mobile terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant