CN111754412A - Method and device for constructing data pairs and terminal equipment - Google Patents

Method and device for constructing data pairs and terminal equipment

Info

Publication number
CN111754412A
CN111754412A
Authority
CN
China
Prior art keywords
hdr
sdr
picture
sample
pictures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910249132.0A
Other languages
Chinese (zh)
Other versions
CN111754412B (en)
Inventor
孟俊彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Research America Inc
Original Assignee
TCL Research America Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Research America Inc filed Critical TCL Research America Inc
Priority to CN201910249132.0A priority Critical patent/CN111754412B/en
Priority claimed from CN201910249132.0A external-priority patent/CN111754412B/en
Publication of CN111754412A publication Critical patent/CN111754412A/en
Application granted granted Critical
Publication of CN111754412B publication Critical patent/CN111754412B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T5/90
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20172Image enhancement details
    • G06T2207/20208High dynamic range [HDR] image processing

Abstract

The application is applicable to the technical field of data processing, and provides a method, an apparatus, a terminal device and a computer-readable storage medium for constructing data pairs, comprising the following steps: acquiring a plurality of HDR sample pictures; acquiring an SDR sample picture corresponding to each of the HDR sample pictures; performing gamma correction on the SDR sample picture, and applying different tone mappings to the gamma-corrected SDR sample picture to obtain a plurality of SDR pictures; and applying the same different tone mappings to each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, wherein an SDR picture and an HDR picture obtained with the same tone mapping form one SDR and HDR data pair. Pairs of SDR and HDR data can thus be constructed by the present application.

Description

Method and device for constructing data pairs and terminal equipment
Technical Field
The present application belongs to the technical field of data processing, and in particular, to a method and an apparatus for constructing a data pair, a terminal device, and a computer-readable storage medium.
Background
Without any change of the pupil, the human eye can perceive luminance over a range of roughly 1 to 10⁵ levels. In the television field, Standard Dynamic Range (SDR) video is commonly used, with a luminance range of 1 to 10³ levels, while the more recently developed High Dynamic Range (HDR) video has a luminance range of 1 to 10⁵ levels, which covers the luminance range perceivable by the human eye. Compared with SDR video, HDR video is greatly improved in dynamic range, quantization depth, color gamut, frame rate and the like. However, how to construct pairs of SDR and HDR data is a technical problem that urgently needs to be solved.
Disclosure of Invention
In view of the above, embodiments of the present application provide a method, an apparatus, a terminal device, and a computer-readable storage medium for constructing a data pair, so as to construct an SDR and HDR data pair.
A first aspect of an embodiment of the present application provides a method for constructing a data pair, where the method includes:
acquiring a plurality of HDR sample pictures;
acquiring an SDR sample picture corresponding to each HDR sample picture in the HDR sample pictures;
gamma correcting the SDR sample picture, and carrying out different tone mapping on the SDR sample picture after gamma correction to obtain a plurality of SDR pictures;
and performing the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, wherein an SDR picture and an HDR picture obtained by using the same tone mapping are one SDR and HDR data pair.
A second aspect of an embodiment of the present application provides an apparatus for constructing a data pair, the apparatus including:
a first obtaining module, configured to obtain multiple HDR sample pictures;
a second obtaining module, configured to obtain an SDR sample picture corresponding to each HDR sample picture in the multiple HDR sample pictures;
the image processing module is used for carrying out gamma correction on the SDR sample image and carrying out different tone mapping on the SDR sample image after gamma correction to obtain a plurality of SDR images;
and the tone mapping module is used for performing the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, wherein an SDR picture and an HDR picture obtained by using the same tone mapping are one SDR and HDR data pair.
A third aspect of embodiments of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the method according to the first aspect when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, which, when executed by a processor, performs the steps of the method according to the first aspect.
A fifth aspect of the application provides a computer program product comprising a computer program which, when executed by one or more processors, performs the steps of the method as described in the first aspect above.
As can be seen from the above, after a plurality of HDR sample pictures are obtained, an SDR sample picture corresponding to each HDR sample picture is obtained, and gamma correction is performed on the SDR sample picture so that its luminance is the same as that of the HDR sample picture. Applying different tone mappings to the SDR sample picture yields a plurality of SDR pictures; at the same time, applying the same different tone mappings to each HDR sample picture yields a plurality of HDR pictures corresponding to the plurality of SDR pictures, and thus a plurality of SDR and HDR data pairs. According to this scheme, by performing different tone mappings on the HDR sample picture and the SDR sample picture, a plurality of SDR and HDR data pairs can be obtained, so that a large number of SDR and HDR data pairs can be obtained on the basis of a small number of HDR sample pictures.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic flow chart illustrating an implementation of a method for constructing a data pair according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of an implementation of a method for constructing a data pair according to a second embodiment of the present application;
FIG. 3 is a diagram of an example of the structure of a deep learning model;
FIG. 4 is a schematic diagram of an apparatus for constructing data pairs according to a third embodiment of the present application;
fig. 5 is a schematic diagram of a terminal device according to a fourth embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
It should be understood that, the sequence numbers of the steps in this embodiment do not mean the execution sequence, and the execution sequence of each process should be determined by the function and the inherent logic of the process, and should not constitute any limitation to the implementation process of the embodiment of the present application.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to fig. 1, which is a schematic flow chart of an implementation of a method for constructing a data pair provided in an embodiment of the present application. The method is applied to a terminal device and, as shown in the figure, may include the following steps:
step S101, acquiring multiple HDR sample pictures.
In the embodiment of the present application, the multiple HDR sample pictures may refer to multiple HDR pictures input by a user for acquiring multiple SDR and HDR data pairs. The plurality of HDR sample pictures are the basis for acquiring the plurality of SDR and HDR data pairs.
Step S102, acquiring an SDR sample picture corresponding to each HDR sample picture in the plurality of HDR sample pictures.
In the embodiment of the present application, an operation of discarding saturated-region information may be performed on each HDR sample picture, so as to obtain a picture with the saturated-region information lost (i.e., an SDR sample picture) corresponding to each HDR sample picture.
Optionally, the obtaining of the SDR sample picture corresponding to each HDR sample picture in the multiple HDR sample pictures includes:
normalizing the value of the pixel point in each HDR sample picture in the HDR sample pictures;
counting the pixel points in each HDR sample picture whose values are greater than a first threshold, and multiplying the values of the pixel points in each HDR sample picture by the reciprocal of the first threshold to obtain a first picture;
and setting the values of the pixel points in the first picture that are greater than 1 to 1, to obtain the SDR sample picture.
In the embodiment of the present application, the values of the pixel points in each HDR sample picture are normalized to between 0 and 1. Histogram statistics may then be performed on the pixel points of each normalized HDR sample picture, and the pixel points whose values are greater than a first threshold are counted; the region formed by these pixel points is the saturated region. The values of the pixel points are multiplied by the reciprocal of the first threshold, so that the values of the saturated pixel points become greater than 1, yielding a first picture that retains the saturated-region information (since the brightness of the saturated region in the first picture is changed, the saturated-region information can still be displayed, i.e., it is not lost). The values of the pixel points in the first picture that are greater than 1 are then set to 1, yielding an SDR sample picture in which the saturated-region information is lost (since all pixel values of the saturated region in the SDR sample picture are set to 1, the brightness of the saturated region is uniform, no saturated-region information can be displayed, i.e., the saturated-region information is lost).
The first threshold is used to determine the saturated region in each HDR sample picture: the region where the pixel points with values greater than the first threshold are located is the saturated region, and the region where the pixel points with values less than or equal to the first threshold are located is the unsaturated region. The first threshold may be preset by a user, or may be calculated according to a preset algorithm. For example, histogram statistics may be performed on the pixel points of each normalized HDR sample picture, counting the total number of pixel points and the number of pixel points at each value; from these counts, the minimum value among the pixel values in the top n percent (ordered from largest to smallest) is computed, and this minimum value is the first threshold. The user may set the value of n according to actual needs, for example a number between 5 and 15, which is not limited herein.
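The threshold selection and clipping steps above can be sketched as follows. This is an illustrative sketch only: the function and variable names are not taken from the patent, pixels are flattened into a list of normalized grayscale values, and multiplying every pixel by the reciprocal of the threshold (rather than only saturated pixels) follows the wording of the step description.

```python
def first_threshold(pixels, n_percent=10):
    """Smallest value among the brightest n percent of the pixel values."""
    ordered = sorted(pixels, reverse=True)
    top_count = max(1, len(ordered) * n_percent // 100)
    return ordered[top_count - 1]

def make_sdr_sample(pixels, threshold):
    # Multiply every value by the reciprocal of the threshold, so saturated
    # pixels (value > threshold) end up with values greater than 1.
    first_picture = [p / threshold for p in pixels]
    # Clip values above 1 to exactly 1, discarding saturated-region detail.
    sdr_sample = [min(p, 1.0) for p in first_picture]
    return first_picture, sdr_sample

hdr = [0.2, 0.5, 0.8, 0.9, 1.0]           # toy normalized HDR picture
t = first_threshold(hdr, n_percent=40)     # brightest 40% -> threshold 0.9
first_pic, sdr = make_sdr_sample(hdr, t)
```

With these toy values, the two brightest pixels exceed 1 in the first picture and are both clipped to 1 in the SDR sample picture, which is exactly the loss of saturated-region information described above.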
Step S103, gamma correction is carried out on the SDR sample picture, different tone mappings are carried out on the SDR sample picture after gamma correction, and a plurality of SDR pictures are obtained.
Step S104, performing the different tone mapping on each HDR sample picture to obtain multiple HDR pictures corresponding to the multiple SDR pictures.
Wherein, the SDR picture and the HDR picture obtained by using the same tone mapping are one SDR and HDR data pair.
Optionally, the performing the different tone mapping on each HDR sample picture comprises:
performing the different tone mapping on the first picture.
In the embodiment of the application, gamma correction is performed on the SDR sample picture so that the brightness of the SDR sample picture is the same as that of the HDR sample picture, and different tone mappings are performed on the gamma-corrected SDR sample picture and on the first picture, so that a plurality of SDR and HDR picture data pairs can be obtained. The value of a pixel point may be the gray value of the pixel point. For one HDR sample picture, multiple SDR pictures and HDR pictures with different exposure degrees can be obtained through different tone mappings. It should be noted that corresponding SDR and HDR pictures use the same tone mapping, that is, the same exposure degree. The different tone mappings may use different parameters of the same tone-mapping algorithm, or may be different tone-mapping algorithms, which is not limited herein.
Exemplarily, picture a is an HDR sample picture. The values of the pixel points in picture a are normalized to between 0 and 1, and histogram statistics are performed on the pixel points of picture a. Suppose the minimum value among the pixel values in the top 10% is 0.8 (that is, the pixel points with values above 0.8 account for 10% of the total number of pixel points in picture a). Multiplying the pixel values of picture a by 1.25 ensures that the pixel points whose values were greater than 0.8 and less than or equal to 1 now have values greater than 1, yielding a first picture. Setting the values greater than 1 in the first picture to 1 yields an SDR sample picture with the saturated-region information lost. Gamma correction is then performed on the SDR sample picture so that its brightness is the same as that of picture a. Three different tone mappings applied to the gamma-corrected SDR sample picture yield a first, a second and a third SDR picture; the same three tone mappings applied to the first picture yield a first, a second and a third HDR picture. The first SDR picture and the first HDR picture are one SDR and HDR data pair, the second SDR picture and the second HDR picture are one SDR and HDR data pair, and the third SDR picture and the third HDR picture are one SDR and HDR data pair.
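The gamma-correction and pairing steps can be sketched as below. Everything here is an assumption for illustration: the patent specifies neither a gamma value nor a tone-mapping operator, so a display gamma of 2.2 and a simple exponential exposure operator (with the exposure value standing in for "different tone mappings") are used as placeholders.

```python
import math

GAMMA = 2.2  # assumed display gamma; not specified by the patent

def gamma_correct(pixels, gamma=GAMMA):
    return [p ** (1.0 / gamma) for p in pixels]

def tone_map(pixels, exposure):
    # Exponential exposure operator used as a stand-in tone mapping;
    # different `exposure` values play the role of different tone mappings.
    return [1.0 - math.exp(-exposure * p) for p in pixels]

def build_pairs(sdr_sample, first_picture, exposures=(0.5, 1.0, 2.0)):
    sdr_corrected = gamma_correct(sdr_sample)
    # An SDR picture and an HDR picture produced with the same tone mapping
    # (here: the same exposure) form one SDR and HDR data pair.
    return [(tone_map(sdr_corrected, e), tone_map(first_picture, e))
            for e in exposures]

# One clipped SDR sample picture and its first picture (values may exceed 1):
pairs = build_pairs([0.22, 0.56, 1.0, 1.0], [0.22, 0.56, 1.0, 1.11])
```

Three exposures yield three SDR/HDR pairs from a single HDR sample picture, which is how a small set of HDR samples is expanded into many training pairs.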
According to the method and the device, different tone mapping is carried out on the HDR sample picture and the SDR sample picture, a plurality of SDR and HDR data pairs can be obtained, and therefore a large number of SDR and HDR data pairs are obtained on the basis of a small number of HDR sample pictures.
Referring to fig. 2, it is a schematic flow chart of an implementation process of a method for constructing a data pair provided in the second embodiment of the present application, where the method is applied to a terminal device, and as shown in the figure, the method may include the following steps:
in step S201, multiple HDR sample pictures are acquired.
The step is the same as step S101, and reference may be made to the related description of step S101, which is not repeated herein.
Step S202, acquiring an SDR sample picture corresponding to each HDR sample picture in the plurality of HDR sample pictures.
The step is the same as step S102, and reference may be made to the related description of step S102, which is not repeated herein.
Step S203, gamma correction is carried out on the SDR sample picture, different tone mappings are carried out on the SDR sample picture after gamma correction, and a plurality of SDR pictures are obtained.
The step is the same as step S103, and reference may be made to the related description of step S103, which is not described herein again.
Step S204, performing the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures.
Wherein, the SDR picture and the HDR picture obtained by using the same tone mapping are one SDR and HDR data pair.
The step is the same as step S104, and reference may be made to the related description of step S104, which is not repeated herein.
Step S205 trains a deep learning model according to a plurality of SDR and HDR data pairs.
In this embodiment of the application, the training of the deep learning model may be that the deep learning model learns a mapping relationship between an SDR picture (i.e., an unsaturated picture) and an HDR picture (i.e., a saturated picture), and reconstructs saturated region information of the SDR picture (i.e., recovers information lost by a saturated region in the SDR picture) through the trained deep learning model, so as to obtain the HDR picture (i.e., the SDR picture after reconstructing the saturated region information is the HDR picture).
In the embodiment of the application, a plurality of SDR and HDR data pairs can be used as training samples to train the deep learning model, adjusting the parameters of the model and improving the accuracy of converting an SDR picture into an HDR picture. In one SDR and HDR data pair, the SDR picture and the HDR picture have the same content; the difference is that the saturated-region information of the SDR picture is lost, while that of the HDR picture is not. The SDR pictures of the data pairs are used as the input of the deep learning model and the corresponding HDR pictures as the target pictures, so that the model can learn the mapping from SDR pictures to HDR pictures; the trained model then reconstructs the saturated-region information of an SDR picture, thereby obtaining an HDR picture.
Optionally, the deep learning model includes an encoding stage and a decoding stage; training a deep learning model from a plurality of SDR and HDR data pairs comprises:
convolving and downsampling an SDR picture in each SDR and HDR data pair in the encoding stage to obtain a feature map of the SDR picture in each SDR and HDR data pair;
in the decoding stage, the feature map of the SDR picture in each SDR and HDR data pair is subjected to convolution and upsampling, and a predicted HDR picture is output;
learning a difference between the predicted HDR picture and the HDR picture in each of the SDR and HDR data pairs through a preset loss function, to train the deep learning model.
The preset loss function may refer to a loss function preset by a user, and may be any loss function, such as an L1 loss function or an L2 loss function, which is not limited herein. The loss function measures the difference between the predicted HDR picture and the HDR picture (i.e., the target picture) of the SDR and HDR data pair; the deep learning model can be optimized through this function, and when the loss function reaches a convergence state, the deep learning model has been trained.
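The two loss functions mentioned above can be written out directly. A minimal sketch, assuming mean-reduced losses over pictures flattened into lists of pixel values; the function names are illustrative.

```python
def l2_loss(predicted, target):
    """Mean squared error between a predicted HDR picture and the target."""
    assert len(predicted) == len(target)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

def l1_loss(predicted, target):
    """Mean absolute error, the other candidate mentioned in the text."""
    assert len(predicted) == len(target)
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)
```

During training, the chosen loss is evaluated on each predicted/target pair and minimized by iterative parameter updates until it converges.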
As shown in fig. 3, which is a diagram illustrating an example structure of the deep learning model, the model adopted in the embodiment of the present application mainly includes a coding stage and a decoding stage. In the coding stage, convolutions with a stride of 2 are used to perform feature extraction and downsampling on each SDR video frame input to the model, to obtain feature maps of a preset size (a preset fraction of the size of the SDR video frame, given by the formula of image BDA0002011890710000091). In the decoding stage, the picture is reconstructed by upsampling and convolution to obtain the output HDR video frame. In order to reduce the information loss caused by downsampling, a concat layer is added to the deep learning model, splicing the feature maps of the encoding stage together with the corresponding feature maps of the decoding stage. The deep learning model in the embodiment of the application uses upsampling followed by convolution instead of the deconvolution layer commonly used in the deep learning field, which avoids grid artifacts in the reconstructed picture. Finally, the deep learning model is trained by continuous iteration, learning the difference between the output picture and the target picture through an L2 loss function.
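The "upsample then convolve" choice can be illustrated with the upsampling half of that operation. A hedged sketch: nearest-neighbour upsampling repeats existing pixel values rather than interleaving zeros the way a transposed convolution does, so a following ordinary convolution sees evenly weighted inputs and the grid (checkerboard) artifacts are avoided. The function below is purely illustrative and not the patent's implementation.

```python
def upsample_nearest(image, factor=2):
    """Nearest-neighbour upsampling: repeat each pixel `factor` times
    horizontally and each row `factor` times vertically."""
    out = []
    for row in image:
        # Widen the row by repeating each pixel value.
        wide = [p for p in row for _ in range(factor)]
        # Repeat the widened row vertically (copies, to avoid aliasing).
        out.extend(list(wide) for _ in range(factor))
    return out

up = upsample_nearest([[1, 2],
                       [3, 4]])
```

An ordinary stride-1 convolution would then be applied to `up` to produce the reconstructed feature map, replacing the single transposed-convolution layer.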
Optionally, the embodiment of the present application further includes:
segmenting the SDR video into a plurality of SDR video frames;
processing each SDR video frame in the plurality of SDR video frames by using the deep learning model to obtain a corresponding HDR video frame, wherein the plurality of SDR video frames correspond to a plurality of HDR video frames;
the plurality of HDR video frames are composited into an HDR video.
Optionally, the segmenting the SDR video into a plurality of SDR video frames comprises:
slicing the SDR video into a plurality of SDR video frames by FFmpeg;
compositing the plurality of HDR video frames into an HDR video comprises:
the plurality of HDR video frames are composited into an HDR video by FFmpeg.
In this embodiment of the application, the terminal device may first acquire an SDR video to be converted into an HDR video (for example, from a network or from a storage device), and split the SDR video into a plurality of SDR video frames in a preset splitting manner. A video is typically composed of multiple frames; each SDR video frame is a still picture, and displaying SDR video frames in rapid succession produces moving SDR video. The preset splitting manner may refer to a manner preset by a user for splitting an SDR video into SDR video frames, including but not limited to FFmpeg. FFmpeg is a set of open-source computer programs that can be used to record and convert digital audio and video and to convert them into streams, and includes the leading audio/video coding library libavcodec, among others.
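As one illustration of the FFmpeg-based splitting and compositing described here, the command lines below sketch the two steps. File names, frame rate, and codec options are assumptions for illustration, not values taken from the patent.

```shell
# Split an SDR video into individual still frames (one PNG per video frame),
# numbered in playing order so the sequence can be reassembled later:
ffmpeg -i sdr_input.mp4 frames/frame_%05d.png

# ...each frame is then converted to an HDR frame by the deep learning model...

# Reassemble the processed HDR frames into a video, in the same order and at
# an assumed frame rate of 25 fps:
ffmpeg -framerate 25 -i hdr_frames/frame_%05d.png -c:v libx265 hdr_output.mp4
```

The `%05d` pattern is FFmpeg's standard image-sequence numbering, which preserves the playing order of the frames across the split/composite round trip.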
In the embodiment of the application, the SDR video can be sequentially segmented into a plurality of SDR video frames according to the playing sequence of the SDR video frames, and each SDR video frame in the plurality of SDR video frames is sequentially processed by using the trained deep learning model.
It should be noted that before the terminal device divides the SDR video into multiple SDR video frames, the user may select whether the video conversion function should be turned on. When the user selects "yes", the terminal device turns on the video conversion function and divides the SDR video into multiple SDR video frames; when the user selects "no", the terminal device does not turn on the video conversion function, that is, it neither converts the SDR video into an HDR video nor divides the SDR video into SDR video frames. A physical button or a virtual button may be provided on the terminal device, through which the user selects whether to start the video conversion function. The video conversion function refers to the function of converting SDR video into HDR video.
One SDR video frame can be processed by a deep learning model to obtain one HDR video frame, and then a plurality of SDR video frames can be processed by the deep learning model to obtain a plurality of HDR video frames, namely the plurality of SDR video frames correspond to the plurality of HDR video frames.
In the embodiment of the application, after the SDR video is divided into a plurality of SDR video frames, each SDR video frame is processed by the trained deep learning model, and the saturated-region information of each SDR video frame can be recovered, thereby obtaining the HDR video frame corresponding to each SDR video frame. The deep learning model is used to convert SDR video frames into HDR video frames. The saturated region may be the region where the pixel points in the SDR video frame whose values are greater than a preset saturation threshold are located; the information of the saturated region is the saturated-region information, which may also be called the saturated-region details. For example, an SDR video frame may contain relatively bright regions such as the sun or street lamps, whose details are usually overexposed and cannot be clearly displayed in the SDR video frame (for example, clouds around sunlight). These details can be recovered by processing the SDR video frame with the deep learning model; after the details are recovered, the SDR video frame becomes an HDR video frame. The preset saturation threshold may be a threshold preset by a user for determining the saturated region.
In the embodiment of the application, after a plurality of HDR video frames are acquired, they may be synthesized into an HDR video using FFmpeg, in the order in which the HDR video frames were obtained. For example, an SDR video is sequentially divided according to its playing sequence into a first, a second and a third SDR video frame (i.e., when the SDR video is played, the first SDR video frame is played first, the second next, and the third last). The three SDR video frames are sequentially processed by the trained deep learning model to obtain the corresponding first, second and third HDR video frames, which are then combined into an HDR video: when the combined HDR video is played, the first HDR video frame is played first, the second next, and the third last.
In the embodiment of the application, the deep learning model can be trained on the acquired plurality of SDR and HDR data pairs, with the deconvolution layer commonly used in the deep learning field replaced by upsampling followed by convolution. This allows the deep learning model to be trained better, avoids the appearance of grid artifacts in the reconstructed pictures when the trained model converts an SDR video into an HDR video, and improves the picture quality of the HDR video.
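The upsampling-plus-convolution replacement for a deconvolution layer can be sketched in NumPy as a single-channel toy version (a real model would use a deep learning framework and learned kernels; the mean-filter kernel here is only illustrative):

```python
import numpy as np

def upsample_nearest(x, scale=2):
    """Nearest-neighbour upsampling: each pixel becomes a scale x scale block."""
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

def conv2d_same(x, kernel):
    """Naive single-channel 'same' convolution with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * kernel).sum()
    return out

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
up = upsample_nearest(x)                              # 4x4, uniform blocks
smooth = conv2d_same(up, np.full((3, 3), 1.0 / 9.0))  # 3x3 mean filter
```

Because the upsampling step places every output pixel uniformly before the convolution is applied, the output cannot exhibit the periodic overlap pattern (checkerboard / grid artifacts) that strided deconvolution can introduce.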
Referring to fig. 4, which is a schematic diagram of an apparatus for constructing data pairs provided in the third embodiment of the present application; for convenience of description, only the parts related to the third embodiment are shown.
The device comprises:
a first obtaining module 41, configured to obtain multiple HDR sample pictures;
a second obtaining module 42, configured to obtain an SDR sample picture corresponding to each HDR sample picture in the multiple HDR sample pictures;
a picture processing module 43, configured to perform gamma correction on the SDR sample picture, and perform different tone mapping on the SDR sample picture after gamma correction, to obtain multiple SDR pictures;
a tone mapping module 44, configured to perform the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, where an SDR picture and an HDR picture obtained using the same tone mapping form one SDR and HDR data pair.
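The gamma-correction and tone-mapping steps above might look as follows. The gamma value and the two tone-mapping operators (Reinhard and linear scaling) are illustrative choices; the patent does not name specific operators:

```python
import numpy as np

def gamma_correct(img, gamma=2.2):
    """Standard gamma correction on a [0, 1] image."""
    return np.clip(img, 0.0, 1.0) ** (1.0 / gamma)

def reinhard(img):
    """A simple global tone-mapping operator."""
    return img / (1.0 + img)

def scale_tmo(img, k=0.8):
    """Another (trivial) tone mapping: linear scaling then clipping."""
    return np.clip(k * img, 0.0, 1.0)

sdr = gamma_correct(np.array([0.0, 0.25, 1.0]))
# Applying each tone mapping to the gamma-corrected SDR picture (and the
# same operators to the HDR sample picture) yields one data pair per operator.
pairs_sdr = [reinhard(sdr), scale_tmo(sdr)]
```

Using the same operator on both the SDR picture and its HDR counterpart is what makes the resulting pictures a matched SDR and HDR data pair.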
Optionally, the second obtaining module 42 includes:
a normalization unit, configured to normalize a value of a pixel in each HDR sample picture in the plurality of HDR sample pictures;
a pixel processing unit, configured to count the pixels whose values in each HDR sample picture are greater than a first threshold, and multiply the values of the pixels in each HDR sample picture by the reciprocal of the first threshold to obtain a first picture;
a picture acquisition unit, configured to set to 1 the values of the pixels in the first picture that are greater than 1, to obtain an SDR sample picture;
the tone mapping module 44 is specifically configured to:
performing the different tone mapping on the first picture.
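The normalize / rescale / clip procedure of the second obtaining module can be sketched as below; the first-threshold value of 0.8 and the function name are assumptions:

```python
import numpy as np

def make_sdr_sample(hdr, first_threshold=0.8):
    """Derive the 'first picture' and an SDR sample from an HDR sample.

    1. Normalize pixel values to [0, 1].
    2. Multiply by the reciprocal of the first threshold, pushing the
       pixels that exceeded the threshold above 1 (the first picture).
    3. Clip values above 1 to 1, discarding the highlight detail
       (the SDR sample picture).
    """
    normalized = hdr / hdr.max()
    first_picture = normalized * (1.0 / first_threshold)
    sdr_sample = np.minimum(first_picture, 1.0)
    return first_picture, sdr_sample

hdr = np.array([0.4, 0.8, 1.6])
first, sdr = make_sdr_sample(hdr)
```

The clipping in step 3 is exactly the loss of saturated-region detail that the deep learning model is later trained to undo.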
Optionally, the apparatus further comprises:
a model training module 45 for training the deep learning model from the plurality of SDR and HDR data pairs.
Optionally, the deep learning model includes an encoding stage and a decoding stage; the model training module 45 includes:
a feature map acquisition unit, configured to convolve and downsample the SDR picture in each SDR and HDR data pair in the encoding stage, to obtain a feature map of the SDR picture in each SDR and HDR data pair;
a picture output unit, configured to, in the decoding stage, perform convolution and upsampling on a feature map of an SDR picture in each of the SDR and HDR data pairs, and output a predicted HDR picture;
a difference prediction unit, configured to learn a difference between the predicted HDR picture and the HDR picture in each SDR and HDR data pair through a preset loss function, and train the deep learning model.
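A minimal sketch of the encoding/decoding building blocks and the preset loss. An L2 loss is assumed here, since the patent only says "preset loss function", and the pooling/upsampling choices are likewise illustrative:

```python
import numpy as np

def downsample2x(x):
    """2x2 average pooling -- the downsampling half of the encoding stage."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(x):
    """Nearest-neighbour upsampling -- the decoding-stage counterpart."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def l2_loss(pred, target):
    """Mean squared difference between predicted and ground-truth HDR pictures."""
    return float(np.mean((pred - target) ** 2))

x = np.arange(16, dtype=float).reshape(4, 4)
feat = downsample2x(x)      # coarse feature map (encoding stage)
recon = upsample2x(feat)    # back to input resolution (decoding stage)
loss = l2_loss(recon, x)    # difference the training step would minimize
```

In the actual model the pooling and upsampling are interleaved with learned convolutions, and the loss is backpropagated to train those convolutions.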
Optionally, the apparatus further comprises:
a video slicing module 46 for slicing the SDR video into a plurality of SDR video frames;
a video frame processing module 47, configured to process each SDR video frame in the multiple SDR video frames by using the deep learning model to obtain a corresponding HDR video frame, where the multiple SDR video frames correspond to multiple HDR video frames;
a video composition module 48 for composing the plurality of HDR video frames into an HDR video.
Optionally, the video segmentation module 46 is specifically configured to:
the SDR video is sliced by FFmpeg into a plurality of SDR video frames.
Optionally, the video composition module 48 is specifically configured to:
the plurality of HDR video frames are composited into an HDR video by FFmpeg.
The apparatus provided in the embodiment of the present application may be applied to the first method embodiment and the second method embodiment, and for details, reference is made to the description of the first method embodiment and the second method embodiment, and details are not repeated here.
Fig. 5 is a schematic diagram of a terminal device according to a fourth embodiment of the present application. As shown in fig. 5, the terminal device 5 of this embodiment includes: a processor 50, a memory 51 and a computer program 52 stored in said memory 51 and executable on said processor 50. The processor 50, when executing the computer program 52, implements the steps in the above-described embodiments of the method of constructing data pairs, such as the steps S101 to S104 shown in fig. 1. Alternatively, the processor 50, when executing the computer program 52, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 41 to 48 shown in fig. 4.
Illustratively, the computer program 52 may be partitioned into one or more modules/units, which are stored in the memory 51 and executed by the processor 50 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 52 in the terminal device 5. For example, the computer program 52 may be divided into a first acquisition module, a second acquisition module, a picture processing module, a tone mapping module, a model training module, a video segmentation module, a video frame processing module, and a video composition module, and each module has the following specific functions:
a first obtaining module, configured to obtain multiple HDR sample pictures;
a second obtaining module, configured to obtain an SDR sample picture corresponding to each HDR sample picture in the multiple HDR sample pictures;
a picture processing module, configured to perform gamma correction on the SDR sample picture and perform different tone mapping on the gamma-corrected SDR sample picture to obtain a plurality of SDR pictures;
and a tone mapping module, configured to perform the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, where an SDR picture and an HDR picture obtained using the same tone mapping form one SDR and HDR data pair.
Optionally, the second obtaining module includes:
a normalization unit, configured to normalize a value of a pixel in each HDR sample picture in the plurality of HDR sample pictures;
a pixel processing unit, configured to count the pixels whose values in each HDR sample picture are greater than a first threshold, and multiply the values of the pixels in each HDR sample picture by the reciprocal of the first threshold to obtain a first picture;
a picture acquisition unit, configured to set to 1 the values of the pixels in the first picture that are greater than 1, to obtain an SDR sample picture;
the tone mapping module is specifically configured to:
performing the different tone mapping on the first picture.
Optionally, the model training module is configured to train a deep learning model according to a plurality of SDR and HDR data pairs.
Optionally, the deep learning model includes an encoding stage and a decoding stage; the model training module 45 includes:
a feature map acquisition unit, configured to convolve and downsample the SDR picture in each SDR and HDR data pair in the encoding stage, to obtain a feature map of the SDR picture in each SDR and HDR data pair;
a picture output unit, configured to, in the decoding stage, perform convolution and upsampling on a feature map of an SDR picture in each of the SDR and HDR data pairs, and output a predicted HDR picture;
a difference prediction unit, configured to learn a difference between the predicted HDR picture and the HDR picture in each SDR and HDR data pair through a preset loss function, and train the deep learning model.
a video segmentation module, configured to segment the SDR video into a plurality of SDR video frames;
a video frame processing module, configured to process each SDR video frame in the multiple SDR video frames using the deep learning model to obtain a corresponding HDR video frame, where the multiple SDR video frames correspond to multiple HDR video frames;
a video composition module to compose the plurality of HDR video frames into an HDR video.
Optionally, the video segmentation module is specifically configured to:
the SDR video is sliced by FFmpeg into a plurality of SDR video frames.
Optionally, the video composition module is specifically configured to:
the plurality of HDR video frames are composited into an HDR video by FFmpeg.
The terminal device 5 may be a desktop computer, a notebook, a palmtop computer, a cloud server, a television, or another computing device. The terminal device may include, but is not limited to, the processor 50 and the memory 51. Those skilled in the art will appreciate that fig. 5 is merely an example of the terminal device 5 and does not constitute a limitation on it: the terminal device may include more or fewer components than shown, combine some components, or use different components; for example, it may also include input-output devices, network access devices, buses, and the like.
The processor 50 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 51 may be an internal storage unit of the terminal device 5, such as a hard disk or a memory of the terminal device 5. The memory 51 may also be an external storage device of the terminal device 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the terminal device 5. The memory 51 is used for storing the computer program and other programs and data required by the terminal device. The memory 51 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods in the embodiments described above may be realized by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, realizes the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of constructing a data pair, the method comprising:
acquiring a plurality of HDR sample pictures;
acquiring an SDR sample picture corresponding to each HDR sample picture in the HDR sample pictures;
performing gamma correction on the SDR sample picture, and performing different tone mapping on the gamma-corrected SDR sample picture to obtain a plurality of SDR pictures;
and performing the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, wherein an SDR picture and an HDR picture obtained using the same tone mapping form one SDR and HDR data pair.
2. The method of claim 1, wherein the obtaining the SDR sample picture corresponding to each HDR sample picture in the plurality of HDR sample pictures comprises:
normalizing the value of the pixel point in each HDR sample picture in the HDR sample pictures;
counting the pixels whose values in each HDR sample picture are greater than a first threshold, and multiplying the values of the pixels in each HDR sample picture by the reciprocal of the first threshold to obtain a first picture;
setting to 1 the values of the pixels in the first picture that are greater than 1, to obtain an SDR sample picture;
correspondingly, the performing the different tone mapping on each HDR sample picture comprises:
performing the different tone mapping on the first picture.
3. The method of claim 1, wherein the method further comprises:
a deep learning model is trained from a plurality of SDR and HDR data pairs.
4. The method of claim 3, wherein the deep learning model comprises an encoding phase and a decoding phase; the training a deep learning model from a plurality of SDR and HDR data pairs comprises:
convolving and downsampling an SDR picture in each SDR and HDR data pair in the encoding stage to obtain a feature map of the SDR picture in each SDR and HDR data pair;
in the decoding stage, the feature map of the SDR picture in each SDR and HDR data pair is subjected to convolution and upsampling, and a predicted HDR picture is output;
learning a difference between the predicted HDR picture and the HDR picture in each of the SDR and HDR data pairs through a preset loss function, and training the deep learning model.
5. The method of claim 4, wherein the method further comprises:
segmenting the SDR video into a plurality of SDR video frames;
processing each SDR video frame in the plurality of SDR video frames by using the deep learning model to obtain a corresponding HDR video frame, wherein the plurality of SDR video frames correspond to a plurality of HDR video frames;
the plurality of HDR video frames are composited into an HDR video.
6. The method of claim 5, wherein the slicing the SDR video into a plurality of SDR video frames comprises:
slicing the SDR video into a plurality of SDR video frames by FFmpeg;
compositing the plurality of HDR video frames into an HDR video comprises:
the plurality of HDR video frames are composited into an HDR video by FFmpeg.
7. An apparatus for constructing data pairs, the apparatus comprising:
a first obtaining module, configured to obtain multiple HDR sample pictures;
a second obtaining module, configured to obtain an SDR sample picture corresponding to each HDR sample picture in the multiple HDR sample pictures;
a picture processing module, configured to perform gamma correction on the SDR sample picture and perform different tone mapping on the gamma-corrected SDR sample picture to obtain a plurality of SDR pictures;
and a tone mapping module, configured to perform the different tone mapping on each HDR sample picture to obtain a plurality of HDR pictures corresponding to the plurality of SDR pictures, wherein an SDR picture and an HDR picture obtained using the same tone mapping form one SDR and HDR data pair.
8. The apparatus of claim 7, wherein the second obtaining module comprises:
a normalization unit, configured to normalize a value of a pixel in each HDR sample picture in the plurality of HDR sample pictures;
a pixel processing unit, configured to count the pixels whose values in each HDR sample picture are greater than a first threshold, and multiply the values of the pixels in each HDR sample picture by the reciprocal of the first threshold to obtain a first picture;
a picture acquisition unit, configured to set to 1 the values of the pixels in the first picture that are greater than 1, to obtain an SDR sample picture;
the tone mapping module is specifically configured to:
performing the different tone mapping on the first picture.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN201910249132.0A 2019-03-29 Method and device for constructing data pair and terminal equipment Active CN111754412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910249132.0A CN111754412B (en) 2019-03-29 Method and device for constructing data pair and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910249132.0A CN111754412B (en) 2019-03-29 Method and device for constructing data pair and terminal equipment

Publications (2)

Publication Number Publication Date
CN111754412A true CN111754412A (en) 2020-10-09
CN111754412B CN111754412B (en) 2024-04-19


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738392A (en) * 2020-12-24 2021-04-30 上海哔哩哔哩科技有限公司 Image conversion method and system
US11803946B2 (en) * 2020-09-14 2023-10-31 Disney Enterprises, Inc. Deep SDR-HDR conversion

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103843058A (en) * 2011-09-27 2014-06-04 皇家飞利浦有限公司 Apparatus and method for dynamic range transforming of images
CN104995903A (en) * 2013-02-21 2015-10-21 皇家飞利浦有限公司 Improved HDR image encoding and decoding methods and devices
CN106210921A (en) * 2016-08-12 2016-12-07 深圳创维-Rgb电子有限公司 A kind of image effect method for improving and device thereof
CN106233706A (en) * 2014-02-25 2016-12-14 苹果公司 For providing the apparatus and method of the back compatible of the video with standard dynamic range and HDR
CN106686320A (en) * 2017-01-22 2017-05-17 宁波星帆信息科技有限公司 Tone mapping method based on numerical density balance
WO2017082175A1 (en) * 2015-11-12 2017-05-18 Sony Corporation Information processing apparatus, information recording medium, information processing method, and program
CN107005716A (en) * 2014-10-10 2017-08-01 皇家飞利浦有限公司 Specified for the saturation degree processing that dynamic range maps
CN107968919A (en) * 2016-10-20 2018-04-27 汤姆逊许可公司 Method and apparatus for inverse tone mapping (ITM)
CN108769804A (en) * 2018-04-25 2018-11-06 杭州当虹科技股份有限公司 A kind of format conversion method of high dynamic range video
US20190020852A1 (en) * 2017-07-13 2019-01-17 Samsung Electronics Co., Ltd. Electronics apparatus, display apparatus and control method thereof



Similar Documents

Publication Publication Date Title
Zhang et al. Single image defogging based on multi-channel convolutional MSRCR
CN107358586B (en) Image enhancement method, device and equipment
US9299317B2 (en) Local multiscale tone-mapping operator
CN111080541B (en) Color image denoising method based on bit layering and attention fusion mechanism
EP4198875A1 (en) Image fusion method, and training method and apparatus for image fusion model
CN110675336A (en) Low-illumination image enhancement method and device
WO2020152521A1 (en) Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures
CN113518185B (en) Video conversion processing method and device, computer readable medium and electronic equipment
CN111079764B (en) Low-illumination license plate image recognition method and device based on deep learning
JP6502947B2 (en) Method and device for tone mapping high dynamic range images
Panetta et al. Tmo-net: A parameter-free tone mapping operator using generative adversarial network, and performance benchmarking on large scale hdr dataset
CN108510557B (en) Image tone mapping method and device
CN112348747A (en) Image enhancement method, device and storage medium
An et al. Single-shot high dynamic range imaging via deep convolutional neural network
CN111757172A (en) HDR video acquisition method, HDR video acquisition device and terminal equipment
Zhang et al. Deep tone mapping network in HSV color space
CN112019827B (en) Method, device, equipment and storage medium for enhancing video image color
Eilertsen The high dynamic range imaging pipeline
CN110717864B (en) Image enhancement method, device, terminal equipment and computer readable medium
US20140092116A1 (en) Wide dynamic range display
CN105979283A (en) Video transcoding method and device
Al-Samaraie A new enhancement approach for enhancing image of digital cameras by changing the contrast
Wang et al. Learning a self‐supervised tone mapping operator via feature contrast masking loss
CN111833262A (en) Image noise reduction method and device and electronic equipment
CN116468636A (en) Low-illumination enhancement method, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 516006 TCL science and technology building, No. 17, Huifeng Third Road, Zhongkai high tech Zone, Huizhou City, Guangdong Province

Applicant after: TCL Technology Group Co.,Ltd.

Address before: 516006 Guangdong province Huizhou Zhongkai hi tech Development Zone No. nineteen District

Applicant before: TCL Corp.

Country or region before: China
GR01 Patent grant