WO2024119922A1 - Method for processing video, display device, and storage medium - Google Patents
Method for processing video, display device, and storage medium
- Publication number
- WO2024119922A1 (PCT/CN2023/116769; CN2023116769W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame image
- video frame
- video
- area
- dynamic range
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/44—Receiver circuitry for the reception of television signals according to analogue transmission standards
- H04N5/57—Control of contrast or brightness
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/70—Circuitry for compensating brightness variation in the scene
- H04N23/741—Circuitry for compensating brightness variation in the scene by increasing the dynamic range of the image compared to the dynamic range of the electronic image sensors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
- H04N23/84—Camera processing pipelines; Components thereof for processing colour signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/20—Circuitry for controlling amplitude response
- H04N5/202—Gamma control
Definitions
- the present application relates to the field of image processing technology, and in particular to a method for processing a video, a display device, and a storage medium.
- compared with standard dynamic range (SDR) video, high-dynamic range (HDR) video has clearer light and dark levels and richer image details, and can reproduce real scenes more realistically, providing users with a better video viewing experience.
- SDR standard dynamic range
- HDR high-dynamic range
- the brightness capability of display devices continues to increase.
- the brightness capability of a display device is usually expressed by its dynamic range: the higher the dynamic range of a display device, the stronger its brightness capability.
- the present application provides a method for processing a video, a display device, and a storage medium.
- by determining the brightness information of the original video and combining it with the brightness capability of the current display device, an actual expandable dynamic range is determined, and tone mapping is performed based on that expandable dynamic range, thereby genuinely improving the dynamic range.
- the present application provides a method for processing a video, the method being applied to a display device, the method comprising:
- Obtain a decoded video of an original video, wherein the decoded video includes a plurality of video frame images; determine brightness information corresponding to each video frame image; obtain the brightness capability of a display device and the dynamic range of the original video, and calculate an expandable dynamic range of each video frame image according to the brightness capability of the display device and the dynamic range of the original video; perform tone mapping on each video frame image using the brightness information and the expandable dynamic range of that video frame image to obtain a corresponding enhanced image, wherein the dynamic range of the enhanced image is greater than the dynamic range of the video frame image.
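To make the claimed steps concrete, here is a minimal end-to-end sketch in Python/NumPy. It is not the patent's implementation: the peak-brightness values, the mean-of-channels brightness measure, and the single-threshold tone map are illustrative assumptions, with the more detailed variants described in the later sections.

```python
import numpy as np

def enhance_video(frames, display_peak_nits=600.0, video_peak_nits=200.0,
                  threshold=180):
    """Sketch of the claimed pipeline: derive per-frame brightness
    information, then tone-map the bright region by a coefficient derived
    from the display's brightness capability versus the original video's
    dynamic range. All parameter values are illustrative assumptions."""
    gain = display_peak_nits / video_peak_nits        # crude mapping coefficient
    enhanced = []
    for frame in frames:                              # frame: HxWx3 uint8 RGB
        luma = frame.astype(np.float32).mean(axis=2)  # stand-in brightness info
        bright = luma >= threshold                    # candidate extended area
        out = frame.astype(np.float32)
        out[bright] *= gain                           # tone mapping
        enhanced.append(np.clip(out, 0, 255).astype(np.uint8))
    return enhanced                                   # enhanced images
```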
- the display device provided in the embodiments of the present application may include a mobile phone, a tablet computer, a wearable device, a television, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), various camera devices, and the like.
- AR augmented reality
- VR virtual reality
- UMPC ultra-mobile personal computer
- PDA personal digital assistant
- various camera devices etc.
- the embodiments of the present application do not impose any limitation on the specific type of display device, as other devices or apparatuses capable of performing image processing may be used.
- the original video may include an SDR video.
- the SDR video may be a video played online by a display device.
- the method for processing video provided in the first aspect fully considers the brightness capability of the display device and the dynamic range of the original video when determining the expandable dynamic range of each video frame image, and then uses that expandable dynamic range to tone-map each video frame image, so that the dynamic range of each frame is genuinely improved; that is, the enhanced image is the image after the dynamic range is improved. The dynamic range of the video composed of the multiple enhanced images is therefore also improved, so that when this video is displayed on the display device, the display effect and the user experience are improved.
- the method for processing a video provided in the present application may further include generating an enhanced video based on a plurality of enhanced images.
- the enhanced video is an extended dynamic range video, that is, the enhanced video is a video after the dynamic range is extended.
- the enhanced images corresponding to each video frame image are arranged in chronological order to obtain an enhanced video, which can be played on a display device.
- the enhanced image corresponding to each video frame image may be displayed on the display device as it is produced. Since the display device processes frames very quickly, the user perceives smooth video playback, and the video being watched is one whose dynamic range has been enhanced, which improves the user experience.
- the brightness information may include a brightness value
- determining the brightness information corresponding to each video frame image includes: determining a brightness value for each video frame image.
- the brightness information may include a brightness histogram
- determining the brightness information corresponding to each video frame image includes: generating a brightness histogram for each video frame image.
- tone mapping is performed on each video frame image using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image, including: determining the standard dynamic area and the extended dynamic area corresponding to each video frame image using the brightness information corresponding to each video frame image; determining the first coefficient corresponding to the standard dynamic area of each video frame image and the second coefficient corresponding to the extended dynamic area of each video frame image according to the expandable dynamic range of each video frame image; tone mapping is performed on the pixel points in the standard dynamic area according to the first coefficient, and tone mapping is performed on the pixel points in the extended dynamic area according to the second coefficient, to obtain an enhanced image corresponding to each video frame image.
- the standard dynamic area includes several pixels in the video frame image, and the extended dynamic area also includes several pixels in the video frame image, wherein the brightness information of the pixels in the standard dynamic area is less than a preset threshold, and the brightness information of the pixels in the extended dynamic area is greater than or equal to the preset threshold.
- the preset threshold may be set by a user, may be calculated by a grayscale ratio, or may be determined by a machine learning model.
- the first coefficient is smaller than the second coefficient.
- the brightness information corresponding to each video frame image is used to determine the adjustable areas in that image, namely the standard dynamic area and the extended dynamic area; different coefficients are determined for the two areas, and the pixel values in each area are then adjusted according to its coefficient (the first coefficient for the standard dynamic area and the second coefficient for the extended dynamic area), so that the dynamic range of the finally generated enhanced image is genuinely improved. The dynamic range of the video composed of multiple enhanced images is then also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect and the user experience are improved.
- tone mapping is performed on pixels in a standard dynamic area according to a first coefficient
- tone mapping is performed on pixels in an extended dynamic area according to a second coefficient to obtain an enhanced image corresponding to each video frame image, including: calculating a first product of an original pixel value of each pixel in the standard dynamic area and a first coefficient, and updating the pixel value of each pixel in the standard dynamic area according to the first product; calculating a second product of an original pixel value of each pixel in the extended dynamic area and a second coefficient, and updating the pixel value of each pixel in the extended dynamic area according to the second product; generating an enhanced image corresponding to each video frame image according to each updated pixel in the standard dynamic area and each updated pixel in the extended dynamic area.
- the pixel values of the pixels in different areas are adjusted respectively according to different coefficients (such as the first coefficient and the second coefficient).
- the first coefficient is less than the second coefficient and the second coefficient is 1, the pixel values of the pixels in the extended dynamic area are maintained, and the pixel values of the pixels in the standard dynamic area are reduced, so that the dynamic range of the enhanced image generated finally is truly improved.
- the first coefficient is less than the second coefficient and the first coefficient is 1, the pixel values of the pixels in the standard dynamic area are maintained, and the pixel values of the pixels in the extended dynamic area are increased, so that the dynamic range of the enhanced image generated finally is truly improved.
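As a hedged illustration of this two-coefficient mapping, the NumPy sketch below scales the two areas by their respective coefficients; the preset threshold and coefficient values are assumptions, and `brightness` stands for the per-pixel brightness information determined earlier.

```python
import numpy as np

def two_area_tone_map(frame, brightness, preset_threshold=128,
                      first_coeff=1.0, second_coeff=1.5):
    """Scale standard-dynamic-area pixels by first_coeff and
    extended-dynamic-area pixels by second_coeff (first < second).
    With first_coeff == 1, the standard area keeps its pixel values and
    the extended area is brightened, matching one of the cases above."""
    standard = brightness < preset_threshold      # standard dynamic area
    extended = ~standard                          # extended dynamic area
    out = frame.astype(np.float32)
    out[standard] *= first_coeff                  # first product
    out[extended] *= second_coeff                 # second product
    return np.clip(out, 0, 255).astype(np.uint8)  # enhanced image
```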
- tone mapping is performed on each video frame image using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image, including: using the brightness information corresponding to each video frame image to determine the low grayscale area, medium grayscale area and high grayscale area corresponding to each video frame image; according to the expandable dynamic range of each video frame image, adjusting the brightness of at least one of the low grayscale area, the medium grayscale area and the high grayscale area to obtain an enhanced image corresponding to each video frame image.
- the current brightness of the low grayscale area and the medium grayscale area is maintained, and the brightness of the high grayscale area is increased.
- the current brightness of the medium grayscale area and the high grayscale area is maintained, and the brightness of the low grayscale area is reduced.
- the brightness histogram of each video frame image can be divided by T1 and T2 to obtain a low grayscale area, a medium grayscale area, and a high grayscale area.
- the values of T1 and T2 can be set by the user, or can be set according to the ratio of the brightness grayscale distribution, or can be determined by a machine learning model.
- the brightness information corresponding to each video frame image is used to determine the adjustable areas in that image, namely the low grayscale area, the medium grayscale area, and the high grayscale area; then, according to the expandable dynamic range of each video frame image, the brightness of at least one of these areas is adjusted. For example, the current brightness of the low and medium grayscale areas is maintained while the brightness of the high grayscale area is increased; or the current brightness of the medium and high grayscale areas is maintained while the brightness of the low grayscale area is reduced. In this way, the dynamic range of the finally generated enhanced image can be genuinely improved.
- the brightness of at least one of the low grayscale area, the medium grayscale area, and the high grayscale area is adjusted according to the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image, including: determining the adjusted grayscale value of the pixel points in the low grayscale area according to the expandable dynamic range of each video frame image; determining the adjusted grayscale value of the pixel points in the medium grayscale area; determining the adjusted grayscale value of the pixel points in the high grayscale area; generating an enhanced image corresponding to each video frame image according to the adjusted grayscale values of each pixel point in the low grayscale area, the medium grayscale area, and the high grayscale area.
- the brightness information corresponding to each video frame image is used to determine the adjustable area in each video frame image, namely, the low grayscale area, the medium grayscale area, and the high grayscale area; then, according to the expandable dynamic range of each video frame image, the grayscale values of each pixel point after adjustment in the low grayscale area, the medium grayscale area, and the high grayscale area are determined, thereby generating an enhanced image corresponding to each video frame image. Since the grayscale values of each pixel point after adjustment in each area are determined according to the expandable dynamic range of each video frame image, the grayscale values of each pixel point after adjustment are adjusted to different degrees, thereby truly improving the dynamic range of the enhanced image finally generated.
- the adjusted grayscale value of the pixel points in the low grayscale area is determined according to the expandable dynamic range of each video frame image, including: determining the third coefficient corresponding to the low grayscale area according to the expandable dynamic range of each video frame image; performing tone mapping on the pixel points in the low grayscale area according to the third coefficient to obtain the adjusted grayscale value of the pixel points in the low grayscale area.
- the third coefficient may be determined according to the dynamic range of the original video and the scalable dynamic range of each video frame image.
- the grayscale value of each pixel in the low grayscale area after adjustment and the grayscale value before adjustment may satisfy the following relationship:
- y1 = (A/B) * k * x1, wherein x1 represents the grayscale value of the pixel in the low grayscale area before adjustment, y1 represents the grayscale value of the pixel in the low grayscale area after adjustment, A/B represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and k represents a constant.
- the grayscale value of each pixel in the low grayscale area after adjustment and the grayscale value before adjustment may satisfy the following relationship:
- y1 = M * (A/B) * k * x1 (M > 1), wherein x1 represents the grayscale value of the pixel in the low grayscale area before adjustment, y1 represents the grayscale value of the pixel in the low grayscale area after adjustment, M*(A/B) represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and M and k represent constants.
- the grayscale value of each pixel in the medium grayscale area after adjustment and the grayscale value before adjustment may satisfy the following relationship:
- y2 = k1 * x2 + b1
- x2 represents the grayscale value of the pixel in the medium grayscale area before adjustment
- y2 represents the grayscale value of the pixel in the medium grayscale area after adjustment
- k1 and b1 represent constants.
- the values of k1 and b1 can be set according to the proportion of the brightness grayscale distribution, or can be determined by a machine learning model.
- the grayscale value of each pixel in the high grayscale area after adjustment and the grayscale value before adjustment may satisfy the following conditions:
- y3 = k2 * x3 + b2, where x3 represents the grayscale value of the pixel in the high grayscale area before adjustment, y3 represents the grayscale value of the pixel in the high grayscale area after adjustment, and k2 and b2 represent constants.
- the values of k2 and b2 can be set according to the proportion of brightness grayscale distribution, or can be determined by a machine learning model.
- the grayscale values of each pixel after adjustment in the low grayscale area, the middle grayscale area, and the high grayscale area are determined according to the expandable dynamic range of each video frame image, and finally an enhanced image corresponding to each video frame image is generated. Since the grayscale values of each pixel after adjustment in each area are determined according to the expandable dynamic range of each video frame image, the grayscale values of each pixel after adjustment are adjusted to different degrees, so that the dynamic range of the enhanced image finally generated is truly improved.
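A worked sketch of the three-area relations above, with the area thresholds T1 and T2 and all constants chosen purely for illustration (the patent leaves them to user settings, grayscale ratios, or a machine learning model):

```python
import numpy as np

def piecewise_gray_map(gray, A=200.0, B=400.0, k=1.0,
                       k1=1.0, b1=0.0, k2=1.5, b2=10.0, T1=85, T2=170):
    """Apply the per-area relations from the text:
       low    (x < T1):        y1 = (A/B) * k * x1
       medium (T1 <= x < T2):  y2 = k1 * x2 + b1
       high   (x >= T2):       y3 = k2 * x3 + b2
    A is the dynamic range of the original video and B the expandable
    dynamic range of the frame; all values here are assumptions."""
    x = gray.astype(np.float32)
    low, high = x < T1, x >= T2
    mid = ~(low | high)
    y = np.empty_like(x)
    y[low] = (A / B) * k * x[low]     # low grayscale area reduced (A < B)
    y[mid] = k1 * x[mid] + b1         # medium grayscale area maintained
    y[high] = k2 * x[high] + b2       # high grayscale area raised
    return np.clip(y, 0, 255).astype(np.uint8)
```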
- the present application provides an apparatus, which is included in a display device, and the apparatus has the function of realizing the display device behavior in the above-mentioned first aspect and the possible implementations of the above-mentioned first aspect.
- the function can be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more modules or units corresponding to the above-mentioned functions. For example, a first acquisition module or unit, a determination module or unit, a second acquisition module or unit, a processing module or unit, etc.
- the present application provides a display device, comprising: a processor, a memory, and an interface; the processor, the memory, and the interface cooperate with each other so that the display device executes any one of the methods in the technical solution provided in the first aspect.
- the present application provides a chip, including a processor.
- the processor is used to read and execute a computer program stored in a memory to perform the method in the first aspect and any possible implementation thereof.
- the chip also includes a memory, and the memory is connected to the processor via a circuit or wire.
- the chip also includes a communication interface.
- the present application provides a computer-readable storage medium, in which a computer program is stored.
- when the computer program is executed by a processor, the processor executes any one of the methods in the technical solution of the first aspect.
- the present application provides a computer program product, the computer program product comprising: a computer program code, when the computer program code runs on a display device, the display device executes any one of the methods in the technical solution of the first aspect.
- FIG. 1 is a frame image of an SDR video played on a display device in the related art, according to an exemplary embodiment of the present application;
- FIG. 2 is a frame image of a processed SDR video played on a display device, according to an exemplary embodiment of the present application;
- FIG. 3 is a schematic diagram of an application scenario provided by an embodiment of the present application;
- FIG. 4 is a schematic diagram of an enhanced image in another application scenario provided by an embodiment of the present application;
- FIG. 5 is a schematic flow chart of a method for processing a video provided in an embodiment of the present application;
- FIG. 6 is a brightness information display diagram provided in an embodiment of the present application;
- FIG. 7 is another brightness information display diagram provided in an embodiment of the present application;
- FIG. 8 is a schematic diagram of a process for generating an enhanced image corresponding to a video frame image provided by an embodiment of the present application;
- FIG. 9 is another schematic diagram of a process for generating an enhanced image corresponding to a video frame image provided by an embodiment of the present application;
- FIG. 10 is a schematic diagram of region division provided in an embodiment of the present application;
- FIG. 11 is a schematic diagram of brightness area division provided in an embodiment of the present application;
- FIG. 12 is a schematic diagram of tone mapping provided in an embodiment of the present application;
- FIG. 13 is a schematic diagram of an implementation process shown in an exemplary embodiment of the present application;
- FIG. 14 is a schematic structural diagram of a display device provided in an embodiment of the present application;
- FIG. 15 is a schematic diagram of the structure of a display device provided in an embodiment of the present application;
- FIG. 16 is a schematic diagram of the structure of a chip provided in an embodiment of the present application.
- the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features.
- a feature defined as “first” or “second” may explicitly or implicitly include one or more of the features.
- plural means two or more.
- references to "one embodiment” or “some embodiments” etc. described in the specification of this application mean that one or more embodiments of the present application include specific features, structures or characteristics described in conjunction with the embodiment. Therefore, the statements “in one embodiment”, “in some embodiments”, “in some other embodiments”, “in some other embodiments”, etc. that appear in different places in this specification do not necessarily refer to the same embodiment, but mean “one or more but not all embodiments", unless otherwise specifically emphasized in other ways.
- the terms “including”, “comprising”, “having” and their variations all mean “including but not limited to”, unless otherwise specifically emphasized in other ways.
- dynamic range refers to the ratio of the brightest light intensity to the darkest light intensity in an image. Dynamic range is one of the most important dimensions for image quality evaluation.
- the dynamic range of natural scenes is relatively large, typically up to about 10^9.
- the dynamic range of the human eye is also very wide, generally believed to be at least 10^6.
- since 8-bit integer data is usually used to record pixel values, traditional images and video content can only distinguish 256 brightness levels; when displayed on an ordinary monitor, the brightness dynamic range is about 10^3.
- HDR is also called "High Dynamic Range Lighting Rendering".
- the significance of HDR is to make the display image closer to the real world observed by the human eye.
- a video is actually a sequence of static images.
- High-Dynamic Range Image (HDRI) can provide more dynamic range and image details.
- LDR Low-Dynamic Range
- the final HDR image is synthesized using the LDR image with the best details corresponding to each exposure time, which can better reflect the visual effects in the real environment.
- HDR video is composed of such HDRI frames, so HDR video is more vivid and textured than ordinary video.
- SDR is a traditional technology for processing brightness and color values in images and is a common color display method.
- in order to display the original scene within a limited brightness range, traditional videos (SDR videos) severely compress the brightness range of natural scenes, resulting in a significant lack of contrast in the video content.
- RGB Red, Green, Blue
- RGB domain refers to a color model related to the structure of the human visual system. According to the structure of the human eye, all colors are regarded as different combinations of red, green and blue.
- each pixel corresponds to a set of three primary color components, where the three primary color components are the red component R, the green component G, and the blue component B.
- YUV domain refers to a color encoding method.
- Y represents brightness (Luminance or Luma)
- U and V represent chrominance (Chrominance or Chroma).
- the above RGB color space focuses on the human eye's perception of color, while the YUV color space focuses on the visual sensitivity to brightness.
- the RGB color space and the YUV color space can be converted to each other.
- brightness (luminance) is the luminous flux emitted by a light source in a given direction, per unit area and per unit solid angle.
- the symbol of brightness is L and the unit is nit.
- each point on an LCD screen, that is, a pixel, is composed of three sub-pixels: red, green, and blue (RGB).
- the light source behind each sub-pixel can show different brightness levels, and the grayscale represents the different brightness levels from the darkest to the brightest. The more levels in between, the more delicate the picture effect can be presented.
- the screen can usually show 256 brightness levels, which we call 256 grayscales.
- Tone mapping is a computer graphics technique for approximating high dynamic range images on limited dynamic range media.
- the brightness histogram is a quantitative tool used to detect the brightness of an image.
- the brightness information of an image is usually quantized into levels from 0 to 255.
- the horizontal axis of the brightness histogram represents the brightness (brightness value) from 0 to 255, and the vertical axis represents the number of pixels of the corresponding brightness in the image.
- the leftmost of the horizontal axis is 0, and the rightmost is 255.
- 0 represents the darkest place
- 255 represents the brightest place.
- the value in the middle represents gray of different brightness. The larger the value, the brighter it is.
- High-Dynamic Range (HDR) video compared with Standard Dynamic Range (SDR) video, has clearer light and dark levels, richer image details, and can reproduce real scenes more realistically.
- HDR video can display video content with a higher dynamic range, thus providing users with a better video viewing experience.
- the brightness capability of a display device is usually expressed by the dynamic range of the display device. For example, the higher the dynamic range of a display device, the stronger the brightness capability of the display device.
- the dynamic range of the display device has exceeded the dynamic range of SDR video.
- Using the display device to display SDR video will result in the inability to fully utilize the brightness capability of the display device, resulting in a waste of display resources.
- SDR video is usually stored in a data width of 8 bits (binary digit, bit), and the dynamic range it can express is about 100 nits.
- the dynamic range that the display device can express can reach more than 1000nit. Therefore, using such a display device to display SDR video will result in the inability to fully utilize the brightness capability of the display device, resulting in a waste of display resources.
- an embodiment of the present application provides a method for processing video, which determines the actual expandable dynamic range of an SDR video by determining the brightness information of the SDR video and combining it with the brightness capability of the current display device, and performs tone mapping on the SDR video based on that expandable dynamic range, thereby genuinely improving the dynamic range of the SDR video.
- the display effect is improved.
- FIG. 1 shows a frame image of an SDR video played on a display device in the related art, according to an exemplary embodiment of the present application.
- the frame image has low brightness and low dynamic range, and cannot realistically reproduce the real scene, resulting in a poor display effect.
- the method for processing a video provided by an embodiment of the present application, for example, by determining the brightness information of the SDR video and combining it with the brightness capability of the current display device (such as a mobile phone), determines the actual scalable dynamic range of the SDR video. Based on the scalable dynamic range, tone mapping is performed on the SDR video, which can truly improve the dynamic range of the SDR video.
- the display device such as a mobile phone
- FIG. 2 shows a frame image of a processed SDR video played on a display device, according to an exemplary embodiment of the present application.
- the brightness and dynamic range of the frame image are significantly improved compared to the image in FIG. 1 , and the real scene can be realistically reproduced, thereby improving the display effect.
- Figure 3 is a schematic diagram of an application scenario provided by an embodiment of the present application.
- the method for processing a video provided by the present application can be used to improve the dynamic range of an SDR video.
- the display device is a mobile phone.
- the display device detects that the user clicks the icon of the video application on the application interface, the video application can be started and the video application interface can be displayed.
- the user can click on the video he likes in the video application interface and select the full-screen playback mode.
- the display device displays a graphical user interface (GUI) as shown in (a) of FIG. 3.
- GUI can be called a video playback interface.
- the user can use a touch object (such as a user's finger or a stylus, etc.) to tap the left side of the screen (only the left side is shown in this embodiment, in a possible implementation, the user can also tap the right side of the screen) and slide upward, and the display device responds to the user's touch operation and displays a brightness control as shown in (c) of Figure 3 on the video playback interface.
- the screen brightness of the display device is enhanced (not shown in (c) of Figure 3).
- the frame image displayed on the video playback interface is processed, such as determining the brightness information corresponding to the frame image, the dynamic range corresponding to the frame image, and determining the dynamic range of the current display device.
- the frame image is tone mapped to obtain an enhanced image corresponding to the frame image.
- Figure 4 is a schematic diagram of an enhanced image in another application scenario provided by an embodiment of the present application.
- the enhanced image displayed in the video playback interface shown in Figure 4 has a significantly improved dynamic range, can realistically reproduce the real scene, and improves the display effect.
- the display device is still used as a mobile phone for example.
- the display device detects that the user clicks the icon of the video application on the application interface
- the video application can be started and the video application interface can be displayed.
- the user can click on the video he likes in the video application interface and select the full-screen playback mode.
- the display device displays a graphical user interface (GUI) as shown in (a) of Figure 3.
- the GUI can be called a video playback interface.
- the video playback interface can include an HDR control.
- the display device detects that the user clicks the HDR control, the HDR mode is turned on and the video is played in the HDR mode.
- each frame of the video is processed by the method for processing the video provided in the embodiment of the present application.
- the brightness information corresponding to each frame of the image, the dynamic range corresponding to each frame of the image, and the dynamic range of the current display device are determined.
- according to the brightness information, the dynamic range corresponding to each frame image, and the dynamic range of the current display device, tone mapping is performed on each frame image to obtain an enhanced image corresponding to each frame image.
- Multiple frames of enhanced images constitute an enhanced video, which is a video with an enhanced dynamic range. Therefore, when the enhanced video is played on the display device (such as a mobile phone), the display effect is improved.
- enhanced videos have brighter and more vivid colors and clearer details. They can provide more dynamic range and image details, greatly improve the light and dark contrast of picture details, and better reflect the visual effects in the real environment.
- the display device is still used as a mobile phone for example. If the user turns on the automatic brightness adjustment function in the display device in advance, the display device will automatically adjust the brightness when playing the video.
- each frame image at the adjusted brightness is processed by the method for processing video provided in the embodiment of the present application. For example, the brightness information corresponding to each frame image at that brightness, the dynamic range corresponding to each frame image at that brightness, and the dynamic range of the current display device are determined. According to these, tone mapping is performed on each frame image to obtain a corresponding enhanced image.
- the multiple enhanced images at that brightness constitute an enhanced video at that brightness, which is a video with an improved dynamic range. Therefore, when this enhanced video is played on the display device (such as a mobile phone), the display effect is improved.
- FIG. 3 and FIG. 4 are examples of application scenarios and do not limit the application scenarios of the present application.
- the method for processing a video provided in the embodiment of the present application can be applied but not limited to the following scenarios:
- the method for processing a video provided in the embodiment of the present application can be used in application scenarios related to video, wherein the application scenarios related to video may include a display device playing a video online; or, the application scenarios related to video may include a display device performing video recording; or, the application scenarios related to video may include a display device performing video live broadcast, etc.
- This is only an exemplary description and is not limited to this.
- Figure 5 is a schematic flow chart of a method for processing a video provided in an embodiment of the present application. As shown in Figure 5, the method for processing a video includes the following S101-S104.
- the original video may include SDR video, video in a call, video in a conference application, video in a live broadcast application, video in a long and short video application, video in surveillance, video recorded by a system camera recording function, etc.
- the original video is obtained, and the original video is decoded to obtain a decoded video of the original video.
- the decoded video includes a plurality of video frame images.
- the decoded video may include two or more video frames. There is no limitation in the embodiments of the present application.
- the SDR video can be a video played online by a display device.
- the SDR video is decoded to obtain a decoded video of the SDR video.
- the SDR video can be decoded by a graphics processor (Graphics Processing Unit, GPU) in the display device to obtain a decoded video of the SDR video.
- the SDR video can be decoded by a video processing card in the display device to obtain a decoded video of the SDR video.
- the SDR video can be decoded by a video transcoding server to obtain a decoded video of the SDR video. This is only an exemplary description and is not limited to this.
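The patent leaves the decoder open (GPU, video processing card, or transcoding server). As one concrete stand-in, the sketch below uses OpenCV's software decoder to turn a video file into the list of video frame images the later steps operate on; the choice of OpenCV is an assumption for illustration only.

```python
import cv2  # OpenCV, used here only as an illustrative decoder

def decode_video(path):
    """Decode a video file into a list of frame images (uint8 BGR arrays)."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()   # returns (success flag, next frame)
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames
```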
- S102 Determine brightness information corresponding to each video frame image.
- the decoded video is processed, for example, the brightness information of each video frame image contained in the decoded video is determined. It is understandable that there are multiple ways to determine the brightness information corresponding to each video frame image, and each way is described below.
- the pixel format of the video frame image contained in the decoded video may be determined first, and the brightness information corresponding to each video frame image may be determined according to the pixel format.
- the pixel format may include a YUV format and an RGB format. It is understood that for the same decoded video, the pixel format of each video frame image contained in the decoded video is consistent.
- the pixel format of the video frame image is in YUV format, and Y represents the brightness of the video frame image
- the Y value corresponding to each video frame image can be directly obtained, and the Y value represents the brightness information corresponding to a video frame image.
- L represents the brightness information of the video frame image
- R represents the value of the red component of the video frame image
- G represents the value of the green component of the video frame image
- B represents the value of the blue component of the video frame image
- the pixel format of the video frame image can also be converted from YUV format to RGB format, and then the brightness information of the video frame image can be determined by the above formula (1).
- the pixel format of the video frame image is converted from YUV format to RGB format, which can be achieved by the following formula (2).
- R represents the value of the red component of the video frame image
- G represents the value of the green component of the video frame image
- B represents the value of the blue component of the video frame image
- Y represents the brightness of the video frame image
- U and V represent the chromaticity of the video frame image.
- the brightness information can be digitized into 0-255, that is, the brightness level of each pixel in the video frame image can be represented by 0-255.
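Formulas (1) and (2) are referenced but not reproduced in this excerpt. As a hedged stand-in, the sketch below uses the standard BT.601 weights, which is an assumption about their form rather than the patent's own definition:

```python
import numpy as np

def brightness_from_rgb(rgb):
    """Assumed form of formula (1): L = 0.299*R + 0.587*G + 0.114*B."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def yuv_to_rgb(y, u, v):
    """Assumed form of formula (2): BT.601 YUV -> RGB conversion for
    8-bit data, with the chroma planes U and V centered at 128."""
    u = u.astype(np.float32) - 128.0
    v = v.astype(np.float32) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```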
- FIG. 6 is a brightness information display diagram provided by an embodiment of the present application.
- FIG. 6 (a) shows any video frame image.
- (b) shows the brightness information corresponding to the video frame image.
- the brightness information corresponding to each video frame image can be determined using a brightness histogram.
- the brightness histogram corresponding to each video frame image can be determined by running a preset code, and each brightness histogram is statistically analyzed to calculate the histogram distribution of each video frame image.
- the histogram distribution of each video frame image is used to represent the brightness information corresponding to each video frame image.
- the brightness level of each pixel in the video frame image can be represented by the statistical frequency of each grayscale.
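The "preset code" for the histogram is not shown; a minimal NumPy equivalent that counts, for each of the 256 grayscale levels, how many pixels of the frame fall on that level could look like this:

```python
import numpy as np

def brightness_histogram(luma):
    """Return a length-256 array: index = brightness value (0..255),
    value = number of pixels of the frame at that brightness."""
    return np.bincount(luma.astype(np.uint8).ravel(), minlength=256)
```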
- Figure 7 is another brightness information display diagram provided in an embodiment of the present application.
- Figure 7 (a) shows a brightness information histogram in one form of expression
- Figure 7 (b) shows a brightness information histogram in another form of expression.
- the horizontal axes in Figure 7 (a) and Figure 7 (b) both represent brightness (i.e., brightness value)
- the vertical axes both represent the number of pixels in the video frame image corresponding to the brightness of the horizontal axis.
- the leftmost horizontal axis represents the darkest part of the video frame image
- the rightmost horizontal axis represents the brightest part of the video frame image.
- the values in the middle represent grays of different brightness, and the larger the value, the brighter it is.
- the display device detects that multiple consecutive video frame images are similar, the brightness information of one of the multiple consecutive video frame images is determined, and the brightness information of the video frame image is used as the brightness information of the remaining similar video frame images. For example, if the display device detects that four consecutive video frame images are similar (such as the similarity is greater than or equal to the preset similarity threshold), the display device determines the brightness information of the first video frame image among the four video frame images, and uses the brightness information of the first video frame image as the brightness information of the other three video frame images. This is only an exemplary description and is not limited to this.
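A sketch of this reuse strategy is below. The similarity measure (normalized histogram intersection) and its threshold are illustrative assumptions; the patent only requires similarity at or above a preset threshold:

```python
import numpy as np

def reuse_brightness_info(frames, similarity_threshold=0.95):
    """Compute brightness information once per run of similar frames
    and reuse it for the following similar frames."""
    infos, last_hist, last_info = [], None, None
    for frame in frames:
        luma = frame.astype(np.float32).mean(axis=2)     # stand-in brightness
        hist = np.bincount(luma.astype(np.uint8).ravel(), minlength=256)
        hist = hist / hist.sum()                          # normalized histogram
        if last_hist is not None and np.minimum(hist, last_hist).sum() >= similarity_threshold:
            infos.append(last_info)                       # reuse previous info
        else:
            last_info = luma                              # new scene: recompute
            infos.append(last_info)
        last_hist = hist
    return infos
```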
- the display device playing videos online is a process of displaying images frame by frame in chronological order.
- the display device determines the brightness information corresponding to each video frame image, it also determines the brightness information corresponding to each video frame image frame by frame in chronological order.
- S103 Acquire the brightness capability of the display device and the dynamic range of the original video, and calculate the scalable dynamic range of each video frame image according to the brightness capability of the display device and the dynamic range of the original video.
- the brightness capability of the display device is obtained.
- the brightness capability can be represented by the dynamic range of the display device. It should be understood that each dynamic range corresponds to a peak brightness, and the higher the peak brightness, the higher the dynamic range, that is, the stronger the brightness capability of the display device.
- the brightness capability of the display device can be adjusted, for example, the current brightness capability of the display device can be flexibly adjusted according to different user needs.
- the brightness capability of the display device may be the same or different in different scenarios. For example, when the display device plays a video online, the brightness capability of the display device may be the same or different when playing different scene pictures.
- obtaining the brightness capability of the display device may be to obtain the maximum brightness capability of the display device, that is, to obtain the maximum brightness capability that the display device can reach. For example, if the dynamic range that the display device can express can reach 1000nit, then the maximum brightness capability of the display device is 1000nit, or in other words, the brightness value of the dynamic range of the display device is 1000nit. This is only an exemplary description and is not limited to this.
- the display device can automatically adjust the brightness.
- the display device will automatically adjust the brightness when playing a video online.
- obtaining the brightness capability of the display device may be to obtain the current brightness capability of the display device in real time.
- the display device automatically adjusts the brightness to 800nit when playing a video online, and the current brightness capability of the display device obtained in real time is 800nit.
- the display device automatically adjusts the brightness to 400nit when playing a video online, and the current brightness capability of the display device obtained in real time is 400nit. This is only an exemplary description and is not limited to this.
- the user can adjust the brightness according to his or her needs. For example, when the display device plays a video online, in the current video playback interface, the user can tap the left side of the screen (in a possible implementation, the user can also tap the right side of the screen) and slide up with a touch object (such as a user's finger or a stylus, etc.), and the display device responds to the user's touch operation, and the screen brightness is enhanced. If the user taps the left side (or right side) of the screen with a touch object and slides down, the display device responds to the user's touch operation, and the screen brightness dims.
- a touch object such as a user's finger or a stylus, etc.
- obtaining the brightness capability of the display device may be to obtain the brightness capability of the display device after adjusting the brightness.
- the display device adjusts the brightness to 600nit in response to the user's touch operation, and the current brightness capability of the display device is obtained to be 600nit. This is only an exemplary description and is not limited to this.
- the dynamic range of the original video is obtained.
- obtaining the dynamic range of the original video means obtaining the dynamic range corresponding to the original video.
- the actual brightness statistical level of each video frame image may be considered. Therefore, obtaining the dynamic range of the original video may be to obtain the dynamic range of the original video in the scene corresponding to each video frame image.
- the dynamic range of the SDR video may be obtained, or the dynamic range of the SDR video in the scene corresponding to each video frame image may be obtained.
- the display device has the ability to automatically obtain the dynamic range of the original video in the scene corresponding to each video frame image, that is, the display device can automatically obtain the dynamic range of each video frame image in the scene corresponding to the image.
- the display device obtains that the dynamic range of the SDR video in the scene corresponding to the first video frame image is 200nit, or in other words, the display device obtains that the brightness value of the dynamic range in the scene corresponding to the first video frame image is 200nit.
- the first video frame image is used to represent any one of the multiple video frame images contained in the SDR video. This is only an exemplary description and is not limited to this.
- the scalable dynamic range of each video frame image is calculated using the brightness capability of the display device and the dynamic range of the original video. It should be understood that the acquired brightness capability of the display device and the dynamic range of the original video can both be represented by brightness values, and the difference between the two brightness values is calculated, and the difference is used as the scalable dynamic range of the video frame image.
- the current brightness capability of the display device is obtained to be 600 nit
- the brightness value of the dynamic range in the scene corresponding to the first video frame image is 200 nit
- the difference between 600 nit and 200 nit is calculated to be 400 nit
- the difference, i.e. 400 nit is used as the expandable dynamic range of the first video frame image.
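The worked example above reduces to a single subtraction of brightness values, as in this one-line sketch:

```python
def expandable_dynamic_range(display_peak_nits, video_scene_nits):
    """Expandable range = display brightness capability minus the original
    video's dynamic range in the frame's scene, both expressed in nit.
    E.g. expandable_dynamic_range(600, 200) -> 400."""
    return display_peak_nits - video_scene_nits
```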
- if the display device detects that multiple consecutive video frame images are similar, it calculates the scalable dynamic range of one of them, and uses that scalable dynamic range as the scalable dynamic range of the remaining similar video frame images. For example, if the display device detects that three consecutive video frame images are similar (such as the similarity being greater than or equal to a preset similarity threshold), the display device calculates the scalable dynamic range of the first of the three video frame images, and uses it as the scalable dynamic range of the other two video frame images. This is only an exemplary description and is not limited to this.
- the display device playing videos online is a process of displaying images frame by frame in chronological order.
- the display device calculates the scalable dynamic range of each video frame image, it also calculates the scalable dynamic range of each video frame image frame by frame in chronological order.
- the brightness information corresponding to each video frame image is used to determine the extended area of each video frame image.
- the extended area includes different areas in different implementations, and the extended area is used for tone mapping, which will be described in detail in the following embodiments.
- the extended area of each video frame image is tone mapped.
- the enhanced image corresponding to each video frame image is generated according to the mapping result.
- compared with the corresponding video frame images, the enhanced images have brighter and more vivid colors and clearer details. They can provide more dynamic range and image detail, greatly improve the light-dark contrast of picture details, and better reflect the visual effects of the real environment.
- the method for processing video determines the brightness information corresponding to each video frame image in the decoded video of the original video; calculates the scalable dynamic range of each video frame image through the brightness capability of the display device and the dynamic range of the original video; and then uses the brightness information corresponding to each video frame image and the scalable dynamic range of each video frame image to perform tone mapping on each video frame image to obtain an enhanced image corresponding to each video frame image.
- when determining the scalable dynamic range of each video frame image, the brightness capability of the display device and the dynamic range of the original video are fully considered, and the scalable dynamic range is then used to perform tone mapping on each video frame image, so that the dynamic range of the video frame image is genuinely improved; that is, the enhanced image is the image after the dynamic range is improved. The dynamic range of the video composed of the multiple enhanced images is therefore also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect and the user experience are improved.
- the method for processing a video provided in the embodiment of the present application may further include S105 on the basis of including S101 to S104, as follows:
- S105. Generate an enhanced video according to the multiple enhanced images.
- the enhanced video is an extended dynamic range video, that is, the enhanced video is a video with an extended dynamic range.
- the enhanced images corresponding to each video frame image are arranged in chronological order to obtain an enhanced video, which can be played on a display device.
- In this implementation, after one video frame image is processed, the enhanced image corresponding to the video frame image is displayed on the display device. Since the display device processes very quickly, for the user it is like watching a video smoothly, and the video watched is a video with an enhanced dynamic range, which improves the user experience.
- Fig. 8 is a schematic diagram of a process of generating an enhanced image corresponding to a video frame image provided by an embodiment of the present application.
- the above S104 may include S1041-S1043.
- S1041. Determine a standard dynamic area and an extended dynamic area corresponding to each video frame image by using brightness information corresponding to each video frame image.
- the extended area in the above S104 in this implementation includes a standard dynamic area and an extended dynamic area.
- the extended dynamic area is an area divided in a video frame image, and the extended dynamic area includes a number of pixel points in the video frame image.
- the pixel values of the pixel points in the extended dynamic area do not need to be adjusted, that is, the pixel values of the pixel points in the extended dynamic area maintain the original pixel values.
- the standard dynamic area is another area divided in the video frame image.
- the standard dynamic area also includes several pixels in the video frame image. Unlike the pixels in the extended dynamic area, the pixel values of the pixels in the standard dynamic area need to be adjusted.
- the pixels of each video frame image can be divided by a preset threshold and the brightness information of each video frame image.
- the pixels of each video frame image are divided into pixels whose brightness information is less than the preset threshold, and pixels whose brightness information is greater than or equal to the preset threshold.
- the area composed of pixels whose brightness information is less than the preset threshold is the standard dynamic area, and the area composed of pixels whose brightness information is greater than or equal to the preset threshold is the extended dynamic area.
- the preset threshold value may be set by a user, calculated by grayscale ratio, or determined by a machine learning model. This is only an exemplary description and is not limited to this.
- a preset threshold is used to divide two areas, the standard dynamic area and the extended dynamic area.
- a more refined division can be achieved through multiple preset threshold ranges, that is, the pixels of each video frame image are divided through multiple preset threshold ranges and the brightness information of each video frame image. For example, for the pixels of each video frame image, the pixels whose brightness information belongs to the first preset threshold range are divided into the pixels of the standard dynamic area, and the pixels whose brightness information belongs to the second preset threshold range are divided into the pixels of the extended dynamic area.
- the first preset threshold range and the second preset threshold range are different. It should be understood that the brightness information of the pixels belonging to the first preset threshold range is less than the brightness information of the pixels belonging to the second preset threshold range.
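The single-threshold division can be expressed compactly with boolean masks. The following is a minimal sketch, assuming an 8-bit per-pixel luma plane and an arbitrary example threshold of 200; as stated above, the actual threshold may be user-set, computed from the grayscale ratio, or produced by a machine learning model.

```python
import numpy as np

def split_dynamic_areas(luma: np.ndarray, threshold: int = 200):
    """Divide a frame's pixels into the standard dynamic area (brightness below
    the preset threshold) and the extended dynamic area (brightness at or above
    the threshold), returned as boolean masks over the luma plane."""
    extended_mask = luma >= threshold
    standard_mask = ~extended_mask
    return standard_mask, extended_mask
```

The two masks can then be used to apply different coefficients to the two areas, as S1042 and S1043 below describe.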
- S1042. Determine a first coefficient corresponding to the standard dynamic area and a second coefficient corresponding to the extended dynamic area.
- the first coefficient is smaller than the second coefficient.
- In one implementation, the first coefficient corresponding to the standard dynamic area is determined as follows: the dynamic range of the original video is recorded as A, the scalable dynamic range of each video frame image is recorded as B, the quotient A/B is calculated, and the quotient is used as the first coefficient corresponding to the standard dynamic area. It should be understood that the value of A is less than the value of B; therefore, the first coefficient corresponding to the standard dynamic area is less than 1.
- the dynamic range of the original video may include the dynamic range of the SDR video in the scene corresponding to each video frame image.
- the second coefficient corresponding to the extended dynamic area may be set to 1.
- the second coefficient corresponding to the extended dynamic area can be set to 1, so that the pixel value of the pixel point in the extended dynamic area can maintain the original pixel value.
- the first coefficient corresponding to the standard dynamic area is less than 1, so that the pixel value of the pixel point in the standard dynamic area can be reduced.
- In another implementation, the pixel values of the pixels in the standard dynamic area do not need to be adjusted, so the first coefficient corresponding to the standard dynamic area can be set to 1, while the pixel values of the pixels in the extended dynamic area are to be increased, so the second coefficient corresponding to the extended dynamic area can be set to a value greater than 1.
- In this case, the dynamic range of the original video is recorded as A, and the expandable dynamic range of each video frame image is recorded as B. The quotient B/A is calculated and used as the second coefficient corresponding to the extended dynamic area. It should be understood that the value of A is smaller than the value of B; therefore, the second coefficient corresponding to the extended dynamic area is greater than 1.
- the second coefficient corresponding to the extended dynamic area is greater than 1, so that the pixel value of the pixel point in the extended dynamic area can be improved.
- the first coefficient corresponding to the standard dynamic area is set to 1, so that the original pixel value of the pixel point in the standard dynamic area can be maintained.
- S1043. Tone mapping is performed on the pixels in the standard dynamic area according to the first coefficient, and tone mapping is performed on the pixels in the extended dynamic area according to the second coefficient, so as to generate an enhanced image corresponding to each video frame image.
- For each pixel point in the standard dynamic area, the product between the original pixel value of the pixel point and the first coefficient is calculated, and the product is used as the new pixel value of the pixel point. For each pixel point in the extended dynamic area, the product between the original pixel value of the pixel point and the second coefficient is calculated, and the product is used as the new pixel value of the pixel point. If the second coefficient corresponding to the extended dynamic area is 1, the original pixel values of the pixel points in the extended dynamic area do not need to be processed and can simply be retained.
- an enhanced image corresponding to the video frame image is generated based on the pixel points after the pixel values are adjusted. It should be understood that, in this embodiment, if the pixel values of all the pixel points in the video frame image are adjusted, the enhanced image corresponding to the video frame image is generated based on all the adjusted pixel points; if the pixel values of only some of the pixel points are adjusted, the enhanced image is generated jointly based on the adjusted pixel points and the pixel points whose pixel values were not adjusted. This is only an exemplary description and is not limited to this.
- tone mapping is performed on each video frame image, which means adjusting the pixel values of pixels in different areas (such as the standard dynamic area and the extended dynamic area) according to different coefficients (such as the first coefficient and the second coefficient).
- the brightness information corresponding to each video frame image is used to determine the adjustable area in each video frame image, namely the standard dynamic area and the extended dynamic area; different coefficients are determined for the standard dynamic area and the extended dynamic area, and then the pixel values of the pixels in different areas (such as the standard dynamic area and the extended dynamic area) are adjusted according to different coefficients (such as the first coefficient and the second coefficient), and the pixel values of the pixels in the standard dynamic area are reduced while maintaining the pixel values of the pixels in the extended dynamic area, so that the dynamic range of the enhanced image finally generated is truly improved. Then, the dynamic range of the video composed of multiple enhanced images is also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect is improved and the user experience is improved.
- Alternatively, if the second coefficient corresponding to the extended dynamic area is greater than 1 and the first coefficient corresponding to the standard dynamic area is 1, then for each pixel in the extended dynamic area, the product between the original pixel value of the pixel and the second coefficient is calculated and used as the new pixel value of the pixel. Since the first coefficient corresponding to the standard dynamic area is 1 in this case, the original pixel values of the pixel points in the standard dynamic area do not need to be processed and can simply be retained.
- the brightness information corresponding to each video frame image is used to determine the adjustable area in each video frame image, namely the standard dynamic area and the extended dynamic area; different coefficients are determined for the standard dynamic area and the extended dynamic area, and then the pixel values of the pixels in different areas (such as the standard dynamic area and the extended dynamic area) are adjusted according to different coefficients (such as the first coefficient and the second coefficient), and the pixel values of the pixels in the extended dynamic area are increased while maintaining the pixel values of the pixels in the standard dynamic area, so that the dynamic range of the enhanced image generated is truly improved. Then, the dynamic range of the video composed of multiple enhanced images is also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect is improved and the user experience is improved.
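A minimal sketch of this second scheme (first coefficient equal to 1, second coefficient B/A) is given below. The 8-bit clipping at the end is a simplification for illustration; a real pipeline would write the boosted values into a higher-bit-depth or HDR output buffer.

```python
import numpy as np

def tone_map_two_areas(frame: np.ndarray, luma: np.ndarray,
                       a_nit: float, b_nit: float,
                       threshold: int = 200) -> np.ndarray:
    """Second scheme: the standard dynamic area keeps its original pixel values
    (first coefficient = 1) and the extended dynamic area is scaled by the
    second coefficient B/A, which is greater than 1 because B > A.
    frame: H x W x C pixel array; luma: H x W brightness plane."""
    extended = luma >= threshold
    out = frame.astype(np.float32)
    out[extended] *= b_nit / a_nit   # second coefficient B/A applied per pixel
    # Clipping back to 8 bits is a simplification for illustration only.
    return np.clip(out, 0, 255).astype(frame.dtype)
```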
- In the above implementations, the pixel values of pixels in different areas are adjusted according to different coefficients (such as the first coefficient and the second coefficient).
- the coefficient corresponding to each pixel can also be calculated one by one according to the actual brightness level of each pixel in the video frame image.
- the pixel value of the pixel is adjusted according to the coefficient corresponding to the pixel.
- the product between the original pixel value of the pixel and the coefficient is calculated, and the product is used as the new pixel value of the pixel.
- an enhanced image corresponding to the video frame image is generated according to all the pixels after the pixel values are adjusted.
- each pixel is adjusted individually in a targeted manner, which can improve the dynamic range of the finally generated enhanced image while also improving its quality.
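The application does not spell out how the per-pixel coefficient is derived from the pixel's actual brightness level; the sketch below assumes, purely for illustration, a linear interpolation between a gain of 1 for the darkest pixels and B/A for the brightest.

```python
import numpy as np

def tone_map_per_pixel(frame: np.ndarray, luma: np.ndarray,
                       a_nit: float, b_nit: float) -> np.ndarray:
    """Assumed per-pixel scheme: interpolate each pixel's gain between 1 (for
    the darkest pixels) and B/A (for the brightest) based on its brightness.
    frame: H x W x C pixel array; luma: H x W brightness plane."""
    weight = luma.astype(np.float32) / 255.0          # 0 = darkest, 1 = brightest
    gain = 1.0 + weight * (b_nit / a_nit - 1.0)       # per-pixel coefficient
    out = frame.astype(np.float32) * gain[..., None]  # broadcast over channels
    return np.clip(out, 0, 255).astype(frame.dtype)
```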
- FIG. 9 is another schematic diagram of a process of generating an enhanced image corresponding to a video frame image provided by an embodiment of the present application.
- the above S104 may include S1044-S1046. It is worth noting that S1044-S1046 are parallel to the above S1041-S1043; either S1041-S1043 or S1044-S1046 may be selected for execution according to actual conditions, and S1044-S1046 are not executed after S1041-S1043.
- S1044. Determine a first region, a second region, and a third region corresponding to each video frame image by using brightness information corresponding to each video frame image.
- the extended area in the above S104 in this implementation manner includes a first area, a second area, and a third area.
- the first region may represent a low grayscale region
- the second region may represent a medium grayscale region
- the third region may represent a high grayscale region
- At least one of the first area, the second area, and the third area is an area that needs to be re-tone mapped.
- For example, the current brightness of the first area and the second area can be maintained, and the third area is used as an area that needs to be re-tone mapped.
- That is, the current brightness of the low grayscale area and the medium grayscale area is maintained, and the brightness of the high grayscale area is increased.
- For another example, the first area, the second area, and the third area are all used as areas that need to be re-tone mapped.
- That is, the brightness of the low grayscale area, the medium grayscale area, and the high grayscale area is all increased. It should be understood that the magnitude of the brightness increase in different areas may be the same or different.
- For still another example, the current brightness of the second area and the third area can be maintained, and the brightness of the first area can be reduced.
- That is, the current brightness of the medium grayscale area and the high grayscale area is maintained, and the brightness of the low grayscale area is reduced.
- For example, the brightness histogram of each video frame image can be divided by T1 and T2. Referring to FIG. 10, FIG. 10 is a schematic diagram of region division provided in an embodiment of the present application. As shown in FIG. 10, the brightness distribution represented by the brightness histogram is divided into 3 regions: the first region [0, T1), the second region [T1, T2], and the third region (T2, 255].
- the values of T1 and T2 can be set by the user or determined by a machine learning model.
- the machine learning model learns the grayscale distribution of pixels in the brightness histogram of the video frame image, and determines two numerical values based on the distribution, which are the values of T1 and T2.
- the brightness histogram of the video frame image can be input into the machine learning model, and the machine learning model analyzes and processes the brightness histogram and outputs the values of T1 and T2. This is only an exemplary description and is not limited to this.
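A minimal sketch of the three-way division, assuming T1 and T2 have already been chosen (by the user or by a model, as described above):

```python
import numpy as np

def split_grayscale_regions(luma: np.ndarray, t1: int, t2: int):
    """Divide pixels into the low [0, T1), medium [T1, T2], and high (T2, 255]
    grayscale regions, returned as boolean masks over the luma plane."""
    low = luma < t1
    medium = (luma >= t1) & (luma <= t2)
    high = luma > t2
    return low, medium, high
```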
- Figure 11 is a schematic diagram of brightness area division provided in an embodiment of the present application.
- As shown in Figure 11, the video frame image is divided by brightness into the light gray area including the unevenly distributed light gray points, the black area, and the dark gray area, where the dark gray area is divided into the third area, i.e., the high grayscale area.
- the colors shown in Figure 11 are only used to distinguish the brightness areas and do not represent the true colors of the video frame image.
- a finer division can be achieved through multiple values (such as values similar to T1 and T2), that is, the brightness distribution represented by the brightness histogram is divided into multiple areas through multiple values.
- S1045. Determine a first adjustment strategy corresponding to the first area, a second adjustment strategy corresponding to the second area, and a third adjustment strategy corresponding to the third area.
- Figure 12 is a tone mapping schematic diagram provided in an embodiment of the present application.
- the horizontal axis in Figure 12 represents the value after normalization of 255 grayscale, and the vertical axis represents the grayscale value after tone mapping.
- The first area, the second area, and the third area correspond to different straight lines in Figure 12: the first area corresponds to the straight line GH, the second area corresponds to the straight line HI, and the third area corresponds to the straight line IJ.
- the first adjustment strategy can be used to determine the adjusted grayscale value of the pixel points in the low grayscale area. For example, according to the expandable dynamic range of each video frame image, the third coefficient corresponding to the low grayscale area is determined; according to the third coefficient, the pixel points in the low grayscale area are tone mapped to obtain the adjusted grayscale value of the pixel points in the low grayscale area.
- For example, the adjusted grayscale value of each pixel point in the low grayscale area may satisfy the relationship y1 = (A/B)*k*x1 with the grayscale value before adjustment; that is, the slope of the straight line GH is adjusted. Here, x1 represents the grayscale value of a pixel in the low grayscale area before adjustment, y1 represents the grayscale value of that pixel after adjustment, A/B represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and k represents a constant.
- Alternatively, the adjusted grayscale value may satisfy the relationship y1 = M*(A/B)*k*x1 (M > 1), where x1 represents the grayscale value of a pixel in the low grayscale area before adjustment, y1 represents the grayscale value of that pixel after adjustment, M*(A/B) represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and M and k represent constants.
- the adjusted grayscale values of the pixels in the medium grayscale area may be determined by the second adjustment strategy.
- the straight line HI corresponding to the second area may be adjusted to a straight line HK, that is, the original linear tone mapping corresponding to the second area is expanded.
- Specifically, the adjusted grayscale value of each pixel point in the medium grayscale area may satisfy y2 = k1*x2 + b1, where x2 represents the grayscale value of a pixel point in the medium grayscale area before adjustment, y2 represents the grayscale value of that pixel point after adjustment, and k1 and b1 represent constants.
- According to the third adjustment strategy, the adjusted grayscale values of the pixels in the high grayscale area can be determined.
- the straight line IJ corresponding to the third region may be adjusted to a straight line KL, that is, the original linear tone mapping corresponding to the third region is expanded.
- Specifically, the adjusted grayscale value of each pixel point in the high grayscale area may satisfy y3 = k2*x3 + b2, where x3 represents the grayscale value of a pixel point in the high grayscale area before adjustment, y3 represents the grayscale value of that pixel point after adjustment, and k2 and b2 represent constants.
- the values of k1, b1, k2 and b2 can be set according to the ratio of the brightness grayscale distribution, or can be determined by a machine learning model.
- the machine learning model learns the grayscale distribution of pixels in the brightness histogram of the video frame image in different scenes, and predicts the values of k1, b1, k2 and b2 based on the distribution. This is only an exemplary description and is not limited to this.
- After the functions corresponding to the adjusted first area, second area, and third area are determined in S1045, an x value is input and a y value is output correspondingly. The y value represents the grayscale value after tone mapping, that is, the adjusted brightness value is obtained.
- the brightness of the video frame image is adjusted according to the adjusted brightness value to obtain an enhanced image corresponding to the video frame image.
- the original grayscale value of the pixel is x, and the grayscale value corresponding to the pixel after adjustment is determined to be y through these functions.
- the brightness of the pixel in the video frame image is adjusted according to the y value to obtain an enhanced image corresponding to the video frame image.
- a similar method is adopted to make corresponding adjustments to each pixel point in the first region and the second region to obtain an enhanced image corresponding to the video frame image.
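Putting the three per-region lines together, a sketch of the piecewise mapping might look as follows. The constants k, k1, b1, k2, and b2 below are placeholder values, since the application says they may be set from the grayscale distribution or predicted by a machine learning model.

```python
import numpy as np

def piecewise_tone_map(luma: np.ndarray, t1: int, t2: int,
                       a_nit: float, b_nit: float, k: float = 1.0,
                       k1: float = 1.2, b1: float = 0.0,
                       k2: float = 1.5, b2: float = 0.0) -> np.ndarray:
    """Apply the three per-region lines to the grayscale values:
       low region:    y1 = (A/B) * k * x1
       medium region: y2 = k1 * x2 + b1
       high region:   y3 = k2 * x3 + b2"""
    x = luma.astype(np.float32)
    y = np.empty_like(x)
    low, medium, high = x < t1, (x >= t1) & (x <= t2), x > t2
    y[low] = (a_nit / b_nit) * k * x[low]   # third coefficient A/B, less than 1
    y[medium] = k1 * x[medium] + b1
    y[high] = k2 * x[high] + b2
    return np.clip(y, 0, 255)
```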
- the brightness information corresponding to each video frame image is used to determine the adjustable area in each video frame image, namely, the low grayscale area, the medium grayscale area, and the high grayscale area; different adjustment strategies are determined for the low grayscale area, the medium grayscale area, and the high grayscale area, and then the grayscale values of the pixels in different areas are adjusted according to different adjustment strategies, so that the dynamic range of the enhanced image generated in the end is truly improved. Then, the dynamic range of the video composed of multiple enhanced images is also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect is improved and the user experience is improved.
- FIG. 13 is a schematic diagram of an implementation process shown in an exemplary embodiment of the present application.
- the SDR video is decoded to obtain a decoded video of the SDR video, which may include two or more video frame images.
- FIG. 13 shows a video frame image in the decoded video of the SDR video.
- the brightness information corresponding to the video frame image is shown in the form of a brightness histogram in Figure 13.
- the brightness area is divided using the brightness information corresponding to the video frame image to obtain a low grayscale area, a medium grayscale area, and a high grayscale area.
- tone mapping is performed on the low grayscale area, medium grayscale area, and high grayscale area corresponding to the video frame image to obtain an enhanced image corresponding to the video frame image.
- the dynamic range of the video frame image is truly improved, that is, the enhanced image is the image after the dynamic range is improved. It can be seen that the dynamic range of the video composed of multiple enhanced images (i.e., the extended dynamic range video shown in Figure 13) is also improved, so that when the video with improved dynamic range is displayed on a display device, the display effect is improved and the user experience is improved.
- the method for processing video provided in the embodiment of the present application can be applicable to various display devices.
- the display device provided in the embodiment of the present application can be a display device in various forms.
- the display device can be various camera devices such as SLR cameras and compact cameras, mobile phones, tablet computers, wearable devices, televisions, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, laptop computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), etc., or can be other devices or apparatuses capable of performing image processing.
- FIG14 shows a schematic diagram of the structure of a display device provided in an embodiment of the present application.
- the display device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
- the structure shown in FIG14 does not constitute a specific limitation on the display device 100.
- the display device 100 may include more or fewer components than those shown in FIG14, or the display device 100 may include a combination of some of the components shown in FIG14, or the display device 100 may include sub-components of some of the components shown in FIG14.
- the components shown in FIG14 may be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated into one or more processors.
- the controller may be the nerve center and command center of the display device 100.
- the controller may generate an operation control signal according to the instruction operation code and the timing signal to complete the control of fetching and executing instructions.
- the processor 110 may also be provided with a memory for storing instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory may store instructions or data that the processor 110 has just used or cyclically used. If the processor 110 needs to use the instruction or data again, it may be directly called from the memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
- the processor 110 can run the software code of the method for processing video provided in the embodiment of the present application, thereby effectively improving the dynamic range of the original video.
- modules of the display device 100 may also adopt a combination of multiple connection modes in the above embodiments.
- the wireless communication function of the display device 100 can be implemented through components such as the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
- Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in display device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve the utilization of the antennas.
- antenna 1 can be reused as a diversity antenna for a wireless local area network.
- the antenna can be used in combination with a tuning switch.
- the display device 100 can realize the display function through a GPU, a display screen 194, and an application processor.
- the GPU is a microprocessor for image processing, which is connected to the display screen 194 and the application processor.
- the GPU can be used to decode the original video (such as an SDR video) to obtain a decoded video of the original video (such as an SDR video).
- the GPU can also be used to perform mathematical and geometric calculations for graphics rendering, etc.
- the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the steps of obtaining a decoded video of the original video; determining the brightness information corresponding to each video frame image; obtaining the brightness capability of the display device and the dynamic range of the original video, and calculating the scalable dynamic range of each video frame image based on the brightness capability of the display device and the dynamic range of the original video; and performing tone mapping on each video frame image using the brightness information corresponding to each video frame image and the scalable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image can be executed in the processor 110.
- the display screen 194 can be used to display images or videos.
- the display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a mini-LED, a micro-LED, a micro-OLED, quantum dot light-emitting diodes (QLED), etc.
- the display device 100 may include 1 or N display screens 194, where N may be a positive integer greater than 1.
- the display screen 194 can be used to display the original video, multiple video frame images included in the decoded video, enhanced images, enhanced videos composed of enhanced images (ie, extended dynamic range videos), etc.
- the display screen 194 in the embodiment of the present application may be a touch screen.
- a touch sensor 180K may be integrated in the display screen 194.
- the touch sensor 180K may also be referred to as a "touch panel". That is, the display screen 194 may include a display panel and a touch panel, and the touch sensor 180K and the display screen 194 form a touch screen, also known as a "touchscreen".
- the touch sensor 180K is used to detect touch operations acting on or near it, such as a user tapping the left side of the screen of the display device 100 with a touch object (such as a user's finger or a stylus, etc.) and sliding upward.
- the touch operation detected by the touch sensor 180K can be passed to the upper layer by the driver of the kernel layer (such as a TP driver) to determine the type of touch event.
- Visual output related to the touch operation can be provided by the display screen 194.
- the touch sensor 180K may also be arranged on the surface of the display device 100, which is different from the position of the display screen 194.
- the display device 100 can realize shooting and recording functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
- the ISP is used to process the data fed back by the camera 193. For example, when shooting a video, light is transmitted to the camera photosensitive element through the lens, and the light signal is converted into an electrical signal. The camera photosensitive element transmits the electrical signal to the ISP for processing and converts it into an image visible to the naked eye.
- the ISP can perform algorithmic optimization on the noise, brightness and color of the image. The ISP can also optimize the parameters such as the exposure and color temperature of the recording scene. In some embodiments, the ISP can be set in the camera 193.
- the camera 193 is used to capture images or videos. It can be triggered to start by application instructions to realize shooting and recording functions, such as in scenes such as live video, video conferencing, video calls, video surveillance, etc., you can record and obtain videos.
- the camera may include components such as imaging lenses, filters, and image sensors. The light emitted or reflected by the object enters the imaging lens, passes through the filter, and finally converges on the image sensor.
- the imaging lens is mainly used to converge the light emitted or reflected by all objects in the recording angle of view; the filter is mainly used to filter out redundant light waves in the light (for example, light waves other than visible light, such as infrared); and the image sensor is mainly used to perform photoelectric conversion on the received light signal, convert it into an electrical signal, and input it into the processor 110 for subsequent processing.
- the camera 193 can be located in front of the display device 100, or on the back of the display device 100. The specific number and arrangement of the cameras can be set according to needs, and this application does not impose any restrictions.
- the camera 193 can obtain recorded video.
- the digital signal processor is used to process digital signals, and can process not only digital image signals but also other digital signals. For example, when the display device 100 is selecting a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy.
- the video codec is used to compress or decompress digital video.
- the display device 100 may support one or more video codecs.
- the video codec may be used to decode the original video (such as SDR video) to obtain a decoded video of the original video (such as SDR video).
- the gyro sensor 180B can be used to determine the motion posture of the display device 100.
- In some embodiments, the angular velocity of the display device 100 around three axes (i.e., the x-axis, the y-axis, and the z-axis) can be determined by the gyro sensor 180B.
- the gyro sensor 180B can be used for anti-shake in shooting and recording. For example, when shooting or recording, the gyro sensor 180B detects the angle of the shaking of the display device 100, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to offset the shaking of the display device 100 through reverse movement to achieve anti-shake.
- the gyro sensor 180B can also be used in scenes such as navigation and somatosensory games.
- the acceleration sensor 180E can detect the magnitude of acceleration of the display device 100 in various directions (generally x-axis, y-axis and z-axis). When the display device 100 is stationary, the magnitude and direction of gravity can be detected. The acceleration sensor 180E can also be used to identify the posture of the display device 100 as an input parameter for applications such as horizontal and vertical screen switching and pedometers.
- the distance sensor 180F is used to measure the distance.
- the display device 100 can measure the distance by infrared or laser. In some embodiments, for example, in a shooting or recording scene, the display device 100 can use the distance sensor 180F to measure the distance to achieve fast focusing.
- the ambient light sensor 180L is used to sense the ambient light brightness.
- the display device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the display device 100 is in a pocket to prevent accidental touch.
- the fingerprint sensor 180H is used to collect fingerprints.
- the display device 100 can use the collected fingerprint characteristics to implement functions such as unlocking, accessing application locks, taking photos, and answering calls.
- the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
- the pressure sensor 180A can be set on the display screen 194.
- For example, a capacitive pressure sensor may include at least two parallel plates having conductive materials.
- the display device 100 determines the intensity of the pressure based on the change in capacitance.
- the display device 100 detects the intensity of the touch operation based on the pressure sensor 180A.
- the display device 100 can also calculate the position of the touch based on the detection signal of the pressure sensor 180A.
- touch operations acting on the same touch position but with different touch operation intensities can correspond to different operation instructions.
- the button 190 includes a power button, a volume button, etc.
- the button 190 can be a mechanical button or a touch button.
- the display device 100 can receive a button input and generate a key signal input related to the user settings and function control of the display device 100.
- When the display device 100 plays a video online, if the left side of the screen is used to control the screen brightness, then the right side of the screen is used to control the volume; if the right side of the screen is used to control the screen brightness, then the left side of the screen is used to control the volume.
- the user taps the left side of the screen of the display device 100 with a touch object (such as a user's finger or a stylus, etc.) and slides up or down, and the display device 100 responds to the touch operation to increase or dim the screen brightness.
- the user taps the right side of the screen of the display device 100 with a touch object and slides up or down, and the display device 100 responds to the touch operation to increase or decrease the sound of the display device 100.
- Motor 191 can generate vibration prompts. Motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback. Indicator 192 can be an indicator light, which can be used to indicate charging status, power changes, messages, missed calls, notifications, etc.
- SIM card interface 195 is used to connect a SIM card. The SIM card can be connected to and separated from the display device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195.
- the display device 100 can support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
- the SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
- the methods in the above embodiments can all be implemented in the display device 100 having the above hardware structure.
- Fig. 15 is a schematic diagram of the structure of a display device provided in an embodiment of the present application.
- the display device 200 includes a first acquisition module 210 , a determination module 220 , a second acquisition module 230 and a processing module 240 .
- the display device 200 can perform the following schemes:
- a first acquisition module 210 used to acquire a decoded video of an original video
- a determination module 220 configured to determine brightness information corresponding to each video frame image
- a second acquisition module 230 is used to acquire the brightness capability of the display device and the dynamic range of the original video, and calculate the scalable dynamic range of each video frame image according to the brightness capability of the display device and the dynamic range of the original video;
- the processing module 240 is used to perform tone mapping on each video frame image by using the brightness information corresponding to each video frame image and the scalable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image.
- the display device 200 is implemented in the form of a functional module.
- A "module" here can be implemented in the form of software and/or hardware, which is not specifically limited.
- a “module” may be a software program, a hardware circuit, or a combination of the two that implements the above functions.
- the hardware circuit may include an application specific integrated circuit (ASIC), an electronic circuit, a processor (such as a shared processor, a dedicated processor, or a group processor, etc.) and a memory for executing one or more software or firmware programs, a combined logic circuit, and/or other suitable components that support the described functions.
- modules of each example described in the embodiments of the present application can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Professional and technical personnel can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the present application.
- the embodiment of the present application also provides a computer-readable storage medium in which computer instructions are stored; when the computer instructions are run on a display device, the display device is caused to execute the method shown above.
- the computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions can be transmitted from a website, computer, server or data center to another website, computer, server or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means.
- the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that can be integrated with one or more available media.
- the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium, or a semiconductor medium (e.g., a solid state drive (SSD)), etc.
- An embodiment of the present application further provides a computer program product. The computer program product includes computer program code; when the computer program code runs on a display device, the display device is caused to execute the technical solution shown above.
- FIG16 is a schematic diagram of the structure of a chip provided in an embodiment of the present application.
- the chip shown in FIG16 can be a general-purpose processor or a dedicated processor.
- the chip includes a processor 301.
- the processor 301 is used to support the display device to execute the technical solution shown above.
- the chip further includes a transceiver 302, and the transceiver 302 is used to accept the control of the processor 301, and to support the display device to execute the technical solution shown above.
- the chip shown in FIG. 16 may further include: a storage medium 303 .
- the chip shown in Figure 16 can be implemented using the following circuits or devices: one or more field programmable gate arrays (FPGA), programmable logic devices (PLD), controllers, state machines, gate logic, discrete hardware components, any other suitable circuits, or any combination of circuits that can perform the various functions described throughout this application.
- the display device, display apparatus, computer storage medium, computer program product, and chip provided in the above-mentioned embodiments of the present application are all used to execute the method provided above. Therefore, the beneficial effects that can be achieved can refer to the corresponding beneficial effects of the method provided above, and will not be repeated here.
- It should be understood that "pre-setting" and "pre-definition" can be achieved by pre-saving, in a device (for example, including a display device), corresponding codes, tables, or other means that can be used to indicate relevant information, and the present application does not limit the specific implementation thereof.
Abstract
This application provides a method for processing video, a display device, and a storage medium, relating to the field of image processing technologies. The method includes: obtaining a decoded video of an original video; determining brightness information corresponding to each video frame image; obtaining the brightness capability of the display device and the dynamic range of the original video, and calculating the expandable dynamic range of each video frame image according to the brightness capability of the display device and the dynamic range of the original video; and performing tone mapping on each video frame image by using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image. When the expandable dynamic range of each video frame image is determined, the brightness capability of the display device and the dynamic range of the original video are fully considered, and the expandable dynamic range is then used to perform tone mapping on each video frame image, so that the dynamic range of the video frame images is truly improved.
Description
This application claims priority to Chinese patent application No. 202211585804.3, filed with the China National Intellectual Property Administration on December 09, 2022 and entitled "Method for Processing Video, Display Device, and Storage Medium", which is incorporated herein by reference in its entirety.
This application relates to the field of image processing technologies, and in particular, to a method for processing video, a display device, and a storage medium.
Compared with a standard dynamic range (Standard Dynamic Range, SDR) video, a high-dynamic range (High-Dynamic Range, HDR) video has clearer light-dark gradations and richer image details, can reproduce real scenes more realistically, and provides users with a better video viewing effect.
With the development of HDR technology, the brightness capability of display devices keeps increasing in order to better play HDR videos. The brightness capability of a display device is usually represented by the dynamic range of the display device: the higher the dynamic range of the display device, the stronger its brightness capability.
In the course of the development of video technology, a large number of SDR videos have been accumulated. However, as display devices with strong brightness capabilities become more and more common, these SDR videos do not display well on such devices.
SUMMARY
This application provides a method for processing video, a display device, and a storage medium. By determining the brightness information of an original video and combining it with the brightness capability of the current display device, the actually expandable dynamic range is determined, and tone mapping is performed based on the expandable dynamic range, thereby truly improving the dynamic range.
According to a first aspect, this application provides a method for processing video, applied to a display device, the method including:
obtaining a decoded video of an original video, the decoded video including multiple video frame images; determining brightness information corresponding to each video frame image; obtaining the brightness capability of the display device and the dynamic range of the original video, and calculating the expandable dynamic range of each video frame image according to the brightness capability of the display device and the dynamic range of the original video; and performing tone mapping on each video frame image by using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image, where the dynamic range of the enhanced image is greater than the dynamic range of the video frame image.
Optionally, the display device provided in the embodiments of this application may include a mobile phone, a tablet computer, a wearable device, a television, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), various camera apparatuses, etc., or may be other devices or apparatuses capable of performing image processing; the embodiments of this application do not impose any restriction on the specific type of the display device.
Optionally, the original video may include an SDR video. The SDR video may be a video played online by the display device.
In the method for processing video provided in the first aspect, when determining the expandable dynamic range of each video frame image, the brightness capability of the display device and the dynamic range of the original video are fully considered, and the expandable dynamic range is then used to perform tone mapping on each video frame image, so that the dynamic range of the video frame image is truly improved; that is, the enhanced image is an image whose dynamic range has been improved. It follows that the dynamic range of a video composed of multiple enhanced images is also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect and the user experience are improved.
In a possible implementation, the method for processing video provided in this application may further include generating an enhanced video according to the multiple enhanced images. The enhanced video is an extended dynamic range video, that is, a video whose dynamic range has been extended. The enhanced images corresponding to the video frame images are arranged in chronological order to obtain the enhanced video, which can be played on the display device. In this implementation, after one video frame image is processed, the enhanced image corresponding to the video frame image may be displayed on the display device; since the display device processes very quickly, for the user it is like watching a video smoothly, and the video watched is a video with an improved dynamic range, which improves the user experience.
Optionally, in a possible implementation, the brightness information may include a brightness value, and determining the brightness information corresponding to each video frame image includes: determining the pixel format of each video frame image; when the pixel format is the YUV format, obtaining the Y value of each video frame image; or, when the pixel format is the RGB format, calculating the brightness value of each video frame image by using a preset formula, where the Y value represents the brightness value.
Optionally, in another possible implementation, the brightness information may include a brightness histogram, and determining the brightness information corresponding to each video frame image includes: generating a brightness histogram of each video frame image.
In a possible implementation, performing tone mapping on each video frame image by using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image includes: determining a standard dynamic area and an extended dynamic area corresponding to each video frame image by using the brightness information corresponding to each video frame image; determining, according to the expandable dynamic range of each video frame image, a first coefficient corresponding to the standard dynamic area of each video frame image and a second coefficient corresponding to the extended dynamic area of each video frame image; and performing tone mapping on the pixels in the standard dynamic area according to the first coefficient and performing tone mapping on the pixels in the extended dynamic area according to the second coefficient to obtain the enhanced image corresponding to each video frame image.
Optionally, the standard dynamic area contains several pixels of the video frame image, and the extended dynamic area also contains several pixels of the video frame image. The brightness information of the pixels in the standard dynamic area is less than a preset threshold, and the brightness information of the pixels in the extended dynamic area is greater than or equal to the preset threshold.
Optionally, the preset threshold may be set by a user, calculated from a grayscale ratio, or determined by a machine learning model.
Optionally, the first coefficient is smaller than the second coefficient.
In this implementation, the brightness information corresponding to each video frame image is used to determine the adjustable areas in each video frame image, namely the standard dynamic area and the extended dynamic area; different coefficients are determined for the standard dynamic area and the extended dynamic area, and the pixel values of the pixels in the different areas (such as the standard dynamic area and the extended dynamic area) are then adjusted according to the different coefficients (such as the first coefficient and the second coefficient), so that the dynamic range of the finally generated enhanced image is truly improved. The dynamic range of a video composed of multiple enhanced images is then also improved, so that when the video with the improved dynamic range is displayed on the display device, the display effect and the user experience are improved.
In a possible implementation, performing tone mapping on the pixels in the standard dynamic area according to the first coefficient and performing tone mapping on the pixels in the extended dynamic area according to the second coefficient to obtain the enhanced image corresponding to each video frame image includes: calculating a first product of the original pixel value of each pixel in the standard dynamic area and the first coefficient, and updating the pixel value of each pixel in the standard dynamic area according to the first product; calculating a second product of the original pixel value of each pixel in the extended dynamic area and the second coefficient, and updating the pixel value of each pixel in the extended dynamic area according to the second product; and generating the enhanced image corresponding to each video frame image according to the updated pixels in the standard dynamic area and the updated pixels in the extended dynamic area.
In this implementation, the pixel values of the pixels in different areas (such as the standard dynamic area and the extended dynamic area) are adjusted separately according to different coefficients (such as the first coefficient and the second coefficient). When the first coefficient is smaller than the second coefficient and the second coefficient is 1, the pixel values of the pixels in the extended dynamic area are maintained while the pixel values of the pixels in the standard dynamic area are reduced, so that the dynamic range of the finally generated enhanced image is truly improved. Alternatively, when the first coefficient is smaller than the second coefficient and the first coefficient is 1, the pixel values of the pixels in the standard dynamic area are maintained while the pixel values of the pixels in the extended dynamic area are increased, so that the dynamic range of the finally generated enhanced image is truly improved.
In a possible implementation, performing tone mapping on each video frame image by using the brightness information corresponding to each video frame image and the expandable dynamic range of each video frame image to obtain an enhanced image corresponding to each video frame image includes: determining a low grayscale area, a medium grayscale area, and a high grayscale area corresponding to each video frame image by using the brightness information corresponding to each video frame image; and adjusting the brightness of at least one of the low grayscale area, the medium grayscale area, and the high grayscale area according to the expandable dynamic range of each video frame image to obtain the enhanced image corresponding to each video frame image.
For example, the current brightness of the low grayscale area and the medium grayscale area is maintained, and the brightness of the high grayscale area is increased. For another example, the current brightness of the medium grayscale area and the high grayscale area is maintained, and the brightness of the low grayscale area is reduced.
Optionally, if the brightness information corresponding to each video frame image is determined by a brightness histogram, the brightness histogram of each video frame image may be divided by T1 and T2 to obtain the low grayscale area, the medium grayscale area, and the high grayscale area, where the values of T1 and T2 may be set by a user, set according to the proportion of the brightness grayscale distribution, or determined by a machine learning model.
In this implementation, the brightness information corresponding to each video frame image is used to determine the adjustable areas in each video frame image, namely the low grayscale area, the medium grayscale area, and the high grayscale area; then, according to the expandable dynamic range of each video frame image, the brightness of at least one of these areas is adjusted. For example, the current brightness of the low grayscale area and the medium grayscale area is maintained and the brightness of the high grayscale area is increased; for another example, the current brightness of the medium grayscale area and the high grayscale area is maintained and the brightness of the low grayscale area is reduced. In this way, the dynamic range of the finally generated enhanced image can be truly improved.
In a possible implementation, adjusting the brightness of at least one of the low grayscale area, the medium grayscale area, and the high grayscale area according to the expandable dynamic range of each video frame image to obtain the enhanced image corresponding to each video frame image includes: determining, according to the expandable dynamic range of each video frame image, the adjusted grayscale values of the pixels in the low grayscale area; determining the adjusted grayscale values of the pixels in the medium grayscale area; determining the adjusted grayscale values of the pixels in the high grayscale area; and generating the enhanced image corresponding to each video frame image according to the adjusted grayscale values of the pixels in the low grayscale area, the medium grayscale area, and the high grayscale area.
In this implementation, the brightness information corresponding to each video frame image is used to determine the adjustable areas in each video frame image, namely the low, medium, and high grayscale areas; then, according to the expandable dynamic range of each video frame image, the adjusted grayscale values of the pixels in these areas are determined, and the enhanced image corresponding to each video frame image is generated. Since the adjusted grayscale values are determined according to the expandable dynamic range of each video frame image, the grayscale values of the pixels are adjusted to different degrees, so that the dynamic range of the finally generated enhanced image is truly improved.
In a possible implementation, determining, according to the expandable dynamic range of each video frame image, the adjusted grayscale values of the pixels in the low grayscale area includes: determining a third coefficient corresponding to the low grayscale area according to the expandable dynamic range of each video frame image; and performing tone mapping on the pixels in the low grayscale area according to the third coefficient to obtain the adjusted grayscale values of the pixels in the low grayscale area.
Optionally, the third coefficient may be determined according to the dynamic range of the original video and the expandable dynamic range of each video frame image.
Optionally, the adjusted grayscale value of each pixel in the low grayscale area and its grayscale value before adjustment may satisfy the following relationship:
y1 = (A/B)*k*x1, where x1 represents the grayscale value of a pixel in the low grayscale area before adjustment, y1 represents the grayscale value of that pixel after adjustment, A/B represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and k represents a constant.
Alternatively, the adjusted grayscale value of each pixel in the low grayscale area and its grayscale value before adjustment may satisfy the following relationship:
y1 = M*(A/B)*k*x1 (M > 1), where x1 represents the grayscale value of a pixel in the low grayscale area before adjustment, y1 represents the grayscale value of that pixel after adjustment, M*(A/B) represents the third coefficient, A represents the dynamic range of the original video, B represents the expandable dynamic range of each video frame image, and M and k represent constants.
Optionally, the adjusted grayscale value of each pixel in the medium grayscale area and its grayscale value before adjustment may satisfy the following relationship:
y2 = k1*x2 + b1, where x2 represents the grayscale value of a pixel in the medium grayscale area before adjustment, y2 represents the grayscale value of that pixel after adjustment, and k1 and b1 represent constants. The values of k1 and b1 may be set according to the proportion of the brightness grayscale distribution or determined by a machine learning model.
Optionally, the adjusted grayscale value of each pixel in the high grayscale area and its grayscale value before adjustment may satisfy the following relationship:
y3 = k2*x3 + b2, where x3 represents the grayscale value of a pixel in the high grayscale area before adjustment, y3 represents the grayscale value of that pixel after adjustment, and k2 and b2 represent constants. The values of k2 and b2 may be set according to the proportion of the brightness grayscale distribution or determined by a machine learning model.
In this implementation, the adjusted grayscale values of the pixels in the low, medium, and high grayscale areas are determined according to the expandable dynamic range of each video frame image, and the enhanced image corresponding to each video frame image is finally generated. Since the adjusted grayscale values are determined according to the expandable dynamic range of each video frame image, the grayscale values of the pixels are adjusted to different degrees, so that the dynamic range of the finally generated enhanced image is truly improved.
According to a second aspect, this application provides a display apparatus. The display apparatus is included in a display device and has the function of implementing the behavior of the display device in the first aspect and the possible implementations of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above function, for example, a first acquisition module or unit, a determination module or unit, a second acquisition module or unit, and a processing module or unit.
According to a third aspect, this application provides a display device, including a processor, a memory, and an interface. The processor, the memory, and the interface cooperate with each other, so that the display device performs any one of the methods in the technical solutions of the first aspect.
According to a fourth aspect, this application provides a chip, including a processor. The processor is configured to read and execute a computer program stored in a memory to perform the method in the first aspect and any possible implementation thereof.
Optionally, the chip further includes a memory, and the memory is connected to the processor through a circuit or a wire.
Optionally, the chip further includes a communication interface.
According to a fifth aspect, this application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program; when the computer program is executed by a processor, the processor is caused to perform any one of the methods in the technical solutions of the first aspect.
According to a sixth aspect, this application provides a computer program product. The computer program product includes computer program code; when the computer program code runs on a display device, the display device is caused to perform any one of the methods in the technical solutions of the first aspect.
FIG. 1 is a frame of image when an SDR video is played on a display device in the related art according to an exemplary embodiment of this application;
FIG. 2 is a frame of image when a processed SDR video is played on a display device according to an exemplary embodiment of this application;
FIG. 3 is a schematic diagram of an application scenario according to an embodiment of this application;
FIG. 4 is a schematic diagram of an enhanced image in another application scenario according to an embodiment of this application;
FIG. 5 is a schematic flowchart of a method for processing video according to an embodiment of this application;
FIG. 6 is a diagram showing brightness information according to an embodiment of this application;
FIG. 7 is another diagram showing brightness information according to an embodiment of this application;
FIG. 8 is a schematic flowchart of generating an enhanced image corresponding to a video frame image according to an embodiment of this application;
FIG. 9 is another schematic flowchart of generating an enhanced image corresponding to a video frame image according to an embodiment of this application;
FIG. 10 is a schematic diagram of region division according to an embodiment of this application;
FIG. 11 is a schematic diagram of brightness region division according to an embodiment of this application;
FIG. 12 is a schematic diagram of tone mapping according to an embodiment of this application;
FIG. 13 is a schematic diagram of an implementation process according to an exemplary embodiment of this application;
FIG. 14 is a schematic diagram of the structure of a display device according to an embodiment of this application;
FIG. 15 is a schematic diagram of the structure of a display apparatus according to an embodiment of this application;
FIG. 16 is a schematic diagram of the structure of a chip according to an embodiment of this application.
The technical solutions in this application are described below with reference to the accompanying drawings.
In the descriptions of the embodiments of this application, unless otherwise stated, "/" means "or"; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may represent three cases: only A exists, both A and B exist, and only B exists. In addition, in the descriptions of the embodiments of this application, "multiple" means two or more than two.
The terms "first" and "second" below are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include one or more of such features. In the descriptions of the embodiments, unless otherwise stated, "multiple" means two or more than two.
Reference to "an embodiment", "some embodiments", or the like in this specification means that a particular feature, structure, or characteristic described with reference to the embodiment is included in one or more embodiments of this application. Therefore, the statements "in an embodiment", "in some embodiments", "in some other embodiments", and "in still other embodiments" appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
First, some terms used in the embodiments of this application are explained to facilitate understanding by those skilled in the art.
1. Dynamic Range
In the field of image processing, the dynamic range refers to the ratio between the brightest and the darkest luminance in an image, and is one of the most important dimensions of image quality evaluation.
The dynamic range of natural scenes is quite large, usually reaching the order of 10 to the 9th power, and the dynamic range of the human eye is also wide, generally considered to be at least on the order of 10 to the 6th power. However, since image pixel values are usually recorded with 8-bit integer data, traditional images and video content can only distinguish 256 different brightness levels, and when displayed on an ordinary display, the brightness dynamic range is roughly on the order of 10 to the 3rd power.
Therefore, from the perspective of dynamic range, there is a large gap between traditional images and videos and the real scenes seen by the human eye, which limits the richness of video content. This is also an important reason why traditional images and videos never quite look like the real world.
2. High-Dynamic Range (HDR) Video
HDR is also called "high dynamic range rendering"; its purpose is to make the displayed picture closer to the real world observed by the human eye. A video is actually an image sequence composed of static pictures. Compared with an ordinary image, a high-dynamic range image (High-Dynamic Range Image, HDRI) can provide more dynamic range and image detail: based on low-dynamic range (Low-Dynamic Range, LDR) images with different exposure times, the LDR image with the best detail for each exposure time is used to synthesize the final HDR image, which can better reflect the visual effects of the real environment. An HDR video is composed of such HDRI frames, so compared with an ordinary video, an HDR video has a more vivid and textured picture.
3. Standard Dynamic Range (SDR)
SDR is the traditional technology for processing brightness and color values in images and a common color display method. Traditional videos are usually called SDR videos. In order to present the original scene within a limited brightness range, an SDR video compresses the brightness range of the natural scene rather severely, resulting in clearly insufficient contrast in the video content.
4. RGB (Red, Green, Blue) Color Space
Also called the RGB domain, it refers to a color model related to the structure of the human visual system. According to the structure of the human eye, all colors are treated as different combinations of red, green, and blue.
5. Pixel Value
A pixel value refers to a set of color components corresponding to each pixel in a color image in the RGB color space. For example, each pixel corresponds to a set of three primary color components, namely the red component R, the green component G, and the blue component B.
6. YUV Color Space
Also called the YUV domain, it refers to a color encoding method, where Y represents luminance (Luminance or Luma), and U and V represent chrominance (Chrominance or Chroma). The RGB color space focuses on the human eye's perception of color, while the YUV color space focuses on the sensitivity of vision to brightness; the RGB color space and the YUV color space can be converted into each other.
7. Brightness
Brightness (luminance) is the luminous flux emitted by a light source per unit area and per unit solid angle in a given direction; its symbol is L and its unit is the nit.
8. Grayscale
Generally speaking, each point on a liquid crystal screen, i.e., a pixel, is composed of red, green, and blue (RGB) sub-pixels. The light source behind each sub-pixel can show different brightness levels, and the grayscale represents the levels of different brightness from the darkest to the brightest; the more intermediate levels there are, the more delicate the picture that can be presented. For example, a screen that can show 256 brightness levels is said to have 256 grayscales.
9. Tone Mapping
Tone mapping is a computer graphics technique for approximately displaying high-dynamic range images on a medium with a limited dynamic range.
10. Brightness Histogram
A brightness histogram is a quantitative tool for examining the brightness of a picture. The brightness information of an image is usually divided into 0 to 255; the horizontal axis of the brightness histogram represents the brightness values from 0 to 255, and the vertical axis represents the number of pixels at each brightness in the image. The leftmost value of the horizontal axis is 0 and the rightmost is 255, where 0 represents the darkest areas and 255 the brightest, and the values in between represent grays of different brightness: the larger the value, the brighter.
The above is a brief introduction to the terms involved in the embodiments of this application, which will not be repeated below.
Compared with a standard dynamic range (SDR) video, a high-dynamic range (HDR) video has clearer light-dark gradations and richer image details and can reproduce real scenes more realistically; in other words, an HDR video can display video content with a higher dynamic range, and therefore provides users with a better viewing effect.
With the development of HDR technology, the brightness capability of display devices keeps increasing in order to better play HDR videos. The brightness capability of a display device is usually represented by its dynamic range: the higher the dynamic range of the display device, the stronger its brightness capability.
Display devices with strong brightness capabilities are becoming more and more common, but in the course of the development of video technology, a large number of SDR videos have been accumulated; that is, mainstream videos are still SDR videos (or, the dynamic range of current mainstream videos is still SDR). On display devices with strong brightness capabilities, these SDR videos can neither reproduce real scenes realistically nor display video content with a dynamic range higher than SDR, so they do not display well on such devices.
At the same time, for a display device with a strong brightness capability, the dynamic range of the display device already exceeds the dynamic range of an SDR video, so displaying an SDR video on such a device fails to make full use of its brightness capability and wastes display resources. For example, an SDR video is usually stored with a data width of 8 bits (binary digits) and can express a dynamic range of about 100 nits, whereas a display device with strong brightness capability can express a dynamic range of more than 1000 nits. Therefore, displaying an SDR video on such a display device cannot make full use of its brightness capability, resulting in a waste of display resources.
In the related art, the contrast of an SDR video is sometimes improved by increasing the color contrast of the video frames, but this still cannot improve the dynamic range of the SDR video, so the above problems remain. For this reason, an embodiment of this application provides a method for processing video: by determining the brightness information of the SDR video and combining it with the brightness capability of the current display device, the actually expandable dynamic range of the SDR video is determined, and tone mapping is performed on the SDR video based on the expandable dynamic range, thereby truly improving the dynamic range of the SDR video, so that the display effect is improved when the SDR video with the improved dynamic range is displayed on the display device.
At present, playing videos on display devices has become a daily behavior in people's lives. Taking a mobile phone as an example of the display device, the brightness capability of mobile phones is getting higher and higher, so they can play HDR videos better. However, mainstream videos are still SDR videos; on display devices with strong brightness capabilities, these SDR videos can neither reproduce real scenes realistically nor display video content with a dynamic range higher than SDR, so their display effect is poor.
Referring to FIG. 1, FIG. 1 shows a frame of image when an SDR video is played on a display device in the related art according to an exemplary embodiment of this application. As shown in FIG. 1, when the SDR video is played on the display device, the frame has low brightness and a low dynamic range and cannot reproduce the real scene realistically, so the display effect is poor.
With the method for processing video provided in the embodiments of this application, for example, by determining the brightness information of the SDR video and combining it with the brightness capability of the current display device (such as a mobile phone), the actually expandable dynamic range of the SDR video is determined, and tone mapping is performed on the SDR video based on the expandable dynamic range, which can truly improve the dynamic range of the SDR video, so that the display effect is improved when the SDR video with the improved dynamic range is displayed on the display device (such as a mobile phone).
Referring to FIG. 2, FIG. 2 shows a frame of image when a processed SDR video is played on a display device according to an exemplary embodiment of this application. As shown in FIG. 2, when the processed/enhanced SDR video is played on the display device, the brightness and dynamic range of the frame are significantly improved compared with the image in FIG. 1, the real scene can be reproduced realistically, and the display effect is improved.
The application scenarios of the embodiments of this application are briefly described first.
Referring to FIG. 3, FIG. 3 is a schematic diagram of an application scenario according to an embodiment of this application. The method for processing video provided in this application can be used to improve the dynamic range of an SDR video.
In one example, the display device is a mobile phone. When the display device detects that the user taps the icon of a video application on the application interface, it can start the video application and display the video application interface. The user can tap a video of interest in the video application interface and select the full-screen playback mode; the display device then displays the graphical user interface (graphical user interface, GUI) shown in (a) of FIG. 3, which may be called the video playback interface.
As shown in (b) of FIG. 3, in the video playback interface, the user can press the left side of the screen lightly with a touch object (such as a finger or a stylus; only the left side is shown in this embodiment, and in a possible implementation the user may also press the right side) and slide upward. In response to this touch operation, the display device displays the brightness control shown in (c) of FIG. 3 in the video playback interface, and at the same time the screen brightness of the display device increases (not shown in (c) of FIG. 3). Meanwhile, the frame displayed in the video playback interface is processed by the method for processing video provided in the embodiments of this application: for example, the brightness information corresponding to the frame, the dynamic range corresponding to the frame, and the dynamic range of the current display device are determined, and tone mapping is performed on the frame accordingly to obtain an enhanced image corresponding to the frame.
For ease of understanding, refer to FIG. 4, which is a schematic diagram of an enhanced image in another application scenario according to an embodiment of this application. Compared with the image displayed in the video playback interface in FIG. 3, the enhanced image displayed in the video playback interface shown in FIG. 4 has a significantly improved dynamic range, can reproduce the real scene realistically, and improves the display effect.
In another example, still taking a mobile phone as the display device: when the display device detects that the user taps the icon of a video application on the application interface, it can start the video application and display the video application interface. The user can tap a video of interest and select the full-screen playback mode, and the display device displays the GUI shown in (a) of FIG. 3, i.e., the video playback interface. The video playback interface may include an HDR control; when the display device detects that the user taps the HDR control, it turns on the HDR mode and plays the video in HDR mode.
In the process of playing the video in HDR mode, each frame of the video is processed by the method for processing video provided in the embodiments of this application: for example, the brightness information and dynamic range corresponding to each frame and the dynamic range of the current display device are determined, and tone mapping is performed on each frame accordingly to obtain an enhanced image corresponding to each frame. The multiple enhanced frames constitute an enhanced video, which is a video with an improved dynamic range; therefore, when the enhanced video is played on the display device (such as a mobile phone), the display effect is improved. For example, compared with an ordinary video, the enhanced video has more vivid colors and clearer details, can provide more dynamic range and image detail, greatly improves the light-dark contrast of picture details, and better reflects the visual effects of the real environment.
In yet another example, still taking a mobile phone as the display device: if the user has enabled the automatic brightness adjustment function in the display device in advance, the display device automatically adjusts the brightness when playing a video. Whenever the display device detects a brightness change, each frame at that brightness is processed by the method for processing video provided in the embodiments of this application: for example, the brightness information and dynamic range corresponding to each frame at that brightness and the dynamic range of the current display device are determined, and tone mapping is performed on each frame at that brightness accordingly to obtain an enhanced image corresponding to each frame at that brightness. The multiple enhanced frames at that brightness constitute an enhanced video at that brightness, which is a video with an improved dynamic range; therefore, when the enhanced video at that brightness is played on the display device (such as a mobile phone), the display effect is improved.
It should be understood that the scenarios shown in FIG. 3 and FIG. 4 are examples of application scenarios and do not limit the application scenarios of this application in any way. The method for processing video provided in the embodiments of this application can be applied to, but is not limited to, the following scenarios: video calls, video conferencing applications, long- and short-video applications, live video streaming applications, online video course applications, intelligent camera-movement scenarios, video recording with the system camera, photo shooting with the system camera, video surveillance, smart door viewers, and the like.
The method for processing video provided in the embodiments of this application is described in detail below with reference to the accompanying drawings.
The method for processing video provided in the embodiments of this application can be used in video-related application scenarios, where the video-related application scenarios may include a display device playing videos online; or a display device recording video; or a display device conducting a live video stream, and so on. This is only an exemplary description and is not limited hereto.
By way of example, the method for processing video provided in the embodiments of this application is described with the video-related application scenario being a display device playing videos online.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a method for processing video according to an embodiment of this application. As shown in FIG. 5, the method includes the following S101-S104.
S101. Obtain a decoded video of an original video.
Exemplarily, the original video may include an SDR video, a video in a call, a video in a conferencing application, a video in a live streaming application, a video in a long- or short-video application, a video in surveillance, a video recorded by the system camera, and so on.
The original video is obtained and decoded to obtain the decoded video of the original video. The decoded video includes multiple video frame images; the decoded video may include two or more video frame images, and the embodiments of this application place no restriction on the number of video frame images.
The embodiments of this application are described by taking the original video including an SDR video as an example. Exemplarily, the SDR video may be a video played online by the display device. The SDR video is decoded to obtain a decoded video of the SDR video. For example, the SDR video may be decoded by a graphics processing unit (Graphics Processing Unit, GPU) in the display device to obtain the decoded video of the SDR video; for another example, the SDR video may be decoded by a video processing card in the display device; for still another example, the SDR video may be decoded by a video transcoding server. This is only an exemplary description and is not limited hereto.
S102: Determine the luminance information corresponding to each video frame image.
Exemplarily, the decoded video is processed, for example by determining the luminance information of each video frame image it contains. It can be understood that there are multiple ways to determine the luminance information corresponding to each video frame image; each is described below.
In one possible implementation, the pixel format of the video frame images in the decoded video may first be determined, and the luminance information of each frame determined according to that pixel format. The pixel format may include the YUV format and the RGB format. It can be understood that, for the same decoded video, all the video frame images it contains share the same pixel format.
If the pixel format of a video frame image is YUV, where Y denotes the luminance of the frame, then the Y value of each video frame image can be obtained directly, and that Y value represents the luminance information of the frame.
If the pixel format of a video frame image is RGB, the luminance information of the frame can be determined using formula (1) below:
L = 0.299*R + 0.587*G + 0.114*B,  (1)
In formula (1) above, L denotes the luminance information of the video frame image, R the value of its red component, G the value of its green component, and B the value of its blue component.
If the pixel format of a video frame image is YUV, then to make it easier for the display device to determine the frame's luminance information, the pixel format may first be converted from YUV to RGB and the luminance information then determined by formula (1). The conversion from YUV to RGB can be performed by formula (2) below:
R=Y+1.4075*V
G=Y-0.3455*U-0.7169*V
B=Y+1.779*U, (2)
In formula (2) above, R, G, and B denote the values of the red, green, and blue components of the video frame image, Y denotes its luminance, and U and V denote its chrominance.
It should be understood that in this implementation the luminance information can be digitized to 0-255; that is, the luminance level of each pixel in a video frame image can be represented by a value from 0 to 255.
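The following is a minimal sketch of how formulas (1) and (2) might be applied per pixel with NumPy; the array layout (H x W planes, values in 0-255) is an assumption for illustration, and U and V are taken as already centered on zero, exactly as formula (2) is written.

```python
import numpy as np

def yuv_to_rgb(y: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Formula (2): U and V are assumed zero-centered chroma planes."""
    r = y + 1.4075 * v
    g = y - 0.3455 * u - 0.7169 * v
    b = y + 1.779 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255)

def luminance(rgb: np.ndarray) -> np.ndarray:
    """Formula (1): per-pixel luminance L in 0-255 from an HxWx3 RGB image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b
```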
To visualize the luminance information corresponding to a video frame image, refer to FIG. 6, a luminance-information display diagram provided by an embodiment of the present application. (a) of FIG. 6 shows an arbitrary video frame image, and (b) of FIG. 6 shows the corresponding luminance information of that frame.
In another possible implementation, a luminance histogram may be used to determine the luminance information of each video frame image. For example, preset code may be run to compute the luminance histogram of each frame, and each histogram is then tallied to compute the frame's histogram distribution. The histogram distribution of each video frame image represents the luminance information of that frame.
It should be understood that in this implementation the luminance level of the pixels in a video frame image can be represented by the statistical frequency of each gray level.
Referring to FIG. 7, FIG. 7 is another luminance-information display diagram provided by an embodiment of the present application. (a) of FIG. 7 shows a luminance histogram in one form of presentation, and (b) of FIG. 7 shows a luminance histogram in another form. In both (a) and (b) of FIG. 7 the horizontal axis represents luminance (i.e., luminance values) and the vertical axis represents the number of pixels in the video frame image at the luminance given on the horizontal axis. The far left of the horizontal axis represents the darkest part of the frame, the far right the brightest, and the values in between represent grays of different brightness, brighter as the value increases.
It is worth noting that, to improve efficiency, if the display device detects that several consecutive video frame images are similar, it determines the luminance information of one of them and uses it as the luminance information of the remaining similar frames. For example, if the display device detects that 4 consecutive video frame images are similar (e.g., their similarity is greater than or equal to a preset similarity threshold), it determines the luminance information of the 1st of the 4 frames and uses it as the luminance information of the other 3. This is merely an illustrative description and is not limiting.
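A minimal sketch of the histogram approach and the similar-frame shortcut might look as follows; the 256-bin layout and the mean-absolute-difference similarity test are assumptions for illustration, since the patent leaves both the histogram code and the similarity metric unspecified.

```python
import numpy as np

def luma_histogram(luma: np.ndarray) -> np.ndarray:
    """Histogram distribution over the 256 gray levels of a luminance plane."""
    hist, _ = np.histogram(luma, bins=256, range=(0, 256))
    return hist

def frames_similar(a: np.ndarray, b: np.ndarray, threshold: float = 2.0) -> bool:
    """Assumed similarity test: mean absolute luminance difference below a threshold."""
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32)))) < threshold

def histograms_for(frames_luma):
    """Reuse the previous histogram when consecutive frames are judged similar."""
    hists, prev_luma, prev_hist = [], None, None
    for luma in frames_luma:
        if prev_luma is not None and frames_similar(luma, prev_luma):
            hists.append(prev_hist)          # reuse the earlier frame's result
        else:
            prev_hist = luma_histogram(luma) # recompute for a dissimilar frame
            hists.append(prev_hist)
        prev_luma = luma
    return hists
```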
It should be understood that playing video online on a display device is a process of displaying images frame by frame in time order; when determining the luminance information of each video frame image, the display device likewise determines it frame by frame in time order.
S103: Obtain the luminance capability of the display device and the dynamic range of the original video, and compute the extensible dynamic range of each video frame image from the luminance capability of the display device and the dynamic range of the original video.
Exemplarily, the luminance capability of the display device is obtained. The luminance capability can be expressed by the display device's dynamic range. It should be understood that every dynamic range corresponds to a peak luminance; the higher the peak luminance, the higher the dynamic range, i.e., the stronger the display device's luminance capability.
It should be understood that the luminance capability of the display device is adjustable, e.g., the device's current luminance capability can be flexibly adjusted according to the user's varying needs. In other words, in different scenarios the display device's luminance capability may be the same or may differ. For example, while the display device plays video online, its luminance capability may be the same or may differ as different scene pictures are played.
Optionally, in one possible implementation, obtaining the display device's luminance capability may mean obtaining its maximum luminance capability, i.e., the maximum the device can reach. For example, if the dynamic range the display device can express reaches 1000 nits, its maximum luminance capability of 1000 nits is obtained; in other words, the luminance value of the device's dynamic range is obtained as 1000 nits. This is merely an illustrative description and is not limiting.
Optionally, in one possible implementation, if the user has enabled automatic brightness adjustment on the display device in advance, the device can adjust brightness automatically, e.g., while playing video online. In this implementation, obtaining the display device's luminance capability may mean obtaining its current luminance capability in real time. For example, in one instance the device automatically adjusts brightness to 800 nits while playing video online, so its current luminance capability of 800 nits is obtained in real time; in another instance it adjusts brightness to 400 nits, so 400 nits is obtained in real time. This is merely an illustrative description and is not limiting.
Optionally, in one possible implementation, the user can adjust brightness as needed. For example, while the display device plays video online, on the current video playback interface the user can lightly press the left side of the screen with a touch object (such as a finger or a stylus; in one possible implementation the right side may also be pressed) and slide upward, and the display device brightens the screen in response. If the user presses the left (or right) side and slides downward, the device dims the screen in response. In this implementation, obtaining the display device's luminance capability may mean obtaining it after the brightness adjustment. For example, if the device adjusts brightness to 600 nits in response to the user's touch operation, its current luminance capability of 600 nits is obtained. This is merely an illustrative description and is not limiting.
Exemplarily, the dynamic range of the original video is obtained. In one example, this simply means obtaining the dynamic range corresponding to the original video. In another example, to make the extensible dynamic range more accurate and thus improve the final display quality, the actual luminance statistics of each video frame image can be considered; obtaining the original video's dynamic range may then mean obtaining its dynamic range in the scene corresponding to each video frame image. In the embodiments of the present application, the dynamic range of the SDR video may be obtained, or the SDR video's dynamic range in the scene corresponding to each video frame image may be obtained.
It should be understood that the display device provided by the embodiments of the present application can automatically obtain the original video's dynamic range in the scene corresponding to each video frame image. In one possible implementation, the display device obtains the SDR video's dynamic range in the scene corresponding to a first video frame image as 200 nits; in other words, the luminance value of the dynamic range in that scene is obtained as 200 nits. The first video frame image denotes any one of the multiple video frame images contained in the SDR video. This is merely an illustrative description and is not limiting.
Exemplarily, the extensible dynamic range of each video frame image is computed from the display device's luminance capability and the original video's dynamic range. It should be understood that both obtained quantities can be expressed as luminance values; the difference between the two luminance values is computed and used as the frame's extensible dynamic range.
For example, the display device's current luminance capability is obtained as 600 nits and the luminance value of the dynamic range in the scene of the first video frame image as 200 nits; the difference between 600 nits and 200 nits, i.e., 400 nits, is taken as the first frame's extensible dynamic range.
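In code this headroom computation is a single subtraction; the sketch below merely makes the quantities explicit (the nit values mirror the example above).

```python
def extensible_range_nits(display_nits: float, frame_scene_nits: float) -> float:
    """Extensible dynamic range: display capability minus the frame's scene range."""
    return display_nits - frame_scene_nits

print(extensible_range_nits(600.0, 200.0))  # 400.0, as in the example above
```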
It is worth noting that, to improve efficiency, if the display device detects that several consecutive video frame images are similar, it computes the extensible dynamic range of one of them and uses it as the extensible dynamic range of the remaining similar frames. For example, if the display device detects that 3 consecutive video frame images are similar (e.g., their similarity is greater than or equal to a preset similarity threshold), it computes the extensible dynamic range of the 1st of the 3 frames and uses it as the extensible dynamic range of the other 2. This is merely an illustrative description and is not limiting.
It should be understood that playing video online on a display device is a process of displaying images frame by frame in time order; when computing the extensible dynamic range of each video frame image, the display device likewise computes it frame by frame in time order.
S104: Tone-map each video frame image using its corresponding luminance information and its extensible dynamic range, to obtain the enhanced image corresponding to each video frame image.
Exemplarily, the luminance information of each video frame image is used to determine the frame's extension area. The extension area comprises different regions in different implementations and is used for tone mapping, as described in detail in later embodiments. Each frame's extension area is tone-mapped according to the frame's extensible dynamic range, and the enhanced image of each frame is generated from the mapping result.
Compared with its corresponding video frame image, the enhanced image has more vivid colors and clearer detail, offers more dynamic range and image detail, greatly raises the light-dark contrast of picture details, and better reflects the visual effect of the real environment.
In the method for processing video provided by the embodiments of the present application, the luminance information of each video frame image in the decoded video of the original video is determined; each frame's extensible dynamic range is computed from the display device's luminance capability and the original video's dynamic range; and each frame is then tone-mapped using its luminance information and extensible dynamic range to obtain its enhanced image. When determining each frame's extensible dynamic range, both the display device's luminance capability and the original video's dynamic range are fully considered, and tone mapping with that extensible dynamic range genuinely extends the frame's dynamic range, i.e., the enhanced image is an image whose dynamic range has been extended. It follows that the dynamic range of the video composed of the enhanced images is also extended, so displaying the extended-dynamic-range video on the display device improves the display quality and the user experience.
Optionally, in one possible implementation, the method for processing video provided by the embodiments of the present application may, on the basis of S101-S104, further include S105, as follows:
S105: Generate an enhanced video from the plurality of enhanced images.
The enhanced video is an extended-dynamic-range video, i.e., a video whose dynamic range has been extended. The enhanced images corresponding to the video frame images are arranged in time order to obtain the enhanced video, which can be played on the display device. In this implementation, each video frame image's enhanced image may be displayed as soon as it is processed; because the display device processes quickly, the user simply watches the video smoothly, and the video being watched has an extended dynamic range, improving the user experience.
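A minimal sketch of S105 using OpenCV's VideoWriter, assuming 8-bit BGR frames already in time order; the output file name, codec, and frame rate are illustrative placeholders.

```python
import cv2

def write_enhanced_video(enhanced_frames, path: str = "enhanced.mp4", fps: float = 30.0):
    """S105: arrange the enhanced frames in time order into an enhanced video file."""
    h, w = enhanced_frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in enhanced_frames:  # frames are appended in playback order
        writer.write(frame)
    writer.release()
```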
One flow for generating the enhanced image corresponding to a video frame image is described in detail below.
Referring to FIG. 8, FIG. 8 is a schematic flowchart, provided by an embodiment of the present application, of one flow for generating the enhanced image corresponding to a video frame image. As shown in FIG. 8, the above S104 may include S1041-S1043.
S1041: Using the luminance information corresponding to each video frame image, determine the standard dynamic region and the extended dynamic region corresponding to each video frame image.
Exemplarily, in this implementation the extension area in S104 above includes a standard dynamic region and an extended dynamic region.
The extended dynamic region is one region delimited within a video frame image; it contains a number of the frame's pixels, and the pixel values of pixels in this region need not be adjusted, i.e., pixels in the extended dynamic region keep their original pixel values.
The standard dynamic region is another region delimited within the video frame image; it likewise contains a number of the frame's pixels, but unlike the pixels in the extended dynamic region, the pixel values of pixels in the standard dynamic region need to be adjusted.
Exemplarily, if each frame's luminance information was determined by first determining the pixel format of the video frame images in the decoded video and then deriving the luminance information from that format, then in this implementation the luminance information can be digitized to 0-255, i.e., each pixel's luminance level can be represented by 0-255. The pixels of each frame can therefore be partitioned using a preset threshold together with the frame's luminance information: for example, into pixels whose luminance information is below the preset threshold and pixels whose luminance information is greater than or equal to it. It should be understood that the region formed by the pixels below the preset threshold is the standard dynamic region, and the region formed by the pixels at or above it is the extended dynamic region.
The preset threshold may be set by the user, computed from the gray-level proportions, or determined by a machine learning model. This is merely an illustrative description and is not limiting.
In this implementation, a single preset threshold delimits the two regions, the standard dynamic region and the extended dynamic region. To make the final dynamic-range extension better, in one possible implementation a finer partition can be achieved with multiple preset threshold ranges, i.e., each frame's pixels are partitioned using multiple preset threshold ranges together with the frame's luminance information. For example, for each frame's pixels, pixels whose luminance information falls within a first preset threshold range are assigned to the standard dynamic region, and pixels whose luminance information falls within a second preset threshold range are assigned to the extended dynamic region; the two ranges differ. It should be understood that the luminance information of pixels in the first preset threshold range is lower than that of pixels in the second preset threshold range.
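A minimal sketch of the single-threshold partition in S1041 might look as follows; the threshold value 200 is an arbitrary illustration, since the patent allows the threshold to be user-set, derived from gray-level proportions, or learned.

```python
import numpy as np

def split_regions(luma: np.ndarray, threshold: int = 200):
    """S1041: boolean masks for the standard and extended dynamic regions."""
    extended = luma >= threshold  # bright pixels: values kept as-is
    standard = ~extended          # darker pixels: values to be adjusted
    return standard, extended
```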
S1042: Determine a first coefficient corresponding to the standard dynamic region and a second coefficient corresponding to the extended dynamic region.
Exemplarily, the first coefficient is smaller than the second coefficient.
The first coefficient for the standard dynamic region is determined from the original video's dynamic range and each frame's extensible dynamic range. Exemplarily, denote the original video's dynamic range as A and each frame's extensible dynamic range as B; the quotient of A and B is computed and used as the first coefficient for the standard dynamic region. Usually the value of A is smaller than that of B, so the first coefficient for the standard dynamic region is less than 1.
It is worth noting that, in the embodiments of the present application, the original video's dynamic range may include the SDR video's dynamic range in the scene corresponding to each video frame image.
Exemplarily, since the pixel values of pixels in the extended dynamic region need not be adjusted, the second coefficient for the extended dynamic region may be set to 1.
It should be understood that when the display device's overall brightness is raised, the brightness of the whole video frame image rises too, but the frame's contrast or dynamic range is not thereby extended. Only by retaining the bright regions of the frame and lowering the brightness of the other regions can the frame's dynamic range be extended. Hence the second coefficient for the extended dynamic region may be set to 1, keeping the pixels in that region at their original pixel values, while the first coefficient for the standard dynamic region is less than 1, reducing the pixel values in that region.
Optionally, in one possible implementation, the pixel values of the pixels in the standard dynamic region may be left unadjusted, so the first coefficient for the standard dynamic region may be set to 1, while the pixel values of the pixels in the extended dynamic region are increased, so the second coefficient for the extended dynamic region may be set to a value greater than 1. For example, denote the original video's dynamic range as A and each frame's extensible dynamic range as B; the quotient of B and A is computed and used as the second coefficient for the extended dynamic region. Usually A is smaller than B, so the second coefficient for the extended dynamic region is greater than 1.
It should be understood that when the display device's overall brightness is raised, the brightness of the whole frame rises too, without extending the frame's contrast or dynamic range. Retaining the brightness of the other regions while raising the brightness of the frame's bright regions can extend the frame's dynamic range. Hence the second coefficient for the extended dynamic region is greater than 1, raising the pixel values in that region, and the first coefficient for the standard dynamic region is set to 1, keeping the pixels in that region at their original pixel values.
S1043: Generate the enhanced image corresponding to each video frame image from the frame's standard dynamic region, first coefficient, extended dynamic region, and second coefficient.
The pixels in the standard dynamic region are tone-mapped according to the first coefficient, and the pixels in the extended dynamic region according to the second coefficient, generating the enhanced image corresponding to each frame.
Exemplarily, for each pixel in the standard dynamic region, the product of the pixel's original pixel value and the first coefficient is computed and used as the pixel's new pixel value.
Optionally, in one possible implementation, for each pixel in the extended dynamic region, the product of the pixel's original pixel value and the second coefficient is computed and used as the pixel's new pixel value.
Optionally, in one possible implementation, since the second coefficient for the extended dynamic region is 1, the original pixel values of the pixels in that region need not be processed; it suffices to keep their original pixel values.
For each video frame image, its enhanced image is generated from the pixels whose values were adjusted. It should be understood that, in this implementation, if the values of all pixels in the frame were adjusted, the enhanced image is generated from all the adjusted pixels; if only some pixel values were adjusted, the enhanced image is generated jointly from the adjusted pixels and the unadjusted pixels. This is merely an illustrative description and is not limiting.
It should be understood that, in this implementation, tone-mapping each video frame image means adjusting the pixel values of the pixels in the different regions (e.g., the standard dynamic region and the extended dynamic region) according to the different coefficients (e.g., the first and second coefficients).
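Putting S1041-S1043 together, a minimal sketch of the coefficient-based tone mapping might look as follows; it follows the variant in which the first coefficient is A/B and the second is 1, and the float/clip handling for 8-bit frames is an assumption.

```python
import numpy as np

def tone_map_regions(frame: np.ndarray, luma: np.ndarray,
                     range_a: float, range_b: float,
                     threshold: int = 200) -> np.ndarray:
    """S1041-S1043: scale the standard dynamic region by A/B, keep the extended region."""
    standard = luma < threshold                   # S1041: region partition
    first_coeff = range_a / range_b               # S1042: first coefficient (< 1)
    out = frame.astype(np.float32)
    out[standard] *= first_coeff                  # S1043: adjust the standard region only
    return np.clip(out, 0, 255).astype(np.uint8)  # extended region keeps original values

# Example with the figures used earlier: A = 200 nits, B = 400 nits.
# enhanced = tone_map_regions(rgb_frame, luminance(rgb_frame), 200.0, 400.0)
```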
In this implementation, the luminance information of each video frame image is used to determine the adjustable regions within the frame, i.e., the standard dynamic region and the extended dynamic region; different coefficients are determined for the two regions, and the pixel values of the pixels in the different regions (e.g., the standard and extended dynamic regions) are then adjusted separately by the different coefficients (e.g., the first and second coefficients). By lowering the pixel values of the standard dynamic region while keeping those of the extended dynamic region, the dynamic range of the resulting enhanced image is genuinely extended. The dynamic range of the video composed of the enhanced images is thus also extended, so displaying the extended-dynamic-range video on the display device improves the display quality and the user experience.
Optionally, if the second coefficient for the extended dynamic region is greater than 1 and the first coefficient for the standard dynamic region is 1, then for each pixel in the extended dynamic region the product of the pixel's original pixel value and the second coefficient is computed and used as the pixel's new pixel value.
For each pixel in the standard dynamic region, the product of the pixel's original pixel value and the first coefficient is computed and used as the pixel's new pixel value; or, since the first coefficient is 1 in this case, the original pixel values of the pixels in the standard dynamic region need not be processed, and it suffices to keep them.
In this implementation, the luminance information of each video frame image is used to determine the adjustable regions within the frame, i.e., the standard dynamic region and the extended dynamic region; different coefficients are determined for the two regions, and the pixel values in the different regions are then adjusted separately by the different coefficients. By raising the pixel values of the extended dynamic region while keeping those of the standard dynamic region, the dynamic range of the resulting enhanced image is genuinely extended. The dynamic range of the video composed of the enhanced images is thus also extended, so displaying the extended-dynamic-range video on the display device improves the display quality and the user experience.
In the implementations above, the pixel values in the different regions (e.g., the standard and extended dynamic regions) are adjusted by different coefficients (e.g., the first and second coefficients). In one possible implementation, a coefficient may instead be computed pixel by pixel from each pixel's actual luminance level in the video frame image; for each pixel, its pixel value is adjusted by its own coefficient, e.g., the product of the pixel's original pixel value and that coefficient is computed and used as the pixel's new value. After all pixels are adjusted, the frame's enhanced image is generated from all the adjusted pixels. This implementation adjusts each pixel individually and in a targeted way, which can further improve the dynamic-range extension and the quality of the resulting enhanced image.
Another flow for generating the enhanced image corresponding to a video frame image is described in detail below.
Referring to FIG. 9, FIG. 9 is a schematic flowchart, provided by an embodiment of the present application, of another flow for generating the enhanced image corresponding to a video frame image. As shown in FIG. 9, the above S104 may include S1044-S1046. It is worth noting that S1044-S1046 are parallel to S1041-S1043: either S1041-S1043 or S1044-S1046 may be executed as the situation requires; S1044-S1046 are not executed after S1041-S1043.
S1044: Using the luminance information corresponding to each video frame image, determine a first region, a second region, and a third region corresponding to each video frame image.
Exemplarily, in this implementation the extension area in S104 above includes the first region, the second region, and the third region.
The first region may denote a low-gray-level region, the second region a mid-gray-level region, and the third region a high-gray-level region.
At least one of the first, second, and third regions is a region to be re-tone-mapped. In one possible implementation, the current brightness of the first and second regions may be kept and the third region taken as the region to be re-tone-mapped; for example, the current brightness of the low- and mid-gray-level regions is kept and the brightness of the high-gray-level region is raised. In another possible implementation, the first, second, and third regions are all taken as regions to be re-tone-mapped; for example, the brightness of the low-, mid-, and high-gray-level regions is all raised. It should be understood that the amount by which different regions are brightened may be the same or may differ. In yet another possible implementation, the current brightness of the second and third regions may be kept and the brightness of the first region lowered; for example, the current brightness of the mid- and high-gray-level regions is kept and the brightness of the low-gray-level region is lowered.
Exemplarily, if each frame's luminance information is determined via a luminance histogram, each frame's luminance histogram can be partitioned by T1 and T2. For ease of understanding, refer to FIG. 10, a schematic diagram of region partitioning provided by an embodiment of the present application. As shown in FIG. 10, T1 and T2 partition the luminance distribution represented by the histogram into 3 regions: the first region [0, T1), the second region [T1, T2], and the third region (T2, 255].
The values of T1 and T2 may be set by the user or determined by a machine learning model. During training, the machine learning model learns the gray-level distribution of the pixels in the luminance histograms of video frame images and determines two values from that distribution, which are the values of T1 and T2. For example, in this implementation, a frame's luminance histogram can be input into the model, which analyzes it and outputs the values of T1 and T2. This is merely an illustrative description and is not limiting.
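A minimal sketch of the T1/T2 partition in S1044, assuming fixed example values T1 = 85 and T2 = 170 in place of user-set or model-predicted thresholds:

```python
import numpy as np

def split_three_regions(luma: np.ndarray, t1: int = 85, t2: int = 170):
    """S1044: masks for the regions [0, T1), [T1, T2], (T2, 255]."""
    low = luma < t1
    mid = (luma >= t1) & (luma <= t2)
    high = luma > t2
    return low, mid, high
```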
To show the luminance-region partition more intuitively, refer to FIG. 11, a schematic diagram of luminance-region partitioning provided by an embodiment of the present application. As shown in FIG. 11, the light-gray areas (including the unevenly distributed light-gray dots shown in FIG. 11) are assigned to the first region, i.e., the low-gray-level region; the black areas (including all black areas shown in FIG. 11) are assigned to the second region, i.e., the mid-gray-level region; and the dark-gray areas are assigned to the third region, i.e., the high-gray-level region. It is worth noting that the colors shown in FIG. 11 serve only to distinguish the luminance regions and do not represent the true colors of the video frame image.
In this implementation, T1 and T2 delimit three regions: the low-, mid-, and high-gray-level regions. To make the final dynamic-range extension better, in one possible implementation a finer partition can be achieved with more values (similar to T1 and T2), i.e., multiple values partition the luminance distribution represented by the histogram into multiple regions.
S1045: Determine a first adjustment policy for the first region, a second adjustment policy for the second region, and a third adjustment policy for the third region.
Referring to FIG. 12, FIG. 12 is a schematic diagram of tone mapping provided by an embodiment of the present application. The horizontal axis of FIG. 12 represents the 255 gray levels after normalization, and the vertical axis represents the gray-level value after tone mapping. Exemplarily, SDR is usually a linear mapping, as shown by the black straight line in FIG. 12, whose corresponding function may be: y = k*x.
It should be understood that the first, second, and third regions correspond to different straight lines in FIG. 12. As shown in FIG. 12, the first region corresponds to line GH, the second region to line HI, and the third region to line IJ, where the function for line GH is: y = k*x.
Through the first adjustment policy, the adjusted gray-level values of the pixels in the low-gray-level region can be determined. For example, a third coefficient for the low-gray-level region is determined from each frame's extensible dynamic range, and the pixels of the low-gray-level region are tone-mapped according to the third coefficient to obtain their adjusted gray-level values.
The first adjustment policy for the first region may be to make the adjusted and pre-adjustment gray-level values of each pixel in the low-gray-level region satisfy y1 = (A/B)*k*x1, e.g., by adjusting the slope of line GH; the adjusted line GH of the first region then corresponds to the function: y1 = (A/B)*k*x1.
Here x1 denotes a pixel's pre-adjustment gray-level value in the low-gray-level region, y1 the pixel's adjusted gray-level value, A/B the third coefficient, A the original video's dynamic range, B each frame's extensible dynamic range, and k a constant.
Optionally, in one possible implementation, to raise the brightness of the whole video, the adjusted line GH of the first region may instead correspond to the function: y1 = M*(A/B)*k*x1 (M > 1); i.e., the adjusted and pre-adjustment gray-level values of each pixel in the low-gray-level region may also satisfy y1 = M*(A/B)*k*x1 (M > 1).
Here x1 denotes a pixel's pre-adjustment gray-level value in the low-gray-level region, y1 the pixel's adjusted gray-level value, M*(A/B) the third coefficient, A the original video's dynamic range, B each frame's extensible dynamic range, and M and k constants.
It should be understood that the value of M may be set by the user according to the actual situation, without limitation.
Through the second adjustment policy, the adjusted gray-level values of the pixels in the mid-gray-level region can be determined.
The second adjustment policy for the second region may be to make the adjusted and pre-adjustment gray-level values of each pixel in the mid-gray-level region satisfy y2 = k1*x2 + b1. For example, line HI of the second region can be adjusted to line HK, i.e., the original linear tone mapping of the second region is extended: where line HI originally corresponds to the function y = k*x, the adjusted line HK of the second region corresponds to the function: y2 = k1*x2 + b1.
Here x2 denotes a pixel's pre-adjustment gray-level value in the mid-gray-level region, y2 the pixel's adjusted gray-level value, and k1 and b1 constants.
Through the third adjustment policy, the adjusted gray-level values of the pixels in the high-gray-level region can be determined.
The third adjustment policy for the third region may be to make the adjusted and pre-adjustment gray-level values of each pixel in the high-gray-level region satisfy y3 = k2*x3 + b2. For example, line IJ of the third region can be adjusted to line KL, i.e., the original linear tone mapping of the third region is extended: where line IJ originally corresponds to the function y = k*x, the adjusted line KL of the third region corresponds to the function: y3 = k2*x3 + b2.
Here x3 denotes a pixel's pre-adjustment gray-level value in the high-gray-level region, y3 the pixel's adjusted gray-level value, and k2 and b2 constants.
It is worth noting that the values of k1, b1, k2, and b2 may be set according to the proportions of the luminance gray-level distribution, or determined by a machine learning model. For example, during training the machine learning model learns the gray-level distribution of pixels in the luminance histograms of video frame images in different scenes and predicts the values of k1, b1, k2, and b2 from that distribution. This is merely an illustrative description and is not limiting.
S1046: Generate the enhanced image corresponding to each video frame image from the frame's first region and first adjustment policy, second region and second adjustment policy, and third region and third adjustment policy.
Exemplarily, S1045 determined the functions corresponding to the adjusted first, second, and third regions. For any of these functions, inputting an x value outputs the corresponding y value, which represents the gray-level value after tone mapping, i.e., the adjusted luminance value. The frame's luminance is adjusted according to the adjusted luminance values, yielding the frame's enhanced image. Put plainly: a pixel's original gray-level value is x; these functions give its adjusted gray-level value y; adjusting the luminance of the frame's pixels by the y values yields the frame's enhanced image.
For example, for the third region, y3 = k2*x3 + b2 determines the y3 value corresponding to each pixel's x3 value, and each pixel of the third region in the frame has its x3 value adjusted to the y3 value. The pixels of the first and second regions are adjusted similarly, yielding the frame's enhanced image.
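A minimal sketch of the piecewise-linear mapping of S1045-S1046 follows; the specific slopes and intercepts below are illustrative stand-ins for values that would be set from the gray-level distribution or predicted by a model.

```python
import numpy as np

def piecewise_tone_map(luma: np.ndarray, t1: int, t2: int,
                       a_over_b: float, k: float = 1.0,
                       k1: float = 1.2, b1: float = -10.0,
                       k2: float = 1.5, b2: float = -50.0) -> np.ndarray:
    """S1045-S1046: apply y1/y2/y3 to the low/mid/high gray-level regions."""
    out = luma.astype(np.float32)
    low, mid, high = luma < t1, (luma >= t1) & (luma <= t2), luma > t2
    out[low] = a_over_b * k * out[low]  # y1 = (A/B)*k*x1
    out[mid] = k1 * out[mid] + b1       # y2 = k1*x2 + b1
    out[high] = k2 * out[high] + b2     # y3 = k2*x3 + b2
    return np.clip(out, 0, 255).astype(np.uint8)
```

In practice the mapped high-gray-level values would land in a wider output range on an HDR-capable display rather than being clipped back to 8 bits; the clip here just keeps the sketch self-contained.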
In this implementation, the luminance information of each video frame image is used to determine the adjustable regions within the frame, i.e., the low-, mid-, and high-gray-level regions; different adjustment policies are determined for the three regions, and the gray-level values of the pixels in the different regions are then adjusted according to the different policies, so that the dynamic range of the resulting enhanced image is genuinely extended. The dynamic range of the video composed of the enhanced images is thus also extended, so displaying the extended-dynamic-range video on the display device improves the display quality and the user experience.
The method for processing video provided by the embodiments of the present application is described once more below with reference to FIG. 13. Referring to FIG. 13, FIG. 13 is a schematic diagram of an implementation process according to an exemplary embodiment of the present application. Exemplarily, after the SDR video is obtained, it is decoded into its decoded video, which may include two or more video frame images. FIG. 13 shows one video frame image of the SDR video's decoded video.
The luminance information corresponding to that frame is determined; FIG. 13 presents it in the form of a luminance histogram. The frame's luminance information is used to partition the luminance regions, yielding the low-, mid-, and high-gray-level regions.
Combining the display device's luminance capability and the SDR video's dynamic range, the frame's low-, mid-, and high-gray-level regions are tone-mapped to obtain the frame's enhanced image, genuinely extending the frame's dynamic range; i.e., the enhanced image is an image with extended dynamic range. It follows that the dynamic range of the video composed of the enhanced images (i.e., the extended-dynamic-range video shown in FIG. 13) is also extended, so displaying the extended-dynamic-range video on the display device improves the display quality and the user experience.
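Tying the pieces together, the FIG. 13 flow could be composed from the helper sketches above (decode_video, luminance, extensible_range_nits, piecewise_tone_map, write_enhanced_video); the luma-plane-only mapping and the grayscale stand-in output are simplifications for illustration, since the patent maps the full-color frame.

```python
import numpy as np

def enhance_sdr_video(in_path: str, out_path: str,
                      display_nits: float, video_nits: float) -> None:
    """End-to-end sketch of the FIG. 13 flow, reusing the helpers defined above."""
    a_over_b = video_nits / extensible_range_nits(display_nits, video_nits)
    enhanced = []
    for frame in decode_video(in_path):                       # S101: decode
        luma = luminance(frame[..., ::-1])                    # S102: OpenCV frames are BGR
        mapped = piecewise_tone_map(luma, 85, 170, a_over_b)  # S1044-S1046 on the luma plane
        enhanced.append(np.stack([mapped] * 3, axis=-1))      # grayscale stand-in frame
    write_enhanced_video(enhanced, out_path)                  # S105: assemble enhanced video
```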
The method for processing video provided by the embodiments of the present application has been described in detail above with reference to FIG. 1 to FIG. 13; the hardware system, apparatus, and chip of the display device to which the present application applies are described in detail below with reference to FIG. 14 to FIG. 16. It should be understood that the hardware system, apparatus, and chip in the embodiments of the present application can perform the various methods for processing video provided by the foregoing embodiments of the present application; that is, for the specific working processes of the following products, reference may be made to the corresponding processes in the foregoing method embodiments.
The method for processing video provided by the embodiments of the present application is applicable to various display devices; correspondingly, the display apparatus provided by the embodiments of the present application may be a display device of many forms.
In some embodiments of the present application, the display device may be any of various imaging apparatuses such as a single-lens reflex camera or a compact camera, a mobile phone, a tablet computer, a wearable device, a television, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), etc., or any other device or apparatus capable of image processing; the embodiments of the present application place no restriction on the specific type of display device.
Taking a mobile phone as the display device below, FIG. 14 shows a schematic structural diagram of a display device provided by an embodiment of the present application.
The display device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
It should be noted that the structure shown in FIG. 14 does not constitute a specific limitation on the display device 100. In other embodiments of the present application, the display device 100 may include more or fewer components than shown in FIG. 14, or a combination of some of the components shown in FIG. 14, or sub-components of some of the components shown in FIG. 14. The components shown in FIG. 14 may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units; for example, it may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be separate devices or may be integrated in one or more processors.
The controller may be the nerve center and command center of the display device 100. The controller can generate operation control signals according to instruction opcodes and timing signals, completing the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache, which can hold instructions or data the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory, avoiding repeated accesses, reducing the waiting time of the processor 110, and thus improving system efficiency.
In the embodiments of the present application, the processor 110 can run the software code of the method for processing video provided by the embodiments of the present application, thereby effectively extending the dynamic range of the original video.
The connection relationships between the modules shown in FIG. 14 are only illustrative and do not limit the connection relationships between the modules of the display device 100. Optionally, the modules of the display device 100 may also adopt a combination of the multiple connection modes in the embodiments above.
The wireless communication function of the display device 100 may be implemented by components such as the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the display device 100 may be used to cover a single communication band or multiple communication bands, and different antennas may also be multiplexed to improve antenna utilization; for example, the antenna 1 may be multiplexed as a diversity antenna for a wireless local area network. In other embodiments, the antennas may be used in combination with a tuning switch.
The display device 100 can implement the display function through the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing and connects the display screen 194 and the application processor. In the embodiments of the present application, the GPU may be used to decode the original video (e.g., an SDR video) to obtain its decoded video. The GPU may also be used to perform mathematical and pose calculations, for graphics rendering, etc. The processor 110 may include one or more GPUs, which execute program instructions to generate or change display information.
Exemplarily, in the embodiments of the present application, the processor 110 may execute the steps of obtaining the decoded video of the original video; determining the luminance information corresponding to each video frame image; obtaining the display device's luminance capability and the original video's dynamic range and computing each frame's extensible dynamic range from them; and tone-mapping each frame using its corresponding luminance information and its extensible dynamic range to obtain the enhanced image corresponding to each frame.
The display screen 194 may be used to display images or videos. The display screen 194 includes a display panel, which may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc. In some embodiments, the display device 100 may include 1 or N display screens 194, where N may be a positive integer greater than 1.
In the embodiments of the present application, the display screen 194 may be used to display the original video, the multiple video frame images included in the decoded video, the enhanced images, the enhanced video composed of enhanced images (i.e., the extended-dynamic-range video), etc.
The display screen 194 in the embodiments of the present application may be a touch screen, with the touch sensor 180K integrated in it. The touch sensor 180K may also be called a "touch panel". That is, the display screen 194 may include a display panel and a touch panel; the touch sensor 180K and the display screen 194 together form the touch screen, also called a "touch-controlled screen". The touch sensor 180K is used to detect touch operations on or near it, such as the user lightly pressing the left side of the screen of the display device 100 with a touch object (e.g., a finger or a stylus) and sliding upward. After the touch sensor 180K detects a touch operation, a kernel-layer driver (e.g., the TP driver) can pass it to the upper layer to determine the touch event type, and visual output related to the touch operation can be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be placed on the surface of the display device 100 at a position different from that of the display screen 194.
The display device 100 can implement the shooting and recording functions through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, etc.
The ISP is used to process data fed back by the camera 193. For example, when shooting or recording video, light is transmitted through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP can algorithmically optimize the image's noise, brightness, and color, and can also optimize parameters such as the exposure and color temperature of the recording scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture images or video. It can be triggered on by an application instruction to implement the shooting and recording functions, e.g., recording video in live-streaming, video-conferencing, video-call, or video-surveillance scenarios. The camera may include components such as an imaging lens, a color filter, and an image sensor. Light emitted or reflected by objects enters the imaging lens, passes through the color filter, and finally converges on the image sensor. The imaging lens is mainly used to converge into an image the light emitted or reflected by all objects in the recording view; the color filter is mainly used to filter out unwanted light waves (e.g., light waves other than visible light, such as infrared); and the image sensor is mainly used to photoelectrically convert the received optical signal into an electrical signal and input it into the processor 110 for subsequent processing. The camera 193 may be located on the front or the back of the display device 100; the specific number and arrangement of cameras can be set as required, without any restriction in the present application.
Exemplarily, in the embodiments of the present application, the camera 193 can acquire the recorded video.
The digital signal processor is used to process digital signals; besides digital image signals it can also process other digital signals. For example, when the display device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform or the like on the frequency-point energy.
The video codec is used to compress or decompress digital video. The display device 100 may support one or more video codecs. In the embodiments of the present application, the video codec may be used to decode the original video (e.g., an SDR video) to obtain its decoded video.
The gyroscope sensor 180B may be used to determine the motion posture of the display device 100. In some embodiments, the angular velocities of the display device 100 around three axes (i.e., the x, y, and z axes) can be determined by the gyroscope sensor 180B. The gyroscope sensor 180B can be used for image stabilization in shooting and recording: for example, when shooting or recording, the gyroscope sensor 180B detects the angle by which the display device 100 shakes, computes from the angle the distance the lens module needs to compensate, and lets the lens counteract the shake of the display device 100 through reverse motion, achieving stabilization. The gyroscope sensor 180B can also be used in navigation and motion-sensing gaming scenarios.
The acceleration sensor 180E can detect the magnitude of the acceleration of the display device 100 in various directions (generally the x, y, and z axes) and can detect the magnitude and direction of gravity when the display device 100 is stationary. It can also be used to recognize the posture of the display device 100 as an input parameter for applications such as landscape/portrait switching and pedometers.
The distance sensor 180F is used to measure distance. The display device 100 can measure distance by infrared or laser. In some embodiments, e.g., in shooting and recording scenarios, the display device 100 can use the distance sensor 180F to measure distance for fast focusing.
The ambient light sensor 180L is used to sense the ambient light brightness. The display device 100 can adaptively adjust the brightness of the display screen 194 according to the sensed ambient brightness. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking photos, and can cooperate with the proximity light sensor 180G to detect whether the display device 100 is in a pocket, to prevent accidental touches.
The fingerprint sensor 180H is used to collect fingerprints. The display device 100 can use the collected fingerprint characteristics to implement unlocking, accessing application locks, taking photos, answering incoming calls, and other functions.
The pressure sensor 180A is used to sense pressure signals and can convert them into electrical signals. In some embodiments, the pressure sensor 180A may be provided on the display screen 194. There are many kinds of pressure sensors 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may comprise at least two parallel plates with conductive material; when a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the display device 100 determines the intensity of the pressure from the change in capacitance. When a touch operation acts on the display screen 194, the display device 100 detects the intensity of the touch operation via the pressure sensor 180A; it can also compute the touch position from the detection signal of the pressure sensor 180A. In some embodiments, touch operations acting on the same touch position but with different intensities may correspond to different operation instructions.
The keys 190 include a power key, volume keys, etc., and may be mechanical keys or touch keys. The display device 100 can receive key input and generate key signal input related to the user settings and function control of the display device 100. In the embodiments of the present application, when the display device 100 plays video online, if the left side of the screen is used to control the screen brightness, the right side controls the volume; if the right side controls the brightness, the left side controls the volume. For example, the user lightly presses the left side of the screen of the display device 100 with a touch object (e.g., a finger or a stylus) and slides up or down, and the display device 100 brightens or dims the screen in response; the user lightly presses the right side and slides up or down, and the display device 100 turns its sound up or down in response.
The motor 191 can generate vibration alerts and can be used for incoming-call vibration alerts as well as touch vibration feedback. The indicator 192 may be an indicator light and can be used to indicate the charging status and battery-level changes, as well as messages, missed calls, notifications, etc. The SIM card interface 195 is used to connect a SIM card; a SIM card can be inserted into or removed from the SIM card interface 195 to make contact with or separate from the display device 100. The display device 100 can support 1 or N SIM card interfaces, where N is a positive integer greater than 1, and the SIM card interface 195 can support Nano-SIM cards, Micro-SIM cards, SIM cards, etc.
The methods in the embodiments above can all be implemented in the display device 100 with the hardware structure above.
FIG. 15 is a schematic structural diagram of a display apparatus provided by an embodiment of the present application. As shown in FIG. 15, the display apparatus 200 includes a first obtaining module 210, a determining module 220, a second obtaining module 230, and a processing module 240.
The display apparatus 200 can carry out the following scheme:
the first obtaining module 210 is configured to obtain a decoded video of an original video;
the determining module 220 is configured to determine the luminance information corresponding to each video frame image;
the second obtaining module 230 is configured to obtain the luminance capability of the display device and the dynamic range of the original video, and compute the extensible dynamic range of each video frame image from the luminance capability of the display device and the dynamic range of the original video;
the processing module 240 is configured to tone-map each video frame image using the luminance information corresponding to each video frame image and the extensible dynamic range of each video frame image, to obtain the enhanced image corresponding to each video frame image.
It should be noted that the above display apparatus 200 is embodied in the form of functional modules. The term "module" here may be implemented in software and/or hardware, without specific limitation.
For example, a "module" may be a software program, a hardware circuit, or a combination of both that implements the functions above. The hardware circuit may include an application-specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (e.g., a shared processor, a dedicated processor, or a group processor) together with memory, merged logic circuitry, and/or other suitable components that support the described functions.
Therefore, the modules of the examples described in the embodiments of the present application can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the particular application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.
An embodiment of the present application also provides a computer-readable storage medium storing computer instructions; when the computer-readable storage medium runs on a display device, the display device is caused to execute the method shown above. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium, or a semiconductor medium (e.g., a solid state disk (SSD)), etc.
An embodiment of the present application also provides a computer program product containing computer instructions; the computer program product includes computer program code which, when run on a display device, enables the display device to execute the technical solution shown above.
FIG. 16 is a schematic structural diagram of a chip provided by an embodiment of the present application. The chip shown in FIG. 16 may be a general-purpose processor or a special-purpose processor. The chip includes a processor 301, which is used to support the display device in executing the technical solution shown above.
Optionally, the chip also includes a transceiver 302, which is controlled by the processor 301 and is used to support the display apparatus in executing the technical solution shown above.
Optionally, the chip shown in FIG. 16 may also include a storage medium 303.
It should be noted that the chip shown in FIG. 16 may be implemented using the following circuits or devices: one or more field programmable gate arrays (FPGAs), programmable logic devices (PLDs), controllers, state machines, gate logic, discrete hardware components, any other suitable circuits, or any combination of circuits capable of performing the various functions described throughout the present application.
The display device, display apparatus, computer storage medium, computer program product, and chip provided by the above embodiments of the present application are all used to execute the methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects corresponding to the methods provided above, which are not repeated here.
It should be understood that the above is only intended to help those skilled in the art better understand the embodiments of the present application, not to limit their scope. Based on the examples given above, those skilled in the art can obviously make various equivalent modifications or changes; for example, some steps in the embodiments of the detection methods above may be unnecessary, or some steps may be newly added, etc.; or any two or more of the above embodiments may be combined. Such modified, changed, or combined solutions also fall within the scope of the embodiments of the present application.
It should also be understood that the above description of the embodiments of the present application emphasizes the differences between the embodiments; for the same or similar points not mentioned, reference may be made between the embodiments, which, for brevity, are not repeated here.
It should also be understood that the magnitudes of the serial numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation of the embodiments of the present application.
It should also be understood that, in the embodiments of the present application, "preset" and "predefined" may be implemented by pre-storing, in a device (e.g., including a display device), corresponding code, tables, or other means that can indicate relevant information; the present application does not limit the specific implementation.
It should also be understood that the division into manners, cases, categories, and embodiments in the embodiments of the present application is only for convenience of description and should not constitute a particular limitation; features in the various manners, categories, cases, and embodiments may be combined where not contradictory.
It should also be understood that, in the various embodiments of the present application, unless otherwise specified or logically conflicting, the terms and/or descriptions of different embodiments are consistent and may be cited by one another, and the technical features of different embodiments may be combined according to their internal logical relationships to form new embodiments. Finally, it should be noted that the above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto; any change or substitution within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
- A method for processing video, applied to a display device, the method comprising: obtaining a decoded video of an original video, the decoded video comprising a plurality of video frame images; determining luminance information corresponding to each video frame image; obtaining a luminance capability of the display device and a dynamic range of the original video, and computing an extensible dynamic range of each video frame image according to the luminance capability of the display device and the dynamic range of the original video; and tone-mapping each video frame image using the luminance information corresponding to each video frame image and the extensible dynamic range of each video frame image, to obtain an enhanced image corresponding to each video frame image, a dynamic range of the enhanced image being greater than a dynamic range of the video frame image.
- The method according to claim 1, wherein the tone-mapping each video frame image using the luminance information corresponding to each video frame image and the extensible dynamic range of each video frame image to obtain the enhanced image corresponding to each video frame image comprises: determining, using the luminance information corresponding to each video frame image, a standard dynamic region and an extended dynamic region corresponding to each video frame image, the luminance information of pixels in the standard dynamic region being less than a preset threshold and the luminance information of pixels in the extended dynamic region being greater than or equal to the preset threshold; determining, according to the extensible dynamic range of each video frame image, a first coefficient corresponding to the standard dynamic region of each video frame image and a second coefficient corresponding to the extended dynamic region of each video frame image; and tone-mapping the pixels in the standard dynamic region according to the first coefficient and the pixels in the extended dynamic region according to the second coefficient, to obtain the enhanced image corresponding to each video frame image.
- The method according to claim 2, wherein the tone-mapping the pixels in the standard dynamic region according to the first coefficient and the pixels in the extended dynamic region according to the second coefficient to obtain the enhanced image corresponding to each video frame image comprises: computing a first product of the original pixel value of each pixel in the standard dynamic region and the first coefficient, and updating the pixel value of each pixel in the standard dynamic region according to the first product; computing a second product of the original pixel value of each pixel in the extended dynamic region and the second coefficient, and updating the pixel value of each pixel in the extended dynamic region according to the second product; and generating the enhanced image corresponding to each video frame image from the updated pixels of the standard dynamic region and the updated pixels of the extended dynamic region.
- The method according to claim 1, wherein the tone-mapping each video frame image using the luminance information corresponding to each video frame image and the extensible dynamic range of each video frame image to obtain the enhanced image corresponding to each video frame image comprises: determining, using the luminance information corresponding to each video frame image, a low-gray-level region, a mid-gray-level region, and a high-gray-level region corresponding to each video frame image; and adjusting, according to the extensible dynamic range of each video frame image, the brightness of at least one of the low-gray-level region, the mid-gray-level region, and the high-gray-level region, to obtain the enhanced image corresponding to each video frame image.
- The method according to claim 4, wherein the adjusting, according to the extensible dynamic range of each video frame image, the brightness of at least one of the low-gray-level region, the mid-gray-level region, and the high-gray-level region to obtain the enhanced image corresponding to each video frame image comprises: determining, according to the extensible dynamic range of each video frame image, adjusted gray-level values of the pixels in the low-gray-level region; determining adjusted gray-level values of the pixels in the mid-gray-level region; determining adjusted gray-level values of the pixels in the high-gray-level region; and generating the enhanced image corresponding to each video frame image from the adjusted gray-level values of the pixels in the low-gray-level region, the mid-gray-level region, and the high-gray-level region.
- The method according to claim 5, wherein the determining, according to the extensible dynamic range of each video frame image, adjusted gray-level values of the pixels in the low-gray-level region comprises: determining, according to the extensible dynamic range of each video frame image, a third coefficient corresponding to the low-gray-level region; and tone-mapping the pixels in the low-gray-level region according to the third coefficient, to obtain the adjusted gray-level values of the pixels in the low-gray-level region.
- A display apparatus, comprising units configured to execute the method according to any one of claims 1 to 6.
- A display device, comprising: one or more processors; and one or more memories, the memories storing one or more programs which, when executed by the processors, cause the display device to execute the method according to any one of claims 1 to 6.
- A chip, comprising a processor configured to call and run a computer program from a memory, so that a display device on which the chip is installed executes the method according to any one of claims 1 to 6.
- A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to execute the method according to any one of claims 1 to 6.
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211585804.3 | 2022-12-09 | 2022-12-09 | 处理视频的方法、显示设备及存储介质 (also published as CN118175246A, 2024-06-11, pending)
PCT/CN2023/116769 | 2022-12-09 | 2023-09-04 | 处理视频的方法、显示设备及存储介质

Publications (1)

Publication Number | Publication Date
---|---
WO2024119922A1 | 2024-06-13
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106791865A (zh) * | 2017-01-20 | 2017-05-31 | 杭州当虹科技有限公司 | 基于高动态范围视频的自适应格式转换的方法 |
US20180048845A1 (en) * | 2015-05-12 | 2018-02-15 | Panasonic Intellectual Property Corporation Of America | Display method and display device |
US20200134792A1 (en) * | 2018-10-30 | 2020-04-30 | Microsoft Technology Licensing, Llc | Real time tone mapping of high dynamic range image data at time of playback on a lower dynamic range display |
CN111311524A (zh) * | 2020-03-27 | 2020-06-19 | 电子科技大学 | 一种基于msr的高动态范围视频生成方法 |
CN113099201A (zh) * | 2021-03-30 | 2021-07-09 | 北京奇艺世纪科技有限公司 | 视频信号处理方法、装置及电子设备 |
CN114501023A (zh) * | 2022-03-31 | 2022-05-13 | 深圳思谋信息科技有限公司 | 视频处理方法、装置、计算机设备、存储介质 |
-
2022
- 2022-12-09 CN CN202211585804.3A patent/CN118175246A/zh active Pending
-
2023
- 2023-09-04 WO PCT/CN2023/116769 patent/WO2024119922A1/zh unknown
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180048845A1 (en) * | 2015-05-12 | 2018-02-15 | Panasonic Intellectual Property Corporation Of America | Display method and display device |
CN106791865A (zh) * | 2017-01-20 | 2017-05-31 | 杭州当虹科技有限公司 | 基于高动态范围视频的自适应格式转换的方法 |
US20200134792A1 (en) * | 2018-10-30 | 2020-04-30 | Microsoft Technology Licensing, Llc | Real time tone mapping of high dynamic range image data at time of playback on a lower dynamic range display |
CN111311524A (zh) * | 2020-03-27 | 2020-06-19 | 电子科技大学 | 一种基于msr的高动态范围视频生成方法 |
CN113099201A (zh) * | 2021-03-30 | 2021-07-09 | 北京奇艺世纪科技有限公司 | 视频信号处理方法、装置及电子设备 |
CN114501023A (zh) * | 2022-03-31 | 2022-05-13 | 深圳思谋信息科技有限公司 | 视频处理方法、装置、计算机设备、存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN118175246A (zh) | 2024-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109783178B (zh) | 一种界面组件的颜色调整方法、装置、设备和介质 | |
WO2021036991A1 (zh) | 高动态范围视频生成方法及装置 | |
WO2021052342A1 (zh) | 电子设备调整画面色彩的方法和装置 | |
WO2020056744A1 (zh) | 拖影评价、改善方法和电子设备 | |
CN114463191B (zh) | 一种图像处理方法及电子设备 | |
WO2023160285A1 (zh) | 视频处理方法和装置 | |
CN116744120B (zh) | 图像处理方法和电子设备 | |
WO2022083325A1 (zh) | 拍照预览方法、电子设备以及存储介质 | |
CN110445986A (zh) | 图像处理方法、装置、存储介质及电子设备 | |
WO2023082859A1 (zh) | 图像处理方法、图像处理器、电子设备及存储介质 | |
WO2024174625A1 (zh) | 图像处理方法和电子设备 | |
CN117692788B (zh) | 一种图像处理方法及电子设备 | |
CN114038370A (zh) | 显示参数调整方法、装置、存储介质及显示设备 | |
WO2023035919A1 (zh) | 控制曝光的方法、装置与电子设备 | |
WO2024119922A1 (zh) | 处理视频的方法、显示设备及存储介质 | |
CN116709042B (zh) | 一种图像处理方法和电子设备 | |
CN115767290A (zh) | 图像处理方法和电子设备 | |
CN116668862A (zh) | 图像处理方法与电子设备 | |
CN115691370A (zh) | 显示控制方法及相关装置 | |
US10979628B2 (en) | Image processing apparatus, image processing system, image processing method, and recording medium | |
CN113891008A (zh) | 一种曝光强度调节方法及相关设备 | |
CN113805830A (zh) | 一种分布显示方法及相关设备 | |
WO2024179058A1 (zh) | 图像处理方法和电子设备 | |
CN117119316B (zh) | 图像处理方法、电子设备及可读存储介质 | |
CN117711300B (zh) | 一种图像的显示方法、电子设备、可读存储介质和芯片 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23899505 Country of ref document: EP Kind code of ref document: A1 |