WO2021213336A1 - An image quality enhancement device and related method - Google Patents

An image quality enhancement device and related method

Info

Publication number
WO2021213336A1
Authority
WO
WIPO (PCT)
Prior art keywords
affine transformation
target
image
portrait
frame image
Prior art date
Application number
PCT/CN2021/088171
Other languages
English (en)
French (fr)
Inventor
谢江荣
贾明波
王建
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2021213336A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/02 Affine transformations
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4023 Scaling based on decimating pixels or lines of pixels; based on inserting pixels or lines of pixels
    • G06T 3/4053 Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/4076 Super-resolution using the original low-resolution images to iteratively correct the high-resolution images
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/90 Dynamic range modification of images or parts thereof
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Image registration using feature-based methods

Definitions

  • This application relates to the field of terminal artificial intelligence, and in particular to an image quality enhancement device and related methods.
  • The dynamic range of imaging is much smaller than that of the real scene; in addition, to facilitate network transmission, video is often quantized and compressed, which significantly degrades the quality finally presented on the large-screen device.
  • On the one hand, manufacturers can produce new film sources with high resolution, high contrast, and higher bit depth during shooting, and combine them with high-bit-depth display panel hardware to achieve outstanding picture quality; on the other hand,
  • the image quality can be adjusted in the automatic mode, and the brightness, color temperature, and contrast curves can be remapped through the preset image quality mode to achieve the effect of image enhancement.
  • the embodiments of the present application provide an image quality enhancement device and related methods to improve the image quality in a video.
  • The present application provides an image quality enhancement device that includes a general-purpose processing unit (CPU) and a graphics processing unit (GPU). The CPU is configured to: down-sample a target frame image in a target video to obtain a low-resolution image; and perform feature extraction on the low-resolution image to obtain a target bilateral grid corresponding to the low-resolution image, where the target bilateral grid includes affine transformation information corresponding to the low-resolution image. The GPU is configured to: obtain, according to the target bilateral grid and the affine transformation information, an up-sampled affine transformation matrix of the low-resolution image through the bilateral guided up-sampling (BGU) interpolation method, where the affine transformation matrix includes affine transformation coefficients used to enhance the image quality of the target frame image; and enhance the image quality of each pixel in the target frame image according to the affine transformation matrix to obtain an enhanced target frame image.
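The final per-pixel step can be sketched as follows, assuming (as is common in bilateral-grid enhancement, though the patent does not fix the exact shape) that each pixel receives a 3x4 affine matrix acting on its homogeneous RGB vector; the function and variable names are ours, not the patent's:

```python
import numpy as np

# Hedged sketch: each pixel's colour [r, g, b, 1] is multiplied by its own
# 3x4 affine matrix to give the enhanced pixel. The 3x4 shape and the names
# (apply_affine, coeffs) are assumptions; the patent only says the affine
# transformation matrix holds per-pixel enhancement coefficients.
def apply_affine(image, coeffs):
    """image: (H, W, 3) floats; coeffs: (H, W, 3, 4) per-pixel affine matrices."""
    h, w, _ = image.shape
    ones = np.ones((h, w, 1), dtype=image.dtype)
    homog = np.concatenate([image, ones], axis=-1)   # (H, W, 4) homogeneous colours
    # Apply each pixel's 3x4 matrix to its homogeneous colour vector.
    return np.einsum('hwij,hwj->hwi', coeffs, homog)

# Identity coefficients (gain = I, offset = 0) leave the image unchanged:
identity = np.zeros((2, 2, 3, 4))
identity[..., :3, :3] = np.eye(3)
img = np.random.rand(2, 2, 3)
out = apply_affine(img, identity)
```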
  • The image quality enhancement device includes a general-purpose processing unit (CPU) and a graphics processing unit (GPU).
  • The image quality enhancement device can simultaneously utilize the computing power of the CPU and the GPU: through parallel heterogeneous multi-stage processing (that is, the CPU and GPU process the target video simultaneously), the image quality enhancement program enhances the image quality of each frame of the target video.
  • The image quality enhancement device reasonably allocates the network structure of the image quality enhancement solution to a heterogeneous multi-stage CPU/GPU pipeline. Compared with serial processing on the CPU or GPU alone, this achieves a lower single-frame processing delay and shortens the time needed for real-time enhancement of the video stream.
  • The bilateral grid can be used to accelerate image operators: each frame of the target video is first down-sampled and converted into a bilateral grid carrying image affine transformation information; the full-resolution affine transformation coefficients are then obtained through the BGU interpolation method and applied to the original image, finally yielding an enhanced high-resolution image. This compresses the amount of calculation for video enhancement, so that video image quality on the large screen can be enhanced quickly and efficiently.
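A minimal sketch of the grid-slicing idea behind BGU: a full-resolution guide image indexes into the coarse bilateral grid by spatial position and intensity. Nearest-neighbour lookup stands in for the trilinear interpolation a real BGU implementation uses; shapes and names are illustrative assumptions:

```python
import numpy as np

# Grid slicing: look up per-pixel data in a coarse (gh, gw, gd, C) bilateral
# grid using the pixel's (x, y) position and its guide intensity as the third
# axis. Nearest-neighbour indexing only; real BGU interpolates trilinearly.
def slice_grid(grid, guide):
    """grid: (gh, gw, gd, C) bilateral grid; guide: (H, W) luminance in [0, 1]."""
    gh, gw, gd, _ = grid.shape
    h, w = guide.shape
    ys = np.clip(np.arange(h) * gh // h, 0, gh - 1)
    xs = np.clip(np.arange(w) * gw // w, 0, gw - 1)
    zs = np.clip((guide * gd).astype(int), 0, gd - 1)
    return grid[ys[:, None], xs[None, :], zs]    # (H, W, C) per-pixel data

# A bright guide selects the deepest intensity bin of the grid:
demo = np.zeros((2, 2, 4, 1))
demo[:, :, 3, 0] = 5.0
sliced = slice_grid(demo, np.full((8, 8), 0.9))
```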
  • The GPU is further configured to: receive the target bilateral grid sent by the CPU. The CPU is further configured to: after the GPU receives the target bilateral grid, perform down-sampling and feature extraction on the next frame image following the target frame image in the target video.
  • The embodiment of the present application does not require all image enhancement steps to be completed in sequence by the CPU or GPU within one frame interval; it only needs to ensure that the longest-running step (such as obtaining the affine transformation matrix) is executed by the CPU or GPU within the frame interval, and the goal of real-time enhancement of the video stream can be achieved. While the GPU processes the target frame image, the CPU synchronously processes the next frame image, which greatly reduces the processing time for image quality enhancement.
  • the affine transformation coefficients include a portrait area affine transformation coefficient and a non-portrait area affine transformation coefficient.
  • The focus of the video picture on a large-screen product is mostly the portrait area, while attention to landscape, blurred background, and other areas is low.
  • The persistence effect of human vision can therefore be exploited: the affine transformation coefficients contained in the affine transformation matrix are divided into portrait-area affine transformation coefficients and non-portrait-area affine transformation coefficients, so that the image quality enhancement device can apply different degrees of enhancement to different areas. For example, because the focus of a large-screen video picture is the portrait area, the persistence of human vision is used to multiplex the non-portrait-area affine transformation coefficients across frames, which further reduces the amount of calculation and improves real-time performance.
  • The GPU is specifically configured to: obtain, according to the target bilateral grid and the affine transformation information, the target portrait-area affine transformation coefficients corresponding to the portrait area in the low-resolution image through the BGU interpolation method; geometrically register the non-portrait area in the target frame image with the non-portrait area in a reference frame image to obtain first registration information, where the first registration information indicates the similarity between the non-portrait area of the target frame image and that of the reference frame image; in the case that the first registration information is less than a preset threshold, obtain through the BGU interpolation method the target non-portrait-area affine transformation coefficients corresponding to the non-portrait area in the low-resolution image, and update the target frame image to serve as the reference frame image for the next frame image; and obtain the affine transformation matrix according to the target portrait-area affine transformation coefficients and the target non-portrait-area affine transformation coefficients.
  • That is, the image quality enhancement device registers the geometric positions of the non-portrait areas of the reference frame image and the target frame image to obtain registration information; it then compares the registration information with the preset threshold to determine whether the reference frame is usable. If the registration information is less than the preset threshold, the reference frame is unusable: the reference frame corresponding to the target frame is invalid and cannot provide non-portrait-area affine transformation coefficients for the target frame, so the non-portrait-area affine transformation coefficients of the target frame image must be regenerated by the BGU method. Finally, the enhancement coefficients of the portrait area and the background area are merged and applied to the original image pixel by pixel, and the enhanced image is output.
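The reference-frame validity check described above reduces to a simple threshold decision, sketched here with hypothetical names:

```python
# Threshold decision sketch (hypothetical names): if the registered similarity
# falls below the preset threshold, the reference frame is invalid and the
# non-portrait coefficients are regenerated via BGU; otherwise they are reused.
def choose_nonportrait_coeffs(similarity, threshold, ref_coeffs, regenerate):
    if similarity < threshold:
        return regenerate()      # reference frame invalid: recompute with BGU
    return ref_coeffs            # reference frame valid: reuse its coefficients

reused = choose_nonportrait_coeffs(0.9, 0.5, "ref_coeffs", lambda: "fresh_coeffs")
regenerated = choose_nonportrait_coeffs(0.2, 0.5, "ref_coeffs", lambda: "fresh_coeffs")
```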
  • the target frame image can be used as the reference frame image of the next frame image to perform inter-frame image multiplexing.
  • The GPU is further configured to: in the case that the first registration information is greater than or equal to the preset threshold, obtain the non-portrait-area affine transformation coefficients of the reference frame image, and use them as the target non-portrait-area affine transformation coefficients of the target frame image.
  • When the similarity is high, the non-portrait-area affine transformation coefficients of the reference frame image can be applied to the target frame image. The image enhancement device can therefore use the reference frame image's non-portrait-area affine transformation coefficients as the target frame image's target non-portrait-area affine transformation coefficients to enhance its image quality, saving computation and improving the efficiency of image quality enhancement.
  • The GPU is further configured to: in the case that the first registration information is greater than or equal to the preset threshold, obtain the non-portrait-area affine transformation coefficients of the reference frame image and obtain the target non-portrait-area affine transformation coefficients corresponding to the non-portrait area in the low-resolution image; and determine that, among all pixels in the non-portrait area of the target frame image, the affine transformation coefficient corresponding to each first pixel is the non-portrait-area affine transformation coefficient of the reference frame image, and the affine transformation coefficient corresponding to each second pixel is the target non-portrait-area affine transformation coefficient, where the first pixels and the second pixels are spaced apart.
  • That is, the image enhancement device uses the reference frame image's non-portrait-area affine transformation coefficients for part of the target frame image's non-portrait area and generates the remaining coefficients by the BGU method. This greatly compresses the amount of calculation needed to generate non-portrait-area coefficients while keeping the degradation of the enhancement effect in that region within a range acceptable to the human eye.
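One plausible reading of "spaced apart" is a checkerboard interleave of reused and fresh coefficients, sketched below (the exact pattern is our assumption; the patent only requires the two pixel sets to alternate):

```python
import numpy as np

# Checkerboard interleave of reused (reference-frame) and fresh coefficients.
# The checkerboard pattern is an assumed reading of "spaced apart"; names
# are illustrative.
def interleave_coeffs(ref, fresh):
    """ref, fresh: (H, W, C) coefficient arrays; returns the interleaved array."""
    h, w = ref.shape[:2]
    yy, xx = np.mgrid[:h, :w]
    mask = (yy + xx) % 2 == 0            # True on "first" pixels, False on "second"
    return np.where(mask[..., None], ref, fresh)

mixed = interleave_coeffs(np.zeros((2, 4, 1)), np.ones((2, 4, 1)))
```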
  • The GPU is specifically configured to: obtain, according to the target bilateral grid and the affine transformation information, the target portrait-area affine transformation coefficients corresponding to the portrait area in the low-resolution image through the BGU interpolation method; geometrically register the non-portrait area in the target frame image with the non-portrait area in the previous frame image to obtain second registration information, where the second registration information indicates the similarity between the non-portrait area of the target frame image and that of the previous frame image; and, in the case that the second registration information is greater than or equal to a preset threshold, determine that among every five pixels of the non-portrait area, the affine transformation coefficient corresponding to the third pixel is the target non-portrait-area affine transformation coefficient obtained from the low-resolution image, while the affine transformation coefficients corresponding to the other four pixels are multiplexed from the previous frame image.
  • the image quality enhancement device provides an affine transformation coefficient polling multiplexing strategy for the non-portrait area.
  • The pixels whose coefficients are updated can be arranged closely within the visual radiation range; as the time-series frames progress, the update pixel positions for the non-portrait-area affine transformation coefficients can be polled in the horizontal, vertical, or diagonal direction, so that a loop is formed across every 4 preceding and following frames.
  • In this way, the amount of calculation generated by BGU for the non-portrait-area affine transformation coefficients can be compressed to 1/5 of the original, while the degree of degradation of the enhancement effect in this area remains acceptable to the human eye.
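The 1-in-5 polling multiplexing can be sketched as follows, cycling the updated position with the frame index so that every coefficient is refreshed over five frames (a simplified linear polling order; the patent also describes horizontal, vertical, and diagonal movement):

```python
import numpy as np

# Polling multiplexing sketch: each frame refreshes one pixel in every five,
# and the refreshed position cycles with the frame index. The linear polling
# order and the function name are simplifying assumptions.
def polled_update(coeffs, fresh, frame_idx):
    flat = coeffs.reshape(-1, coeffs.shape[-1]).copy()
    flat_fresh = fresh.reshape(-1, fresh.shape[-1])
    idx = np.arange(flat.shape[0])
    sel = idx % 5 == frame_idx % 5       # 1 in 5 pixels selected per frame
    flat[sel] = flat_fresh[sel]
    return flat.reshape(coeffs.shape)

coeffs = np.zeros((5, 2, 1))                 # 10 stale coefficients
fresh = np.ones((5, 2, 1))
one_step = polled_update(coeffs, fresh, 0)   # only 1/5 of pixels recomputed
for t in range(5):                           # after 5 frames, all refreshed
    coeffs = polled_update(coeffs, fresh, t)
```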
  • this application provides an image quality enhancement method, including:
  • the image processing unit GPU obtains the up-sampled affine transformation matrix of the low-resolution image through the bilateral guided up-sampling BGU interpolation method, the affine transformation matrix Including affine transformation coefficients, the affine transformation coefficients being used to enhance the image quality of the target frame image;
  • the GPU enhances the image quality of each pixel in the target frame image according to the affine transformation matrix to obtain an enhanced target frame image.
  • The method further includes: receiving, by the GPU, the target bilateral grid sent by the CPU; and performing, by the CPU, after the GPU has received the target bilateral grid, down-sampling and feature extraction on the next frame image following the target frame image in the target video.
  • the affine transformation coefficients include a portrait area affine transformation coefficient and a non-portrait area affine transformation coefficient.
  • Obtaining, by the GPU, the up-sampled affine transformation matrix of the low-resolution image according to the target bilateral grid and the affine transformation information through the bilateral guided up-sampling (BGU) interpolation method includes: obtaining the target portrait-area affine transformation coefficients corresponding to the portrait area in the low-resolution image through the BGU interpolation method according to the target bilateral grid and the affine transformation information; geometrically registering the non-portrait area in the target frame image with the non-portrait area in the reference frame image to obtain first registration information, where the first registration information indicates the similarity between the non-portrait area of the target frame image and that of the reference frame image; in the case that the first registration information is less than a preset threshold, obtaining through the BGU interpolation method the target non-portrait-area affine transformation coefficients corresponding to the non-portrait area in the low-resolution image, and updating the target frame image to serve as the reference frame image of the next frame image; and obtaining the affine transformation matrix according to the target portrait-area affine transformation coefficients and the target non-portrait-area affine transformation coefficients.
  • The method further includes: in a case where the first registration information is greater than or equal to the preset threshold, obtaining, by the GPU, the non-portrait-area affine transformation coefficients of the reference frame image, and using them as the target non-portrait-area affine transformation coefficients of the target frame image.
  • The method further includes: in a case where the first registration information is greater than or equal to the preset threshold, obtaining, by the GPU, the non-portrait-area affine transformation coefficients of the reference frame image and the target non-portrait-area affine transformation coefficients corresponding to the non-portrait area in the low-resolution image; and determining, by the GPU, that among all pixels of the non-portrait area of the target frame image, the affine transformation coefficient corresponding to each first pixel is the non-portrait-area affine transformation coefficient of the reference frame image and the coefficient corresponding to each second pixel is the target non-portrait-area affine transformation coefficient, where the first pixels and the second pixels are spaced apart.
  • Obtaining, by the GPU, the up-sampled affine transformation matrix of the low-resolution image according to the target bilateral grid and the affine transformation information through the BGU interpolation method includes: obtaining the target portrait-area affine transformation coefficients corresponding to the portrait area in the low-resolution image through the BGU interpolation method according to the target bilateral grid and the affine transformation information; and geometrically registering the non-portrait area in the target frame image with the non-portrait area in the previous frame image to obtain second registration information, where the second registration information indicates the similarity between the non-portrait area of the target frame image and that of the previous frame image.
  • An embodiment of the present application provides a computer-readable storage medium including computer instructions which, when run on an electronic device, cause the electronic device to execute the image quality enhancement method provided by the second aspect or any implementation manner of the second aspect of the embodiments of the present application.
  • The embodiments of the present application provide a computer program product which, when run on an electronic device, causes the electronic device to execute the image quality enhancement method provided by the second aspect or any one of the implementation manners of the second aspect of the embodiments of the present application.
  • the present application provides a chip system, which includes a processor, and is used to support a network device to implement the functions involved in the above-mentioned first aspect.
  • the chip system further includes a memory, and the memory is used to store program instructions and data necessary for the data sending device.
  • the chip system can be composed of chips, or include chips and other discrete devices.
  • FIG. 1 is a schematic diagram of the architecture of an image quality enhancement system provided by an embodiment of the present application;
  • FIG. 2A is a schematic structural diagram of an image quality enhancement device provided by an embodiment of the present application;
  • FIG. 2B is a schematic diagram of a process of processing a target video by an image quality enhancement apparatus provided by an embodiment of the present application;
  • FIG. 2C is a schematic diagram of a multiplexing process of affine transformation coefficients of a reference frame image provided by an embodiment of the present application;
  • FIG. 2D is a schematic diagram of the distribution of first pixels and second pixels in a non-portrait area according to an embodiment of the present application;
  • FIG. 2E is a schematic diagram of third pixel distribution positions in a non-portrait area provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of an image quality enhancement method provided by an embodiment of the present application.
  • The terms "component" and the like used in this specification denote computer-related entities: hardware, firmware, a combination of hardware and software, software, or software in execution.
  • A component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer.
  • Both an application running on a computing device and the computing device itself can be components.
  • One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers.
  • These components can execute from various computer-readable media having various data structures stored thereon.
  • A component may communicate by means of local and/or remote processes, for example according to a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet with other systems by way of the signal).
  • High Dynamic Range (HDR) is a group of techniques used in computer graphics and film photography to achieve a larger dynamic range of exposure (that is, a greater difference between light and dark) than ordinary digital imaging technology. Compared with ordinary images, an HDR image provides more dynamic range and image detail: LDR (Low Dynamic Range) images taken at different exposure times are combined, using the LDR image with the best detail for each exposure time, to synthesize the final HDR image, which better reflects the visual effect of the real environment.
  • Bit depth is the number of binary bits needed to describe all colors when each color is described by a set of binary values during quantization of the image's color palette. This does not mean that the image uses all of these colors, but it specifies the precision with which colors can be represented.
  • The bit depth quantifies the gray levels: an image with a higher bit depth can encode more shades or colors because it has more combinations of 0s and 1s available. For example, an 8-bit depth gives 2 to the 8th power levels, that is, one pixel can display 256 colors.
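As a quick arithmetic check of the relationship just stated (the helper name is ours, not the patent's):

```python
# Number of representable levels per channel for a given bit depth:
# levels = 2 ** bit_depth.
def levels(bit_depth):
    return 2 ** bit_depth

eight_bit = levels(8)    # 256 colors per pixel, as in the example above
ten_bit = levels(10)     # 1024 shades on a 10-bit panel
```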
  • Frame rate is the frequency (rate) at which a bitmap image, which is called a frame, continuously appears on the display.
  • the frame rate is the number of pictures taken by the camera per second, and these pictures are played continuously to form a dynamic video.
  • Registration refers to the matching of the geographic coordinates of different image patterns obtained by different imaging methods in the same area, including geometric correction, projection transformation, and unified scale processing.
  • Geometric registration is the operation of combining images (data) of the same area obtained at different times, in different wavebands, and by different remote sensor systems through geometric transformation, so that corresponding image points are completely superimposed in position and orientation.
  • YUV is a color encoding method, often used in various video processing components. When encoding photos or videos, YUV takes human perception into account and allows the bandwidth of the chroma channels to be reduced. YUV is a family of true-color color spaces; terms such as Y'UV, YUV, YCbCr, and YPbPr, which overlap with one another, can all be referred to as YUV. "Y" denotes luminance (luma), and "U" and "V" denote chrominance (chroma).
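To make the luma/chroma split concrete, here is the standard BT.601 full-range RGB-to-YUV formula (a well-known conversion given only for illustration; the patent does not specify which YUV variant it uses):

```python
def rgb_to_yuv(r, g, b):
    """BT.601 full-range conversion: Y carries luminance, U and V carry chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

# Pure white has full luma and zero chroma:
y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)
```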
  • RGB is a color encoding method.
  • A method of encoding colors is collectively referred to as a "color space" (or "color gamut").
  • In RGB, each color can be represented by three variables: the intensities of R (red), G (green), and B (blue).
  • the manufacturer combines video media shooting to produce new high-resolution, high-contrast, and higher-bit-depth film sources, combined with high-bit-depth display panel hardware, to achieve outstanding picture quality.
  • the manufacturer adopts an automatic mode to adjust the image quality, and remaps the brightness, color temperature, and contrast curves through a preset image quality mode to achieve the effect of image enhancement.
  • However, the tone mapping curve of the traditional method only involves modifying a limited number of index parameters such as contrast, saturation, and brightness, adjusted on a global scale, which differs considerably in effect from deep-learning-based frame-by-frame, pixel-by-pixel image enhancement. Moreover, the coverage of its preset scenes is extremely limited, and it is difficult to guarantee the accuracy of automatically identifying scene types, all of which make the effects of traditional video enhancement methods unsatisfactory.
  • HDR mode usually uses a multi-frame long and short time exposure sequence for synthesis. Long exposure is required when the light is dark, and short exposure is required when the light is bright.
  • This type of method fuses three frames with long, medium, and short exposure times, or two consecutive frames with short exposure times, or six medium-exposure images; short-exposure images are used in high-brightness areas and long-exposure images in low-brightness areas, so that the fused image better retains detail, thereby increasing the dynamic range of the image.
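A toy version of the fusion rule just described, assuming normalized [0, 1] brightness (the soft-mask weighting scheme and the threshold are our illustrative choices, not the patent's):

```python
import numpy as np

# Prefer the short exposure in bright regions and the long exposure in dark
# regions, blended by a soft mask derived from the short exposure's brightness.
# This blending rule is an assumption for illustration only.
def fuse_exposures(short, long_, threshold=0.5):
    w = np.clip((short - threshold) / (1.0 - threshold), 0.0, 1.0)
    return w * short + (1.0 - w) * long_

bright = fuse_exposures(np.array([1.0]), np.array([0.8]))  # bright pixel: short exposure wins
dark = fuse_exposures(np.array([0.1]), np.array([0.6]))    # dark pixel: long exposure wins
```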
  • the traditional deep learning enhancement method requires all pixels to participate in feature extraction, which will bring a huge amount of computation.
  • On large-screen devices, the performance of the processor (CPU) and the graphics card (GPU) is relatively weak. If the HDRNet method is simply used to enhance each frame of the video, the inter-frame information in the video stream cannot be exploited, and the requirements of real-time playback cannot be met.
  • Therefore, based on the deep-learning single-frame image reconstruction network HDRNet, this application designs a heterogeneous multi-stage pipeline processing structure and multiplexes enhancement coefficients using multi-frame portrait information.
  • This solves the prior-art problem of heavy single-core computation and realizes high dynamic range (HDR) image enhancement that is free of film-source dependence and automatically adapts to multiple scenes.
  • FIG. 1 is a schematic diagram of the architecture of an image quality enhancement system provided by an embodiment of the present application.
  • The following exemplarily enumerates the system architecture to which an image quality enhancement method of this application applies. As shown in FIG. 1, the system architecture involved in an embodiment of this application includes a hardware decoder 101, one or more processors 102, a hardware encoder 103, and a display panel 104. Among them:
  • the hardware decoder 101 is a device that inputs an analog video/audio signal and converts it into a digital signal format for further compression and transmission.
  • In this embodiment, the input video stream can be decoded into multiple frames of original low dynamic range (LDR) images.
  • The processor 102 may include one or more processing units. For example, the processor 102 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a general-purpose processing unit (central processing unit, CPU), a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). The different processing units may be independent devices or may be integrated in one or more processors.
  • In this embodiment, the processor can first down-sample the target frame image in the target video and convert it into a bilateral grid carrying image affine transformation information; then use the BGU interpolation method to obtain the full-resolution affine transformation coefficients and apply them to the original image, finally obtaining an enhanced high dynamic range (HDR) image.
  • the processor 102 may include two arithmetic units, a CPU and a GPU.
  • the CPU is the final execution unit of information processing and program operation.
  • The general-purpose processing unit CPU may generally be a reduced instruction set (Advanced RISC Machines, ARM) series processor, which may include one or more core processing units.
  • the general processing unit CPU may be used to down-sample the target frame image in the target video to obtain a low-resolution image; perform feature extraction on the low-resolution image to obtain the low-resolution image A corresponding target bilateral grid, where the target bilateral grid includes affine transformation information corresponding to the low-resolution image.
  • Specifically, the original image is first down-sampled; convolutional down-sampling can then be performed several times according to preset requirements; global and local features are learned separately and, after being combined, are converted into a bilateral grid.
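The initial down-sampling step might be as simple as average pooling; a sketch under that assumption (the patent does not specify the down-sampling method, and the names are illustrative):

```python
import numpy as np

# Average-pooling down-sample as a stand-in for the unspecified down-sampling
# step. H and W must be divisible by the factor.
def downsample(img, factor):
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

small = downsample(np.ones((4, 4, 3)), 2)    # halves each spatial dimension
```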
  • the GPU can be used to obtain full-resolution enhanced images, which can be implemented on the GPU using the OpenCL parallel programming language to obtain highly optimized performance.
  • the GPU makes the graphics card less dependent on the CPU and performs part of the original CPU work.
  • the GPU may be used to obtain the up-sampled affine transformation matrix of the low-resolution image through the bilateral guided up-sampling BGU interpolation method, according to the target bilateral grid and the affine transformation information;
  • according to the affine transformation matrix, the image quality of each pixel in the target frame image is enhanced to obtain an enhanced target frame image, wherein the affine transformation matrix includes affine transformation coefficients, and the affine transformation coefficients are used to enhance the image quality of the target frame image.
  • the obtained affine transformation coefficients are applied to the original image through an affine transformation to obtain the output image, which is the enhanced high-resolution image.
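The final step, applying per-pixel affine coefficients to the original image, can be sketched in NumPy; the patent performs this on the GPU with OpenCL, and the 3x4 per-pixel matrix layout used here is an assumption consistent with common bilateral-grid enhancement pipelines:

```python
import numpy as np

def apply_affine(image, coeffs):
    """Apply a per-pixel 3x4 affine color transform: for each pixel,
    output_rgb = A @ [r, g, b, 1].

    image:  (H, W, 3) float array
    coeffs: (H, W, 3, 4) per-pixel affine matrices
    """
    h, w, _ = image.shape
    homo = np.concatenate([image, np.ones((h, w, 1))], axis=-1)  # append 1
    return np.einsum('hwij,hwj->hwi', coeffs, homo)

img = np.random.rand(6, 6, 3)
identity = np.zeros((6, 6, 3, 4))
identity[..., :3, :3] = np.eye(3)   # A = [I | 0] leaves colors unchanged
out = apply_affine(img, identity)
```

Identity matrices reproduce the input exactly, which is a convenient sanity check before substituting real enhancement coefficients.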
  • the hardware encoder 103 converts encoded data files into analog video/audio signals.
  • the hardware decoder 101 and the hardware encoder 103 may be integrated on one device, and may be referred to as a codec.
  • the display panel 104 is used to display the analog video/audio signal output by the hardware encoder 103.
  • the received video stream is decoded by the hardware decoder 101 to obtain image frames, which are original low dynamic range (LDR) images; the original LDR images are processed by the processor 102 to generate corresponding high dynamic range (HDR) enhanced images, which are sent to the hardware encoder 103, and the enhanced video is finally displayed on the display panel 104.
  • the processor 102 adopts a heterogeneous multi-stage pipeline processing structure that parallelizes the network processing time, which can increase the real-time processing frame rate and save resources.
  • the image quality enhancement system architecture in FIG. 1 is only an exemplary implementation in the embodiments of the present application; the image quality enhancement system architecture in the embodiments of the present application includes but is not limited to the above architecture.
  • the image quality enhancement system architecture in the embodiments of the present application can be applied to electronic devices that have a large-scale display or can project video images onto one, for example, large street-side display screens, screen projectors, projectors, and TV screens.
  • the image quality enhancement system architecture illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device. In other embodiments of the present application, the electronic device may include more or fewer components than those shown in the figure, or combine certain components, or split certain components, or arrange different components.
  • the components shown in the image quality enhancement system architecture can be implemented in hardware, software, or a combination of software and hardware.
  • FIG. 2A is a schematic structural diagram of an image quality enhancement device provided by an embodiment of the present application.
  • the image quality enhancement device 10 is equivalent to the processor 102 shown in FIG. 1 described above.
  • the image quality enhancement device may include two computing units, a general-purpose processing unit CPU and an image processing unit GPU.
  • the CPU includes a preprocessing module 301, a low-resolution coefficient inference module 302, and a dimming module 304;
  • the GPU includes: a full-resolution BGU module 303.
  • the preprocessing module 301 is specifically used for frame format conversion (for example, the frame format of the target frame image can be converted from YUV to RGB), and down-sampling the image in the target video to obtain a low-resolution image.
  • the low-resolution coefficient inference module 302 is configured to perform feature extraction on the low-resolution image to obtain a target bilateral grid corresponding to the low-resolution image, where the target bilateral grid includes affine transformation information corresponding to the low-resolution image; that is, features are extracted from the image through a feature extraction network (for example, TFLite, MNN, etc.), then global features and local features are learned separately and, after being combined, are converted into a bilateral grid to obtain the target bilateral grid corresponding to the low-resolution image.
  • the extracted features include global features and local features, where global features include illumination, brightness, etc., and local features include contrast, color protection, semantic features, and so on.
  • the low-resolution coefficient inference module 302 can also use the portrait region positioning information obtained by the low-resolution network to perform portrait tracking between each frame image in the target video and its corresponding reference frame image.
  • the dimming module 304 is specifically used for dimming and balancing the original image according to the full-resolution image quality enhancement map obtained by the full-resolution BGU module 303, for smoothing dimming between frames, for maintaining portrait area information, for updating the affine transformation coefficient template of non-portrait areas, and so on.
  • the full-resolution BGU module 303 is configured to obtain the up-sampled affine transformation matrix of the low-resolution image through the bilateral guided up-sampling BGU interpolation method according to the target bilateral grid and the affine transformation information, where the affine transformation matrix includes affine transformation coefficients used to enhance the image quality of the target frame image; according to the affine transformation matrix, the image quality of each pixel in the target frame image is enhanced to obtain an enhanced target frame image.
  • the full-resolution BGU module 303 is implemented on the GPU using the OpenCL parallel programming language to obtain a highly optimized affine transformation matrix; it is sensitive to edges and has a relatively high sharpness, which improves the effect of image quality enhancement.
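A simplified CPU illustration of the BGU slicing step (trilinear interpolation of grid-cell coefficients at full resolution, guided by luminance) might look as follows; the real module runs this in OpenCL on the GPU, and the 12-channel (3x4 affine) cell layout is an assumption:

```python
import numpy as np

def slice_grid(grid, guide):
    """Per-pixel trilinear lookup into a bilateral grid.

    grid:  (gh, gw, gd, 12) array of per-cell 3x4 affine coefficients
    guide: (H, W) luminance map in [0, 1]; it steers the intensity axis
    returns (H, W, 12) per-pixel coefficients
    """
    gh, gw, gd, nc = grid.shape
    h, w = guide.shape
    ys = np.clip((np.arange(h) + 0.5) / h * gh - 0.5, 0, gh - 1)
    xs = np.clip((np.arange(w) + 0.5) / w * gw - 0.5, 0, gw - 1)
    yy, xx = np.meshgrid(ys, xs, indexing='ij')       # (H, W) grid coords
    zz = np.clip(guide * gd - 0.5, 0, gd - 1)

    y0 = yy.astype(int); y1 = np.minimum(y0 + 1, gh - 1)
    x0 = xx.astype(int); x1 = np.minimum(x0 + 1, gw - 1)
    z0 = zz.astype(int); z1 = np.minimum(z0 + 1, gd - 1)
    fy, fx, fz = yy - y0, xx - x0, zz - z0

    out = np.zeros((h, w, nc))
    # accumulate the 8 corner contributions of trilinear interpolation
    for yi, wy in ((y0, 1 - fy), (y1, fy)):
        for xi, wx in ((x0, 1 - fx), (x1, fx)):
            for zi, wz in ((z0, 1 - fz), (z1, fz)):
                out += (wy * wx * wz)[..., None] * grid[yi, xi, zi]
    return out

grid = np.full((4, 4, 8, 12), 0.25)   # constant grid for a sanity check
coeffs = slice_grid(grid, np.random.rand(16, 16))
```

Because the interpolation weights at each pixel sum to one, a constant grid must yield constant per-pixel coefficients.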
  • the affine transformation coefficient includes an affine transformation coefficient for a portrait area and an affine transformation coefficient for a non-portrait area.
  • the focus of the video picture of a large-screen product is mostly the portrait area, while the focus on the landscape, blurred background and other areas is low.
  • the persistence effect of human vision can be used to divide the affine transformation coefficients contained in the affine transformation matrix into portrait-area affine transformation coefficients and non-portrait-area affine transformation coefficients, so that the image quality enhancement device can apply different degrees of enhancement to the image quality of different areas; for example, multiplexing the affine transformation coefficients of the non-portrait area can further reduce the amount of calculation and improve real-time performance.
  • the GPU is further configured to receive the target bilateral grid sent by the CPU; the CPU is further configured to, after the GPU has received the target bilateral grid sent by the CPU, perform down-sampling and feature extraction on the next frame image following the target frame image in the target video.
  • unlike the serial processing scheme (that is, the target video processed by a single CPU or GPU), the embodiment of the present application does not require all the image enhancement steps to be executed in sequence by the CPU or GPU within the frame-rate time; it is only necessary to ensure that the longest-running step (such as obtaining the affine transformation matrix) is executed by the CPU or GPU within the frame-rate time to achieve real-time enhancement of the video stream.
  • FIG. 2B is a schematic flowchart of processing a target video by an image quality enhancement apparatus according to an embodiment of the present application.
  • the target video passes through a heterogeneous multi-stage pipeline, and the image quality enhancement device sequentially performs image quality enhancement processing on each frame of the target video; after the preprocessing module 301 completes the preprocessing (down-sampling and feature extraction) of the nth frame image, it can immediately preprocess the (n+1)th frame image, so that each step (such as down-sampling, feature extraction, obtaining the affine transformation matrix, and dimming) is completed by the CPU or GPU within the frame-rate time. This video image quality enhancement solution makes full use of the computing performance of the CPU and GPU at the same time, achieving real-time enhancement of the video stream.
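The overlap described above, where preprocessing starts on frame n+1 as soon as frame n has been handed off, is the classic two-stage producer/consumer pipeline. A minimal Python sketch with threads and a bounded queue (the stage names and toy stage functions are illustrative, not the patented implementation):

```python
import queue
import threading

def run_pipeline(frames, preprocess, enhance):
    """Two-stage pipeline: while stage 2 (the 'GPU') enhances frame n,
    stage 1 (the 'CPU') is already preprocessing frame n+1."""
    q = queue.Queue(maxsize=2)   # bounded hand-off buffer between stages
    results = []

    def stage1():                # 'CPU': down-sample / feature extraction
        for f in frames:
            q.put(preprocess(f))
        q.put(None)              # sentinel: no more frames

    def stage2():                # 'GPU': BGU + per-pixel enhancement
        while True:
            item = q.get()
            if item is None:
                break
            results.append(enhance(item))

    t1 = threading.Thread(target=stage1)
    t2 = threading.Thread(target=stage2)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

out = run_pipeline([1, 2, 3], lambda f: f * 10, lambda f: f + 1)
```

The queue preserves frame order, so the enhanced frames come out in the order they went in even though the two stages run concurrently.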
  • the GPU is specifically configured to: according to the target bilateral grid and the affine transformation information, obtain through the BGU interpolation method the target portrait-region affine transformation coefficients corresponding to the portrait region in the low-resolution image; geometrically register the non-portrait area in the target frame image with the non-portrait area in the reference frame image to obtain first registration information, where the first registration information is used to indicate the similarity between the non-portrait region of the target frame image and the non-portrait region of the reference frame image; in the case that the first registration information is less than a preset threshold, obtain through the BGU interpolation method the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image, and update the target frame image to be the reference frame image of the next frame image; and obtain the affine transformation matrix according to the target portrait-region affine transformation coefficients and the target non-portrait-region affine transformation coefficients.
  • the image quality enhancement device may obtain through the BGU interpolation method the target portrait-region affine transformation coefficients corresponding to the portrait region in the low-resolution image; the non-portrait area in the target frame image and the non-portrait area in the reference frame image are then geometrically registered, and a determination is made based on the result.
  • FIG. 2C is a schematic diagram of the affine transformation coefficient multiplexing process of a reference frame image provided by an embodiment of the present application. As shown in FIG. 2C, portrait positioning and/or tracking is performed between the reference frame image and the current frame image, the geometric positions of the reference frame image and the current frame image are registered according to the positioning information, and the obtained registration information determines whether the reference frame image can be applied to the current frame image.
  • otherwise, the affine transformation coefficients of the non-portrait region in the target frame image need to be regenerated according to the BGU method; in either case, the affine transformation coefficients of the portrait region of the current frame image are re-obtained according to the BGU method based on the bounding box of the portrait.
  • the affine transformation coefficients of the portrait area and the non-portrait area are combined to obtain an affine transformation matrix, which is applied to the original image pixel by pixel, and the enhanced image is output.
  • the current frame image can be used as the reference frame image of the next frame image, thereby multiplexing the inter-frame images and enhancing the utilization of inter-frame information.
  • the first frame image may not have a reference frame image, or it may be a pre-stored image, which is not specifically limited in the embodiment of the present application.
  • the GPU is further configured to: in the case that the first registration information is greater than or equal to the preset threshold, obtain the non-portrait-region affine transformation coefficients of the reference frame image and use them as the target non-portrait-region affine transformation coefficients of the target frame image. That is, if the first registration information is greater than or equal to the preset threshold, the non-portrait-region affine transformation coefficients of the reference frame image can be applied to the target frame image.
  • the image quality enhancement device can thus use the non-portrait-region affine transformation coefficients of the reference frame image as the target non-portrait-region affine transformation coefficients of the target frame image to enhance its image quality.
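The reuse decision above can be sketched as follows. Here the "first registration information" is modeled as a crude similarity score (1 minus the mean absolute difference of the two background regions, for images in [0, 1]); real geometric registration is more involved, and the threshold value is purely illustrative:

```python
import numpy as np

def select_bg_coeffs(cur_bg, ref_bg, ref_coeffs, compute_coeffs, threshold=0.95):
    """Reuse the reference frame's background (non-portrait) affine
    coefficients when the two backgrounds register closely enough.

    registration >= threshold -> reuse ref_coeffs
    registration <  threshold -> recompute via BGU (compute_coeffs);
                                 the current frame then becomes the next
                                 reference frame.
    Returns (coefficients, reused_flag).
    """
    registration = 1.0 - float(np.mean(np.abs(cur_bg - ref_bg)))
    if registration >= threshold:
        return ref_coeffs, True
    return compute_coeffs(cur_bg), False

static = np.full((8, 8), 0.5)   # unchanged background
moved = np.zeros((8, 8))        # very different background
reused_coeffs, reused = select_bg_coeffs(static, static, "ref", lambda bg: "new")
fresh_coeffs, fresh = select_bg_coeffs(moved, static, "ref", lambda bg: "new")
```

An unchanged background reuses the reference coefficients; a strongly changed one triggers recomputation, matching the threshold logic in the text.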
  • as shown in FIG. 2C, the portrait positioning information obtained by the low-resolution network is first used to track between the reference frame and the current frame; on the one hand, the original BGU method is still used in the portrait area to obtain the full-resolution enhancement coefficients of that area; on the other hand, geometric position registration is performed in the non-portrait area, and the registration information is compared with the threshold to determine whether the reference frame is available.
  • the enhancement coefficients of the background area of the current frame will be generated according to the subsequent background-area coefficient multiplexing strategy.
  • finally, the enhancement coefficients of the portrait area and the background area are combined and applied to the original image pixel by pixel, and the enhanced image is output, which saves computing resources and improves the efficiency of image quality enhancement.
  • the GPU is further configured to: in the case that the first registration information is greater than or equal to the preset threshold, obtain the non-portrait-region affine transformation coefficients of the reference frame image and obtain the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image; and determine that, among all the pixels in the non-portrait region of the target frame image, the affine transformation coefficients corresponding to first pixels are the non-portrait-region affine transformation coefficients of the reference frame image and the affine transformation coefficients corresponding to second pixels are the target non-portrait-region affine transformation coefficients, wherein the first pixels and the second pixels are distributed at intervals.
  • in this way, the image quality enhancement device uses the non-portrait-region affine transformation coefficients of the reference frame image as part of the target non-portrait-region affine transformation coefficients of the target frame image, while the remaining coefficients are generated by the BGU method, which greatly reduces the amount of calculation for generating affine transformation coefficients in the non-portrait region while ensuring that the degradation of the enhancement effect in this area stays within the range acceptable to the human eye.
  • FIG. 2D is a schematic diagram of the distribution of first pixel points and second pixel points in a non-portrait area provided by an embodiment of the present application. As shown in FIG. 2D, the positions of the first and second pixels in the Nth frame image can differ from their positions in the (N+1)th frame image, which keeps the degradation of the enhancement effect in this area within the range acceptable to the human eye while also improving the efficiency of video quality enhancement.
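One simple interval pattern satisfying the description above is a checkerboard whose phase flips with the frame index, so every background pixel receives a freshly computed coefficient at least every second frame. The checkerboard choice itself is an assumption; the patent only requires the first and second pixels to be distributed at intervals:

```python
import numpy as np

def interleaved_coeffs(ref_coeffs, new_coeffs, frame_index):
    """Interleave reused and freshly computed background coefficients.

    'First' pixels keep the reference frame's coefficients, 'second'
    pixels take the newly generated ones; the checkerboard phase flips
    with the frame index so positions alternate between frames.
    """
    h, w = ref_coeffs.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    second = (yy + xx + frame_index) % 2 == 0   # positions getting new coeffs
    return np.where(second[..., None], new_coeffs, ref_coeffs)

ref = np.zeros((2, 2, 1))   # stand-in for reused reference coefficients
new = np.ones((2, 2, 1))    # stand-in for freshly computed coefficients
f0 = interleaved_coeffs(ref, new, 0)
f1 = interleaved_coeffs(ref, new, 1)
```

In each frame exactly half the positions take new coefficients, and the two consecutive phases are complementary, so every position is refreshed within two frames.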
  • the GPU is specifically configured to: according to the target bilateral grid and the affine transformation information, obtain through the BGU interpolation method the target portrait-region affine transformation coefficients corresponding to the portrait area in the low-resolution image; geometrically register the non-portrait region in the target frame image with the non-portrait region in the previous frame image to obtain second registration information, the second registration information being used to indicate the similarity between the non-portrait area of the target frame image and that of the previous frame image; in the case that the second registration information is greater than or equal to a preset threshold, obtain the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image, and determine that, in every five pixels of the non-portrait region, the affine transformation coefficient corresponding to the third pixel is the target non-portrait-region affine transformation coefficient while the affine transformation coefficients corresponding to the other four pixels are the non-portrait-region affine transformation coefficients corresponding to the previous frame image; wherein every five pixels are distributed in a cross shape with the third pixel at the center, and the position of the third pixel corresponding to the next frame image differs from that of the third pixel corresponding to the target frame image.
  • FIG. 2E is a schematic diagram of a third pixel point distribution position in a non-portrait area according to an embodiment of the present application. As shown in FIG. 2E, every five pixel points are distributed in a cross shape, and the third pixel point is at the center of the cross shape.
  • the third pixels of the Nth frame image (equivalent to the target frame image) update their coefficients, with the 4-connected neighborhood (up, down, left, and right) of each third pixel as its visual radiation range.
  • if the coefficient-update pixels of the non-portrait area are polled in directions such as horizontal, vertical, and diagonal movement, the visual radiation ranges of the update pixels can be closely tiled, and the third pixels form a cycle across consecutive frames; that is, as shown in FIG. 2E, the position of the third pixel corresponding to the (N+1)th frame differs from that corresponding to the Nth frame, so all affine transformation coefficients of the non-portrait region can be updated over the four frames from the Nth frame to the (N+3)th frame.
  • in this way, the amount of calculation generated by the affine transformation coefficient BGU of the non-portrait area can be compressed to 1/5 of the original, while the degradation of the enhancement effect in this area remains acceptable to the human eye.
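The cross (quincunx) layout above can be expressed as a mask over pixel coordinates: centers satisfying (2y + x) ≡ phase (mod 5) form exactly such a pattern, since each center's 4-connected neighbours cover the remaining four residues. A NumPy sketch (the specific congruence is an illustrative choice, not taken from the patent):

```python
import numpy as np

def third_pixel_mask(h, w, frame_index):
    """Mask of 'third' (center) pixels refreshed in this frame.

    Centers form a quincunx lattice: for a center with value c = 2y + x
    (mod 5), its up/down neighbours have value c -/+ 2 and its left/right
    neighbours c -/+ 1, so the center plus its 4-connected neighbourhood
    covers all five residues and the radiation ranges tile the plane.
    Cycling the phase walks the centers across positions frame by frame.
    """
    yy, xx = np.mgrid[0:h, 0:w]
    return (2 * yy + xx) % 5 == (frame_index % 5)

masks = [third_pixel_mask(10, 10, n) for n in range(5)]
frac = masks[0].mean()                        # fraction refreshed per frame
covered = sum(m.astype(int) for m in masks)   # coverage over a full cycle
```

Each frame refreshes exactly 1/5 of the coefficients, matching the 1/5 computation figure in the text, and over a full cycle of phases every position is refreshed exactly once.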
  • the image quality enhancement device can thus simultaneously utilize the computing performance of the general-purpose processing unit CPU and the image processing unit GPU, enhancing the image quality of each frame of the target video through a parallel heterogeneous multi-stage processing scheme (that is, the CPU and GPU process the target video at the same time).
  • the image quality enhancement device reasonably distributes the network structure of the image quality enhancement solution across the heterogeneous multi-stage pipeline of the CPU and GPU; compared with separate CPU or GPU serial processing, this achieves a lower single-frame processing delay and shortens the time needed for real-time enhancement of the video stream.
  • the bilateral grid can be used to accelerate the image operation operators: each frame of the target video is first down-sampled and converted into a bilateral grid carrying image affine transformation information; full-resolution affine transformation coefficients are then obtained through the BGU interpolation method and applied to the original image, finally yielding an enhanced high-resolution image, which compresses the amount of calculation for video enhancement.
  • image quality enhancement device in FIG. 2A is only an exemplary implementation in the embodiment of the present application, and the image quality enhancement device in the embodiment of the present application includes but is not limited to the above image quality enhancement device.
  • FIG. 3 is a schematic flowchart of an image quality enhancement method provided by an embodiment of the present application.
  • the method can be applied to the structure of the image quality enhancement device described in FIG. 2A and to the image quality enhancement system architecture described in FIG. 1, wherein the image quality enhancement device can be used to support and execute steps S301 to S305 of the method flow shown in FIG. 3.
  • Step S301 down-sampling the target frame image in the target video by the general processing unit CPU to obtain a low-resolution image.
  • the image quality enhancement device down-samples the target frame image in the target video through the general processing unit CPU to obtain a low-resolution image.
  • Step S302 Perform feature extraction on the low-resolution image by the CPU, and obtain a target bilateral grid corresponding to the low-resolution image.
  • the image quality enhancement device performs feature extraction on the low-resolution image through the CPU and obtains a target bilateral grid corresponding to the low-resolution image, where the target bilateral grid includes the affine transformation information corresponding to the low-resolution image.
  • Step S303 Receive the target bilateral grid sent by the CPU through the image processing unit GPU.
  • the image quality enhancement apparatus receives the target bilateral grid sent by the CPU through the GPU.
  • after the GPU receives the target bilateral grid sent by the CPU, the image quality enhancement apparatus uses the CPU to perform down-sampling and feature extraction on the next frame image following the target frame image in the target video.
  • Step S304 According to the target bilateral grid and affine transformation information, the GPU is used to obtain the affine transformation matrix after the up-sampling of the low-resolution image through the bilateral guided up-sampling BGU interpolation method.
  • the image quality enhancement device obtains the up-sampled affine transformation matrix of the low-resolution image through the bilateral guided up-sampling BGU interpolation method through the image processing unit GPU according to the target bilateral grid and the affine transformation information.
  • the affine transformation matrix includes affine transformation coefficients, and the affine transformation coefficients are used to enhance the image quality of the target frame image.
  • the affine transformation coefficients include a portrait area affine transformation coefficient and a non-portrait area affine transformation coefficient.
  • the image quality enhancement device obtains, according to the target bilateral grid and the affine transformation information, the target portrait-region affine transformation coefficients corresponding to the portrait region in the low-resolution image through the BGU interpolation method; it geometrically registers the non-portrait area in the target frame image with the non-portrait area in the reference frame image to obtain first registration information, where the first registration information is used to indicate the similarity between the non-portrait region of the target frame image and the non-portrait region of the reference frame image; in the case that the first registration information is less than the preset threshold, the BGU interpolation method is used to obtain the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image, and the target frame image is updated to be the reference frame image of the next frame image; the affine transformation matrix is then obtained according to the target portrait-region affine transformation coefficients and the target non-portrait-region affine transformation coefficients.
  • or, in the case that the first registration information is greater than or equal to the preset threshold, the image quality enhancement apparatus obtains the non-portrait-region affine transformation coefficients of the reference frame image through the GPU and uses them as the target non-portrait-region affine transformation coefficients of the target frame image.
  • alternatively, the image quality enhancement apparatus obtains the non-portrait-region affine transformation coefficients of the reference frame image through the GPU; the GPU also obtains the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image; the GPU then determines that, among all the pixels in the non-portrait region of the target frame image, the affine transformation coefficients corresponding to first pixels are the non-portrait-region affine transformation coefficients of the reference frame image and the affine transformation coefficients corresponding to second pixels are the target non-portrait-region affine transformation coefficients, wherein the first pixels and the second pixels are distributed at intervals.
  • alternatively, the image quality enhancement device obtains, according to the target bilateral grid and the affine transformation information, the target portrait-region affine transformation coefficients corresponding to the portrait region in the low-resolution image through the BGU interpolation method; it geometrically registers the non-portrait area in the target frame image with the non-portrait area in the previous frame image to obtain second registration information, which indicates the similarity between the non-portrait region of the target frame image and that of the previous frame image; in the case that the second registration information is greater than or equal to the preset threshold, it obtains the target non-portrait-region affine transformation coefficients corresponding to the non-portrait region in the low-resolution image, and determines that, in every five pixels of the non-portrait region of the target frame image, the affine transformation coefficient corresponding to the third pixel is the target non-portrait-region affine transformation coefficient while the affine transformation coefficients corresponding to the other four pixels are the non-portrait-region affine transformation coefficients corresponding to the previous frame image.
  • Step S305 The image quality of each pixel in the target frame image is enhanced by the GPU according to the affine transformation matrix to obtain an enhanced target frame image.
  • the image quality enhancement device enhances the image quality of each pixel in the target frame image according to the affine transformation matrix through the GPU to obtain an enhanced target frame image.
  • step S301 to step S305 in the embodiment of the present application can also refer to the relevant descriptions of the above-mentioned respective embodiments of FIG. 2A to FIG. 2E, which will not be repeated here.
  • through the image quality enhancement device, the computing performance of the general-purpose processing unit CPU and the image processing unit GPU can be utilized simultaneously, and the image quality of each frame of the target video is enhanced through a parallel heterogeneous multi-stage processing scheme (that is, the CPU and GPU process the target video at the same time).
  • the image quality enhancement device reasonably distributes the network structure of the image quality enhancement solution across the heterogeneous multi-stage pipeline of the CPU and GPU; compared with separate CPU or GPU serial processing, this achieves a lower single-frame processing delay and shortens the time needed for real-time enhancement of the video stream.
  • the bilateral grid can be used to accelerate the image operation operators: each frame of the target video is first down-sampled and converted into a bilateral grid carrying image affine transformation information; full-resolution affine transformation coefficients are then obtained through the BGU interpolation method and applied to the original image, finally yielding an enhanced high-resolution image, which compresses the amount of calculation for video enhancement.
  • in order to realize the above functions, each network element, such as an electronic device or a processor, includes a hardware structure and/or software module corresponding to each function.
  • this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
  • the embodiments of the present application can divide the functional modules of electronic equipment, camera equipment, etc. according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • the embodiment of the present application also provides a computer-readable storage medium. All or part of the procedures in the foregoing method embodiments may be completed by a computer program instructing relevant hardware.
  • the program may be stored in the foregoing computer storage medium. When the program is executed, it may include the procedures of the foregoing method embodiments.
  • the computer-readable storage medium includes: read-only memory (ROM) or random access memory (RAM), magnetic disks or optical disks and other media that can store program codes.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, they can be realized in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the modules in the device of the embodiment of the present application may be combined, divided, and deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

本申请公开了一种画质增强装置及方法,具体可以应用于终端人工智能(Artificial Intelligence,AI)领域以及对应的子领域,该画质增强装置包括通用处理单元CPU和图像处理单元GPU;CPU用于对目标视频内的目标帧图像进行下采样,获得低分辨率图像;对低分辨率图像进行特征提取,获取低分辨率图像对应的目标双边网格;GPU用于根据目标双边网格和目标双边网格包括的仿射变换信息,通过双边引导上采样BGU插值方法获取低分辨率图像上采样后的仿射变换矩阵;根据仿射变换矩阵,对目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。实施本申请实施例,可以增强大屏上的视频画质。

Description

一种画质增强装置及相关方法
本申请要求于2020年4月22日提交中国专利局、申请号为202010322718.8、申请名称为“一种画质增强装置及相关方法”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及终端人工智能领域,尤其涉及一种画质增强装置及相关方法。
背景技术
在现有技术中,受限于拍摄设备中图像传感器的像元尺寸、响应性能,成像的动态范围要远小于真实场景中的动态范围;另外,为了方便网络传输,往往会对视频进行量化、压缩处理,使得最终呈现在大屏设备上的视频质量下降明显。
目前提升大屏画面质量的方法众多,一方面厂商可以联合视频媒体拍摄制作高分辨率、高对比度、更高位深的新片源,结合高位深显示面板硬件,从而取得出众的画质效果;另一方面可以采用自动模式调整画质,通过预置的画质模式,对亮度、色温、对比度曲线进行重新映射,达到画面增强的效果。
然而,高位深新片源的制作,需要与其相兼容的显示面板支持,这会带来额外的硬件成本,并且经典的老片源重制难度极高,基本上无法获得增强。而采用自动模式调整画质时,对运动物体,尽管采用相关的匹配算法进行补偿,仍然会出现伪像使得合成图片的质量难以接受。
因此,如何增强大屏上的视频画质,是亟待解决的问题。
发明内容
本申请实施例提供了一种画质增强装置及相关方法,以提升视频中的图像画质。
第一方面,本申请提供了一种画质增强装置,其特征在于,所述画质增强装置包括通用处理单元CPU和图像处理单元GPU;其中,所述CPU用于:对目标视频内的目标帧图像进行下采样,获得低分辨率图像;对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息;所述GPU用于:根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质;根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
在本申请实施例中,可以通过画质增强装置,同时利用通用处理单元CPU和图像处理单元GPU的运算性能,通过并行异构多级处理(即,CPU与GPU同时分工处理目标视频)的画质增强方案,对目标视频的每一帧图像进行画质增强。其次,该画质增强装置将画质增强方案的网络结构合理分配在CPU与GPU异构的多级流水线上执行,相比于单独的CPU或者GPU串行处理,获得了更低的单帧处理时延,使得视频流的实时增强的时间变短。而且可以利用双边网格实现图像操作算子的加速,即将目标视频中每一帧图像先通过下采样,并转换成为一个带有图像仿射变换信息的双边网格;随后通过BGU插值方法获得全分辨的仿射变换系数,进而应用在原始图像中,最终获得增强后的高分辨率图像,压缩了视频增强的运算量,从而能够快速高效地增强大屏上的视频画质。
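上述“下采样→双边网格→BGU上采样→逐像素仿射变换”的整体流程,可以用下面的简化示意说明(其中下采样方式、网格系数取值均为假设,并以最近邻上采样代替真实的双边引导插值,并非HDRNet/BGU的实际实现):

```python
import numpy as np

def downsample(img, factor):
    # 区域平均下采样(示意, 实际可用带步长的卷积下采样)
    h, w = img.shape[:2]
    return img[:h // factor * factor, :w // factor * factor].reshape(
        h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

def upsample_nearest(coeff, factor):
    # 最近邻上采样仿射系数(示意, 实际BGU为双边引导的插值)
    return np.repeat(np.repeat(coeff, factor, axis=0), factor, axis=1)

def enhance(img, factor=4, gain=1.2, bias=0.05):
    low = downsample(img, factor)            # 低分辨率图像(此处仅示意流程)
    a = np.full(low.shape, gain)             # 每个网格单元的乘性仿射系数(假设值)
    b = np.full(low.shape, bias)             # 加性仿射系数(假设值)
    a_full = upsample_nearest(a, factor)     # “上采样后的仿射变换矩阵”
    b_full = upsample_nearest(b, factor)
    h, w = a_full.shape[:2]
    out = a_full * img[:h, :w] + b_full      # 逐像素应用仿射变换
    return np.clip(out, 0.0, 1.0)
```

例如,对一张像素值均为0.5的8×8图像调用enhance,每个像素得到0.5×1.2+0.05=0.65。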
在一种可能的实现方式中,所述GPU还用于:接收所述CPU发送的所述目标双边网格;所述CPU还用于:在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。相比于串行处理方案(即,只有单独一个CPU或GPU处理的目标视频),本申请实施例无需所有的图像增强步骤均在帧率时间内通过CPU或GPU按顺序运行完成,只需要保证运行耗时最长的步骤(如:获得仿射变换矩阵)在帧率时间内通过CPU或GPU执行完毕,即可达到视频流的实时增强的目的。例如:在GPU对目标帧图像进行处理时,CPU同步对目标帧图像的下一帧图像进行处理,大大缩短了画质增强的处理时间。
在一种可能的实现方式中,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。在本申请实施例中,例如:大屏产品的视频画面焦点,大多是人像区域,而对于风景、虚化背景等区域的关注度较低,因此可以利用人类视觉的暂留效应,将仿射变换矩阵包含的仿射变换系数分为人像区域仿射变换系数和非人像区域仿射变换系数,以便画质增强装置针对不同区域的画质进行不同程度的增强处理,例如:由于大屏产品的视频画面焦点是人像区域,所以利用人类视觉的暂留效应,对非人像区域仿射变换系数进行复用,可以进一步减少运算量、提升实时性能。
在一种可能的实现方式中,所述GPU具体用于:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。在本申请实施例中,画质增强装置为了确定参考帧图像的非人像区域是否可以应用到目标帧图像中,可以将参考帧图像与目标帧图像进行非人像区域的几何位置的配准,获得配准后的配准信息;将该配准信息与预设阈值对比后,确定参考帧是否可用;如果配准信息小于预设阈值,则参考帧不可用,即,目标帧对应的参考帧失效,无法为目标帧提供非人像区域的仿射变换系数,此时则需要重新按照所述BGU方法生成目标帧图像中非人像区域的仿射变换系数;最后,人像区域和背景区域的增强系数合并,并逐像素应用于原始图像中,输出增强之后的图像。而且,该目标帧图像可以作为下一帧图像的参考帧图像,进行帧间图像的复用。
在一种可能的实现方式中,所述GPU还用于:在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。在本申请实施例中,若第一配准信息大于或等于预设阈值,则参考帧图像的非人像区域的 仿射变换系数可以应用到目标帧图像中,因此,图像增强装置可以将参考帧图像的非人像区域仿射变换系数作为目标帧图像的目标非人像区域仿射变换系数对目标帧图像的画质增强,节省了运算空间,提高了画质增强的效率。
在一种可能的实现方式中,所述GPU还用于:在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像的非人像区域仿射变换系数;获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。在本申请实施例中,若第一配准信息大于或等于预设阈值,则参考帧图像的非人像区域的仿射变换系数可以应用到目标帧图像中,因此,图像增强装置可以将参考帧图像的非人像区域仿射变换系数作为目标帧图像内一部分的目标非人像区域仿射变换系数,剩余一部分的仿射变换系数通过所述BGU方法生成,大大压缩了非人像区域内的仿射变换系数生成的运算量,并且保证该区域增强效果的退化程度在人眼可接受范围内。
在一种可能的实现方式中,所述GPU具体用于:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。在本申请实施例中,为了保证非人像区域的增强效果,画质增强装置提供了一种非人像区域的仿射变换系数轮询复用策略。目标帧更新系数的像素点,以4连通区域为它的视觉辐射范围,那么更新系数像素点的视觉辐射范围可以紧密排布;随着时序帧的进行,非人像区域的仿射变换系数更新像素点可以按照水平移动、垂直移动、对角线移动等等方向进行轮询,那么每4个前后帧间可形成循环。按照以上非人像区域的仿射变换系数的更新策略,可以将非人像区域的仿射变换系数BGU生成的运算量压缩至原来的1/5,并且保证该区域增强效果的退化程度在人眼可接受范围内。
第二方面,本申请提供了一种画质增强方法,包括:
通过通用处理单元CPU对目标视频内的目标帧图像进行下采样,获得低分辨率图像;
通过所述CPU对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息;
通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质;
通过所述GPU根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
在一种可能的实现方式中,所述方法还包括:通过所述GPU接收所述CPU发送的所述目标双边网格;通过所述CPU在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。
在一种可能的实现方式中,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。
在一种可能的实现方式中,所述通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,包括:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。
在一种可能的实现方式中,所述方法还包括:在所述第一配准信息大于或等于所述预设阈值的情况下,通过所述GPU获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。
在一种可能的实现方式中,所述方法还包括:在所述第一配准信息大于或等于所述预设阈值的情况下,通过所述GPU获取所述参考帧图像的非人像区域仿射变换系数;通过所述GPU获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;通过所述GPU确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。
在一种可能的实现方式中,所述通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,包括:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。
第三方面,本申请实施例提供了一种计算机可读存储介质,包括计算机指令,当该计算机指令在电子设备上运行时,使得该电子设备执行本申请实施例第二方面或第二方面的任意一种实现方式提供的画质增强方法。
第四方面,本申请实施例提供了一种计算机程序产品,当该计算机程序产品在电子设备上运行时,使得该电子设备执行本申请实施例第二方面或第二方面的任意一种实现方式提供的画质增强方法。
第五方面,本申请提供了一种芯片系统,该芯片系统包括处理器,用于支持网络设备实现上述第一方面中所涉及的功能。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器,用于保存数据发送设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件。
可以理解地,上述提供的第二方面、第三方面、第四方面以及第五方面的有益效果可参考第一方面所提供的画质增强装置中的有益效果,此处不再赘述。
附图说明
图1是本申请实施例提供的一种画质增强系统架构示意图;
图2A是本申请实施例提供的一种画质增强装置的结构示意图;
图2B是本申请实施例提供的一种画质增强装置处理目标视频的流程示意图;
图2C是本申请实施例提供的一种参考帧图像的仿射变换系数复用流程示意图;
图2D是本申请实施例提供的一种非人像区域内第一像素点和第二像素点分布示意图;
图2E是本申请实施例提供的一种非人像区域内第三像素点分布位置示意图;
图3是本申请实施例提供的一种画质增强方法的流程示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例进行描述。
本申请的说明书和权利要求书及所述附图中的术语“第一”、“第二”和“第三”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。
在本文中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本申请的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式 地理解的是,本文所描述的实施例可以与其它实施例相结合。
在本说明书中使用的术语“部件”、“模块”、“系统”等用于表示计算机相关的实体、硬件、固件、硬件和软件的组合、软件、或执行中的软件。例如,部件可以是但不限于,在处理器上运行的进程、处理器、对象、可执行文件、执行线程、程序和/或计算机。通过图示,在计算设备上运行的应用和计算设备都可以是部件。一个或多个部件可驻留在进程和/或执行线程中,部件可位于一个计算机上和/或分布在2个或更多个计算机之间。此外,这些部件可从在上面存储有各种数据结构的各种计算机可读介质执行。部件可例如根据具有一个或多个数据分组(例如来自与本地系统、分布式系统和/或网络间的另一部件交互的二个部件的数据,例如通过信号与其它系统交互的互联网)的信号通过本地和/或远程进程来通信。
首先,对本申请中的部分用语进行解释说明,以便于本领域技术人员理解。
(1)高动态范围成像(High-Dynamic Range,HDR),在计算机图形学与电影摄影术中,是用来实现比普通数位图像技术更大曝光动态范围(即更大的明暗差别)的一组技术。相比普通的图像,HDR图像可以提供更大的动态范围和更多的图像细节:根据不同曝光时间的LDR(Low-Dynamic Range,低动态范围)图像,利用各曝光时间下细节最佳的LDR图像来合成最终的HDR图像,能够更好地反映出真实环境中的视觉效果。
(2)位深:表示在进行图像的调色板颜色量化时,当每种颜色使用一组二进制数值描述,最终需要的用于描述所有颜色的二进制位数。这并不意味着图像一定会使用所有的这些颜色,而是可以指定颜色的精度级别。对于灰度图像,位深可以量化其灰度级。具有较高位深的图像可以编码更多的阴影或颜色,因为其具有更多的0和1可用组合。例如八位深,就是2的8次方,即一个像素点可以显示256种色彩。
(3)帧率(Frame rate)是以帧为单位的位图图像连续出现在显示器上的频率(速率)。帧率就是摄影机每秒所拍摄图片的数量,这些图片连续播放便形成动态视频。
(4)配准(registration)是指同一区域内以不同成像手段所获得的不同图像图形的地理坐标的匹配。包括几何纠正、投影变换与统一比例尺三方面的处理。
(5)几何配准(Geometric Registration),将不同时间、不同波段、不同遥感器系统所获得的同一地区的图像(数据),经几何变换使同名像点在位置上和方位上完全叠合的操作。
(6)YUV,是一种颜色编码方法。常使用在各个视频处理组件中。YUV在对照片或视频编码时,考虑到人类的感知能力,允许降低色度的带宽。YUV是编译true-color颜色空间(color space)的种类,Y'UV,YUV,YCbCr,YPbPr等专有名词都可以称为YUV,彼此有重叠。“Y”表示明亮度(Luminance、Luma),“U”和“V”则是色度、浓度(Chrominance、Chroma)。
(7)RGB,是一种颜色编码方法。对一种颜色进行编码的方法统称为“颜色空间”或“色域”。采用RGB这种编码方法,每种颜色都可用三个变量来表示:R红色(red)、G绿色(green)以及B蓝色(blue)的强度。
其次,为了便于理解本申请实施例,以下具体分析本申请实施例所需要解决的技术问题以及对应的应用场景。随着深度学习在图像领域取得巨大成功,众多大屏厂商将AI画质优化作为重点宣传特性。其中,逐帧、像素级的画质优化成为技术发展的主流方向。示例性的,在进行提升大屏画面质量的过程中,大多数都采用以下几种方式:
现有技术一,生产商联合视频媒体拍摄制作高分辨率、高对比度、更高位深的新片源,结合高位深显示面板硬件,从而取得出众的画质效果。
然而,高位深新片源的制作,需要与其相兼容的显示面板支持,这会带来额外的硬件成本,并且经典的老片源重制难度极高,基本上无法获得增强。
现有技术二,生产商采用自动模式调整画质,通过预置的画质模式,对亮度、色温、对比度曲线进行重新映射,达到画面增强的效果。
然而,传统方法的色调映射曲线,只涉及到对比度、饱和度、亮度等有限个数的指标参数修改,并且是在全局范围内进行调整,在效果上与深度学习的逐帧、逐像素图像增强方法相差较大;并且其预置场景的覆盖范围也极为有限,同时自动识别场景类型的正确率又难以保证,这些都导致传统视频增强方法效果不够理想。
现有技术三,采用高动态范围(HDR)模式调整画质。在数码相机和手机等设备上,HDR模式通常利用多帧长短时曝光序列进行合成的方法。在光线暗的时候需要长曝光,亮的时候需要短曝光。这类方法使用前后三帧的长、中、短曝光时长的图像,或者前后两帧长、短曝光时长的图像、或者前后六张中度曝光的图像进行融合,在高亮度区域采用短时曝光图像,而在低亮度区域利用长时曝光图像,融合之后图像能够较好地保留细节,从而提升图像的动态范围。
然而,对于静止图像,可直接将过曝和欠曝等多种情况下的图像加权合成HDR图片;但是运动物体,尽管采用相关的匹配算法进行补偿,仍然会出现伪像使得合成图片的质量难以接受。在视频增强领域,也几乎不可能获得多曝光情况下的同一段视频,所以该方法目前还不具备应用条件。
现有技术四,基于深度学习的单帧图像重构HDR图像的技术中,HDRCNN、DPEU、HDRNet等方法都取得了较好的效果。
然而,对于720P甚至更高分辨率的视频,传统的深度学习增强方法需要全部像素点参与特征提取,会带来海量的运算量。在实际的大屏产品中的处理器(CPU)和显卡(GPU)性能较为羸弱,如果简单地对视频每一帧采用HDRNet方法增强,将无法利用视频流中的帧间信息,也无法满足实时处理播放的需求。
因此,针对上述技术问题,本申请在基于深度学习的单帧图像重构HDRNet网络结构的基础上,设计了一种异构多级流水线的处理结构,利用多帧人像信息进行增强系数的复用。解决了现有技术中单核运算量大的问题,实现无需片源依赖、多场景下自动识别的高动态范围(HDR)图像增强。
为了便于理解本申请实施例,下面先对本申请实施例所基于的其中一种画质增强系统架构进行描述。请参考附图1,图1是本申请实施例提供的一种画质增强系统架构示意图。下面示例性列举本申请中一种画质增强方法所应用的系统架构,如图1所示,图1示出了本申请实施例涉及的系统架构,包括硬件解码器101、一个或多个处理器102、硬件编码器103以及显示面板104。其中,
硬件解码器101,是一种输入模拟视频/音频信号并将它转换为数字信号格式,以进一步压缩和传输的设备。在本申请实施例中,可以将输入的视频流解码为多帧原始低动态范围(LDR)图像。
处理器102可以包括一个或多个处理单元,例如:处理器102可以包括应用处理器(application processor,AP),调制解调处理器,图形处理单元(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),通用处理单元(central processing unit,CPU),存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。例如:处理器可以先将目标视频中的目标帧图像通过下采样,并转换成为一个带有图像仿射变换信息的双边网格;随后通过BGU插值方法获得全分辨的仿射变换系数,进而应用在原始图像中,最终获得增强后的高动态范围(HDR)增强图像。
示例性的,在本申请实施例中,处理器102可以包括CPU和GPU两个运算单元。CPU作为系统的运算和控制核心,是信息处理、程序运行的最终执行单元。在移动终端领域,通用处理单元CPU,一般可以包括精简指令集处理器(Advanced RISC Machines,ARM)系列,可包括1个或多个核心处理单元。在本申请实施例中,通用处理单元CPU可以用于对目标视频内的目标帧图像进行下采样,获得低分辨率图像;对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息。例如:先将原始图像做下采样,可以根据预设需求进行几次卷积下采样,然后分别学习其全局特征和局部特征,并且结合之后再将其转换到双边网格之中。而GPU可以用于获得全分辨率增强图,在GPU上利用OpenCL并行化编程语言实现,获得高优化的性能。GPU使显卡减少了对CPU的依赖,并进行部分原本CPU的工作。例如:GPU可以用于:根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵;根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像,其中,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质。例如:对输入的原始图像做仿射变换得到引导图,并将其用来引导前面的双边网格做空间和颜色深度上的插值,恢复到和原来图像一样大小,获得全分辨的仿射变换系数。又例如:将得到的这个仿射变换系数对原图像做仿射变换,得到输出图像,输出图像为获得增强后的高分辨率图像。
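作为参考,下面给出“用引导图在双边网格中切片取出逐像素系数”这一步的极简示意(以最近邻索引代替BGU的三线性插值,且网格中每格仅存一个标量系数,均为说明性假设,并非实际实现):

```python
import numpy as np

def slice_grid(grid, guide):
    # grid: (gh, gw, gd) 的双边网格, 每个格子存一个标量系数(示意)
    # guide: 取值在[0,1]的全分辨率引导图, 决定第三维(亮度维)坐标
    gh, gw, gd = grid.shape
    h, w = guide.shape
    ys = (np.arange(h) * (gh - 1) / max(h - 1, 1)).round().astype(int)
    xs = (np.arange(w) * (gw - 1) / max(w - 1, 1)).round().astype(int)
    zs = (guide * (gd - 1)).round().astype(int)   # 由引导图得到亮度维索引
    return grid[ys[:, None], xs[None, :], zs]     # 逐像素取出仿射系数
```

真实的BGU会在空间两维和亮度维上做平滑插值,此处的最近邻索引只用于说明“引导图决定网格坐标”这一思路。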
硬件编码器103,是把数据编码文件转为模拟视频/音频信号的过程。其中,硬件解码器101和硬件编码器103可以集成在一个设备上,可称为编解码器。
显示面板104,用于显示硬件编码器103输出的模拟视频/音频信号。
本申请实施例通过硬件解码器101将接收到的视频流,解码获得图像帧,其中获得的图像帧为原始低动态范围(LDR)图像,原始LDR图像通过处理器102进行处理,生成对应的高动态范围(HDR)增强图像,输送到硬件编码器103,将增强视频最终显示在显示面板104上。其中处理器102采用异构的多级流水线处理结构,对网络处理时间并行计算,可以提高实时处理的帧率,节约资源。
需要说明的是,图1中的画质增强系统架构只是本申请实施例中的一种示例性的实施方式,本申请实施例中的画质增强系统架构包括但不仅限于以上画质增强系统架构。
还需要说明的是,本申请实施例中的画质增强系统架构可以应用在如:街边大型显示屏、屏幕投影仪、放映仪、电视显示屏等拥有大型显示器或可以将视频图像投影至大型显示器的电子设备中。而且本申请实施例示意的画质增强系统架构并不构成对该电子设备的具体限定。在本申请另一些实施例中,电子设备可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。画质增强系统架构图示的部件可以以硬件,软件或软件和硬件的组合实现。
基于上述提供的一种画质增强系统架构,本申请实施例提供一种应用于上述画质增强系统架构中的画质增强装置,请参见图2A,图2A是本申请实施例提供的一种画质增强装置的结构示意图,该画质增强装置10相当于上述图1所示的处理器102。其中,在画质增强装置中,可以包括通用处理单元CPU和图像处理单元GPU两种运算单元。其中,CPU中包括预处理模块301、低分辨率系数推理模块302和调光模块304;GPU中包括:全分辨率BGU模块303。
其中,预处理模块301具体用于帧格式转换(例如,可以将目标帧图像的帧格式从YUV转换到RGB)、以及对目标视频内的图像下采样获得低分辨率图像。
低分辨率系数推理模块302用于对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息,即,通过特征提取网络(例如:Tflite、多层神经网络(Multilayer Neural Network,MNN)等)对图像进行特征提取。然后分别学习其全局特征和局部特征,并且将其结合之后再将其转换到双边网格之中,获取所述低分辨率图像对应的目标双边网格。需要说明的是,提取的特征包括全局特征和局部特征,其中,全局特征包括光照、亮度等等,局部特征包括对比度、色彩保护度、语义特征等等。还需要说明的是,所述低分辨率系数推理模块302还可以利用低分辨率网络获得的人像区域定位信息,在目标视频内的每一帧图像以及每一帧对应的参考帧图像之间进行跟踪。
调光模块304具体用于根据全分辨率BGU模块303获得的全分辨率画质增强图对原图像进行调光及平衡,帧间调光平顺、以及人像区域信息维护、非人像区域的仿射变换系数更新模板等等。
其中,全分辨率BGU模块303用于:根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质;根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。例如:对输入的原始图像做仿射变换得到引导图,并将其用来引导前面的双边网格做空间和颜色深度上的插值,恢复到和原来目标帧图像一样大小,获得全分辨的仿射变换系数。最后将得到的这个仿射变换系数对原图像做仿射变换,得到输出图像,输出图像为获得增强后的全分辨率画质增强图。例如,全分辨率BGU模块303在GPU上利用OpenCL并行化编程语言实现,获得高优化的仿射变换矩阵;对于边缘敏感,锐度比较高,提高了画质增强的效果。
其中,可选的,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。在本申请实施例中,例如:大屏产品的视频画面焦点,大多是人像区域,而对于风景、虚化背景等区域的关注度较低,因此可以利用人类视觉的暂留效应,将仿射变换矩阵包含的仿射变换系数分为人像区域仿射变换系数和非人像区域仿射变换系数,以便画质增强装置针对不同区域的画质进行不同程度的增强处理,例如:对非人像区域仿射变换系数进行复用,可以进一步减少运算量、提升实时性能。
在一种可能的实现方式中,所述GPU还用于:接收所述CPU发送的所述目标双边网格;所述CPU还用于:在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。相比于串行处理方案(即,只有单独一个CPU或GPU处理的目标视频)或者所有的画质增强步骤都在帧率时间内完成的方案,本申请实施例无需所有的图像增强步骤均在帧率时间内通过CPU或GPU按顺序运行完成,只需要保证运行耗时最长的步骤(如:获得仿射变换矩阵)在帧率时间内通过CPU或GPU执行完毕,即可达到视频流的实时增强的目的。例如:在GPU对目标帧图像进行处理时,CPU同步对目标帧图像的下一帧图像进行画质增强的预处理,如:下采样、特征提取等,大大缩短了画质增强的处理时间。例如:请参考附图2B,图2B是本申请实施例提供的一种画质增强装置处理目标视频的流程示意图。如图2B所示:目标视频通过异构多级流水线,画质增强装置依次对目标视频内每一帧图像进行画质增强处理,其中,当预处理模块301对第n帧图像完成预处理(下采样和特征提取)后,可以预处理模块301接着对第n+1帧图像进行预处理,使得每一个步骤(如:下采样、特征提取、获得仿射变换矩阵、调光等等)均在帧率时间内通过CPU或GPU执行完毕,在该视频画质增强方案得以同时充分利用CPU和GPU的运算性能,达到了视频流的实时增强的目的。
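图2B所示“CPU预处理第n+1帧、GPU同时增强第n帧”的两级流水线思想,可以用Python线程与队列粗略示意(preprocess对应CPU侧的下采样/特征提取,enhance_gpu对应GPU侧的增强步骤;这只是对并行结构的假设性草图,并非实际CPU/GPU异构实现):

```python
import queue
import threading

def pipeline(frames, preprocess, enhance_gpu):
    # 两级流水线: CPU线程持续预处理后续帧, 主线程("GPU")同时增强已预处理的帧
    q, out = queue.Queue(maxsize=2), []

    def cpu_stage():
        for f in frames:
            q.put(preprocess(f))   # 第n帧交给下一级后, 立即预处理第n+1帧
        q.put(None)                # 结束标记

    t = threading.Thread(target=cpu_stage)
    t.start()
    while (item := q.get()) is not None:
        out.append(enhance_gpu(item))
    t.join()
    return out
```

队列容量限制使两级之间形成背压,单帧总时延由耗时最长的一级决定,而不是各级耗时之和。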
可选的,所述GPU具体用于:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。画质增强装置为了确定参考帧图像的非人像区域是否可以应用到目标帧图像中,可以通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,再确定。例如:当目标帧图像为当前帧图像时,请参考附图2C,图2C是本申请实施例提供的一种参考帧图像的仿射变换系数复用流程示意图。如图2C所示,首先对于参考帧图像和当前帧图像之间进行人像定位和/或跟踪,根据该定位信息将参考帧图像与当前帧图像进行非人像区域的几何位置的配准,获得配准后的配准信息,判断该参考帧图像是否可以应用到当前帧图像中。将该配准信息与预设阈值对比后,确定配准信息小于预设阈值,参考帧不可用,即,当前帧图像对应的参考帧失效,无法为当前帧图像提供非人像区域的仿射变换系数,则此时需要重新按照所述BGU方法生成目标帧图像中非人像区域的仿射变换系数;其中,当前帧图像的人像区域仿射变换系数按照人像外接框根据BGU方法重新获取,最后,将人像区域和非人像区域的仿射变换系数合并获得仿射变换矩阵,并逐像素应用于原始图像中,输出增强之后的图像。而且,当前帧图像对应的参考帧不可用时,该当前帧图像可以作为下一帧图像的参考帧图像,从而进行帧间图像的复用,加强帧间信息的利用。需要说明的是,第一帧图像可能没有参考帧图像,也可能是预先存储的图像,本申请实施例对此不作具体限定。
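上述“将第一配准信息(相似度)与预设阈值比较,决定复用参考帧系数还是重新BGU生成”的判定逻辑,可以概括为如下假设性的示意(sim、ref_coeffs、compute_bgu等名称均为说明用的假设参数):

```python
def choose_background_coeffs(sim, threshold, ref_coeffs, compute_bgu):
    # sim: 第一配准信息(目标帧与参考帧非人像区域的相似度)
    # threshold: 预设阈值; compute_bgu: 重新按BGU方法生成系数的回调
    if sim >= threshold:
        return ref_coeffs, False   # 参考帧可用: 复用其非人像区域仿射变换系数
    coeffs = compute_bgu()         # 参考帧失效: 重新生成系数
    return coeffs, True            # True 表示需将目标帧更新为新的参考帧
```

人像区域的系数则始终按BGU方法重新获取,与上述判定并行进行。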
在一种可能的实现方式中,所述GPU还用于:在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。其中,若第一配准信息大于或等于预设阈值,则参考帧图像的非人像区域的仿射变换系数可以应用到目标帧图像中,因此,图像增强装置可以将参考帧图像的非人像区域仿射变换系数作为目标帧图像的目标非人像区域仿射变换系数对目标帧图像的画质增强。如图2C所示,首先利用低分辨率网络获得的人像定位信息,在参考帧间和当前帧间进行跟踪;一方面在人像区域仍然采用原始BGU方法获得该区域的全分辨率增强系数;另一方面,在非人像区域进行几何位置的配准,配准信息与阈值对比后确定参考帧可用,将按照之后的背景区域系数复用策略产生当前帧背景区域的增强系数,最后,人像区域和背景区域的增强系数合并,并逐像素应用于原始图像中,输出增强之后的图像,节省了运算空间,提高了画质增强的效率。
在一种可能的实现方式中,所述GPU还用于:在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像的非人像区域仿射变换系数;获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。其中,若第一配准信息大于或等于预设阈值,则参考帧图像的非人像区域的仿射变换系数可以应用到目标帧图像中,因此,图像增强装置可以将参考帧图像的非人像区域仿射变换系数作为目标帧图像内一部分的目标非人像区域仿射变换系数,剩余一部分的仿射变换系数通过所述BGU方法生成,大大压缩了非人像区域内的仿射变换系数生成的运算量,并且保证该区域增强效果的退化程度在人眼可接受范围内。可选的,下一帧图像的第一像素点和第二像素点分布的位置与所述目标帧图像的第一像素点和第二像素点分布的位置不同,可以按照水平移动、垂直移动、对角线移动等方向进行轮询。请参考附图2D,图2D是本申请实施例提供的一种非人像区域内第一像素点和第二像素点分布示意图。如图2D所示,第N帧图像内第一像素点和第二像素点分布的位置与第N+1帧图像内第一像素点和第二像素点分布的位置可以不同,以便保证该区域增强效果的退化程度在人眼可接受范围内,同时也提高了视频画质增强的效率。
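第一像素点与第二像素点的间隔分布(并如图2D所示在相邻帧间交换位置),可以用一个随帧序翻转的棋盘掩码示意(该掩码的具体形式是本说明引入的假设,专利未限定间隔分布的具体排布):

```python
import numpy as np

def interleaved_mask(h, w, frame_idx):
    # 棋盘式间隔分布: True处复用参考帧系数, False处重新BGU生成(假设约定)
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return (yy + xx + frame_idx) % 2 == 0   # 奇偶帧交换两类像素点的位置
```

相邻两帧的掩码互补,因此任一像素的系数至多隔一帧就会被重新生成一次。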
在一种可能的实现方式中,所述GPU具体用于:根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。
其中,为了保证非人像区域的增强效果,画质增强装置提供了一种非人像区域的仿射变换系数轮询复用策略。请参考附图2E,图2E是本申请实施例提供的一种非人像区域内第三像素点分布位置示意图。如图2E所示,每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心。第N帧图像(相当于目标帧图像)更新系数的像素点,以所述第三像素点的上、下、左、右4个连通区域为它的视觉辐射范围,随着时序帧的进行,非人像区域的仿射变换系数更新像素点可以按照水平移动、垂直移动、对角线移动等等方向进行轮询,那么更新系数像素点的视觉辐射范围可以紧密排布,且第三像素点在每4个前后帧间可形成循环。即,如图2E所示,第N+1帧对应的所述第三像素点与第N帧图像对应的第三像素点的位置不同,因此,由第N帧至第N+3帧四个帧间可以更新全部的非人像区域仿射变换系数。按照以上非人像区域的仿射变换系数的更新策略,可以将非人像区域的仿射变换系数BGU生成的运算量压缩至原来的1/5,并且保证该区域增强效果的退化程度在人眼可接受范围内。
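上述十字形轮询更新策略,可以用一个按帧序轮询的格点掩码近似示意:每帧只对约1/5的像素重新生成系数,其余像素复用前帧系数。这里采用 (x+2y) mod 5 的五相位格点,使每个更新点的上下左右4邻域恰好覆盖其余相位;这是对专利所述方案的一种假设性近似(专利中描述为每4个前后帧形成循环):

```python
import numpy as np

def cross_update_mask(h, w, frame_idx):
    # True处本帧重新BGU生成系数(约占1/5), False处复用前帧系数
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return ((xx + 2 * yy) % 5) == (frame_idx % 5)   # 更新点随帧序轮询移动
```

任一像素与其4邻域共覆盖全部5个相位,因此每个像素的视觉邻域内每帧都有新生成的系数,退化程度被控制在可接受范围内。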
在本申请实施例中,可以通过画质增强装置,同时利用通用处理单元CPU和图像处理单元GPU的运算性能,通过并行异构多级处理(即,CPU与GPU同时分工处理目标视频)的画质增强方案,对目标视频的每一帧图像进行画质增强。其次,该画质增强装置将画质增强方案的网络结构合理分配在CPU与GPU异构的多级流水线上执行,相比于单独的CPU或者GPU串行处理,获得了更低的单帧处理时延,使得视频流的实时增强的时间变短。而且可以利用双边网格实现图像操作算子的加速,即将目标视频中每一帧图像先通过下采样,并转换成为一个带有图像仿射变换信息的双边网格;随后通过BGU插值方法获得全分辨的仿射变换系数,进而应用在原始图像中,最终获得增强后的高分辨率图像,压缩了视频增强的运算量。
需要说明的是,图2A中的画质增强装置只是本申请实施例中的一种示例性的实施方式,本申请实施例中的画质增强装置包括但不仅限于以上画质增强装置。
基于图1提供的画质增强系统架构,以及图2A提供的画质增强装置,结合本申请中提供的画质增强方法,对本申请中提出的技术问题进行具体分析和解决。请参见图3,图3是本申请实施例提供的一种画质增强方法的流程示意图,该方法可应用于上述图2A中所述的画质增强装置的结构中,以及图1提供的画质增强系统构架,其中,画质增强装置可以用于支持并执行所述图3中所示的方法流程步骤S301-步骤S305。其中,
步骤S301:通过通用处理单元CPU对目标视频内的目标帧图像进行下采样,获得低分辨率图像。
具体地,画质增强装置通过通用处理单元CPU对目标视频内的目标帧图像进行下采样,获得低分辨率图像。
步骤S302:通过CPU对所述低分辨率图像进行特征提取,获取低分辨率图像对应的目标双边网格。
具体地,画质增强装置通过所述CPU对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息;
步骤S303:通过图像处理单元GPU接收CPU发送的目标双边网格。
具体地,画质增强装置通过所述GPU接收所述CPU发送的所述目标双边网格。
可选的,画质增强装置通过所述CPU在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。
步骤S304:通过GPU根据目标双边网格和仿射变换信息,通过双边引导上采样BGU插值方法获取低分辨率图像上采样后的仿射变换矩阵。
具体地,画质增强装置通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质。
可选的,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。
可选的,画质增强装置根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。
可选的,在所述第一配准信息大于或等于所述预设阈值的情况下,画质增强装置通过所述GPU获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。
可选的,在所述第一配准信息大于或等于所述预设阈值的情况下,画质增强装置通过所述GPU获取所述参考帧图像的非人像区域仿射变换系数;通过所述GPU获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;通过所述GPU确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。
可选的,画质增强装置根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。
步骤S305:通过GPU根据仿射变换矩阵,对目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
具体地,画质增强装置通过所述GPU根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
需要说明的是,本申请实施例步骤S301-步骤S305的相关描述还可以对应参考上述图2A-图2E各个实施例的相关描述,此处不再赘述。
实施本申请实施例,可以通过画质增强装置,同时利用通用处理单元CPU和图像处理单元GPU的运算性能,通过并行异构多级处理(即,CPU与GPU同时分工处理目标视频)的画质增强方案,对目标视频的每一帧图像进行画质增强。其次,该画质增强装置将画质增强方案的网络结构合理分配在CPU与GPU异构的多级流水线上执行,相比于单独的CPU或者GPU串行处理,获得了更低的单帧处理时延,使得视频流的实时增强的时间变短。而且可以利用双边网格实现图像操作算子的加速,即将目标视频中每一帧图像先通过下采样,并转换成为一个带有图像仿射变换信息的双边网格;随后通过BGU插值方法获得全分辨的仿射变换系数,进而应用在原始图像中,最终获得增强后的高分辨率图像,压缩了视频增强的运算量。
上述主要从电子设备实施的方法的角度对本申请实施例提供的方案进行了介绍。可以理解的是,各个网元,例如电子设备、处理器等为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的网元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对电子设备、摄像设备等进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
本申请实施例还提供了一种计算机可读存储介质。上述方法实施例中的全部或者部分流程可以由计算机程序来指令相关的硬件完成,该程序可存储于上述计算机存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。该计算机可读存储介质包括:只读存储器(read-only memory,ROM)或随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可存储程序代码的介质。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者通过所述计算机可读存储介质进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如,固态硬盘(solid state disk,SSD))等。
本申请实施例方法中的步骤可以根据实际需要进行顺序调整、合并和删减。
本申请实施例装置中的模块可以根据实际需要进行合并、划分和删减。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。

Claims (17)

  1. 一种画质增强装置,其特征在于,所述画质增强装置包括通用处理单元CPU和图像处理单元GPU;其中,
    所述CPU用于:对目标视频内的目标帧图像进行下采样,获得低分辨率图像;
    对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息;
    所述GPU用于:根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质;
    根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
  2. 根据权利要求1所述装置,其特征在于,所述GPU还用于:接收所述CPU发送的所述目标双边网格;
    所述CPU还用于:在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。
  3. 根据权利要求1或2所述装置,其特征在于,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。
  4. 根据权利要求3所述装置,其特征在于,所述GPU具体用于:
    根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;
    将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;
    在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;
    根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。
  5. 根据权利要求4所述装置,其特征在于,所述GPU还用于:
    在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。
  6. 根据权利要求4所述装置,其特征在于,所述GPU还用于:
    在所述第一配准信息大于或等于所述预设阈值的情况下,获取所述参考帧图像的非人像区域仿射变换系数;
    获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;
    确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。
  7. 根据权利要求3所述装置,其特征在于,所述GPU具体用于:
    根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;
    将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;
    在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;
    确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。
  8. 一种画质增强方法,其特征在于,包括:
    通过通用处理单元CPU对目标视频内的目标帧图像进行下采样,获得低分辨率图像;
    通过所述CPU对所述低分辨率图像进行特征提取,获取所述低分辨率图像对应的目标双边网格,所述目标双边网格包括所述低分辨率图像对应的仿射变换信息;
    通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,所述仿射变换矩阵包括仿射变换系数,所述仿射变换系数用于增强所述目标帧图像的画质;
    通过所述GPU根据所述仿射变换矩阵,对所述目标帧图像内每个像素点的画质进行增强,获得增强后的目标帧图像。
  9. 根据权利要求8所述方法,其特征在于,所述方法还包括:
    通过所述GPU接收所述CPU发送的所述目标双边网格;
    通过所述CPU在所述GPU接收所述CPU发送的所述目标双边网格后,对所述目标视频内的所述目标帧图像的下一帧图像进行下采样以及特征提取。
  10. 根据权利要求8或9所述方法,其特征在于,所述仿射变换系数包括人像区域仿射变换系数和非人像区域仿射变换系数。
  11. 根据权利要求10所述方法,其特征在于,所述通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,包括:
    根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;
    将所述目标帧图像内的非人像区域和参考帧图像内的非人像区域进行几何配准,获得第一配准信息,所述第一配准信息用于指示所述目标帧图像的非人像区域和参考帧图像的非人像区域之间的相似度;
    在所述第一配准信息小于预设阈值的情况下,通过所述BGU插值方法获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数,并将所述目标帧图像更新为所述下一帧图像的参考帧图像;
    根据所述目标人像区域仿射变换系数和所述目标非人像区域仿射变换系数,获得所述仿射变换矩阵。
  12. 根据权利要求11所述方法,其特征在于,所述方法还包括:
    在所述第一配准信息大于或等于所述预设阈值的情况下,通过所述GPU获取所述参考帧图像内的非人像区域仿射变换系数,并将所述参考帧图像的非人像区域仿射变换系数作为所述目标帧图像的所述目标非人像区域仿射变换系数。
  13. 根据权利要求11所述方法,其特征在于,所述方法还包括:
    在所述第一配准信息大于或等于所述预设阈值的情况下,通过所述GPU获取所述参考帧图像的非人像区域仿射变换系数;
    通过所述GPU获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数;
    通过所述GPU确定所述目标帧图像的所述非人像区域所有像素点中第一像素点对应的仿射变换系数为所述参考帧图像对应的非人像区域仿射变换系数,第二像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,其中,所述第一像素点与所述第二像素点间隔分布。
  14. 根据权利要求10所述方法,其特征在于,所述通过图像处理单元GPU根据所述目标双边网格和所述仿射变换信息,通过双边引导上采样BGU插值方法获取所述低分辨率图像上采样后的仿射变换矩阵,包括:
    根据所述目标双边网格和所述仿射变换信息,通过所述BGU插值方法获取所述低分辨率图像内人像区域对应的目标人像区域仿射变换系数;
将所述目标帧图像内的非人像区域和前一帧图像内的非人像区域进行几何配准,获得第二配准信息,所述第二配准信息用于指示所述目标帧图像的非人像区域和所述前一帧图像的非人像区域之间的相似度;
    在所述第二配准信息大于或等于预设阈值的情况下,获取所述低分辨率图像内的非人像区域对应的目标非人像区域仿射变换系数和前一帧图像内的非人像区域对应的非人像区域仿射变换系数;
    确定所述目标帧图像的所述非人像区域内每五个像素点中第三像素点对应的仿射变换系数为所述目标非人像区域仿射变换系数,除第三像素点外的其他四个像素点对应的仿射变换系数为所述前一帧图像对应的非人像区域仿射变换系数,其中,所述每五个像素点呈十字形分布,所述第三像素点处于所述十字形的中心,且所述下一帧图像对应的所述第三像素点与所述目标帧图像对应的第三像素点的位置不同。
  15. 一种计算机可读存储介质,其特征在于,包括计算机指令,当所述计算机指令在电子设备上运行时,使得所述电子设备执行如权利要求8-14中任一项所述的方法。
  16. 一种计算机程序,其特征在于,所述计算机程序包括指令,当所述计算机程序被计算机执行时,使得所述计算机执行如权利要求8-14中任意一项所述的方法。
  17. 一种芯片,其特征在于,所述芯片包括至少一个处理器,存储器和接口电路,所述存储器、所述接口电路和所述至少一个处理器通过线路互联,所述至少一个存储器中存储有指令;所述指令被所述处理器执行时,权利要求8-14中任意一项所述的方法得以实现。
PCT/CN2021/088171 2020-04-22 2021-04-19 一种画质增强装置及相关方法 WO2021213336A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010322718.8A CN113538211A (zh) 2020-04-22 2020-04-22 一种画质增强装置及相关方法
CN202010322718.8 2020-04-22

Publications (1)

Publication Number Publication Date
WO2021213336A1 true WO2021213336A1 (zh) 2021-10-28

Family

ID=78123976

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/088171 WO2021213336A1 (zh) 2020-04-22 2021-04-19 一种画质增强装置及相关方法

Country Status (2)

Country Link
CN (1) CN113538211A (zh)
WO (1) WO2021213336A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339412B (zh) * 2021-12-30 2024-02-27 咪咕文化科技有限公司 视频质量增强方法、移动终端、存储介质及装置
CN114501139A (zh) * 2022-03-31 2022-05-13 深圳思谋信息科技有限公司 一种视频处理方法、装置、计算机设备和存储介质
CN117437163A (zh) * 2023-10-31 2024-01-23 奕行智能科技(广州)有限公司 一种图像增强方法、图像处理芯片及图像增强视频系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170278224A1 (en) * 2016-03-22 2017-09-28 Algolux Inc. Method and system for denoising and demosaicing artifact suppression in digital images
CN109754007A (zh) * 2018-12-27 2019-05-14 武汉唐济科技有限公司 前列腺手术中外包膜智能检测和预警方法及系统
CN109961404A (zh) * 2017-12-25 2019-07-02 沈阳灵景智能科技有限公司 一种基于gpu并行计算的高清视频图像增强方法
CN110062282A (zh) * 2019-03-18 2019-07-26 北京奇艺世纪科技有限公司 一种超分辨率视频重建方法、装置及电子设备
CN110428362A (zh) * 2019-07-29 2019-11-08 深圳市商汤科技有限公司 图像hdr转换方法及装置、存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4480760B2 (ja) * 2007-12-29 2010-06-16 株式会社モルフォ 画像データ処理方法および画像処理装置
US10579908B2 (en) * 2017-12-15 2020-03-03 Google Llc Machine-learning based technique for fast image enhancement
CN110634147B (zh) * 2019-09-19 2023-06-23 延锋伟世通电子科技(上海)有限公司 基于双边引导上采样的图像抠图方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170278224A1 (en) * 2016-03-22 2017-09-28 Algolux Inc. Method and system for denoising and demosaicing artifact suppression in digital images
CN109961404A (zh) * 2017-12-25 2019-07-02 沈阳灵景智能科技有限公司 一种基于gpu并行计算的高清视频图像增强方法
CN109754007A (zh) * 2018-12-27 2019-05-14 武汉唐济科技有限公司 前列腺手术中外包膜智能检测和预警方法及系统
CN110062282A (zh) * 2019-03-18 2019-07-26 北京奇艺世纪科技有限公司 一种超分辨率视频重建方法、装置及电子设备
CN110428362A (zh) * 2019-07-29 2019-11-08 深圳市商汤科技有限公司 图像hdr转换方法及装置、存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MICHAEL GHARBI , JIAWEN CHEN, JONATHAN T BARRON, SAMUEL W HASINOFF , FREDO DURAND: "Deep bilateral learning for real-time image enhancement", ACM TRANSACTIONS ON GRAPHICS, vol. 36, no. 4, 20 July 2017 (2017-07-20), US , pages 1 - 12, XP058372892, ISSN: 0730-0301, DOI: 10.1145/3072959.3073592 *

Also Published As

Publication number Publication date
CN113538211A (zh) 2021-10-22

Similar Documents

Publication Publication Date Title
WO2021213336A1 (zh) 一种画质增强装置及相关方法
US10579908B2 (en) Machine-learning based technique for fast image enhancement
CN111491148B (zh) 显示方法
RU2589857C2 (ru) Кодирование, декодирование и представление изображений с расширенным динамическим диапазоном
JP5476793B2 (ja) 元画像のダイナミックレンジの圧縮方法と装置及びデジタルカメラ
CN113518185B (zh) 视频转换处理方法、装置、计算机可读介质及电子设备
JP7359521B2 (ja) 画像処理方法および装置
US11715184B2 (en) Backwards-compatible high dynamic range (HDR) images
WO2023010754A1 (zh) 一种图像处理方法、装置、终端设备及存储介质
JP2011193511A (ja) 高ダイナミックレンジ画像データを復号化するための装置及び方法、表示用画像を処理可能なビューア、ならびに表示装置
CN113344773B (zh) 基于多级对偶反馈的单张图片重构hdr方法
CN112508812A (zh) 图像色偏校正方法、模型训练方法、装置及设备
CN116188296A (zh) 图像优化方法及其装置、设备、介质、产品
WO2023010751A1 (zh) 图像高亮区域的信息补偿方法、装置、设备及存储介质
Zhang et al. Multi-scale-based joint super-resolution and inverse tone-mapping with data synthesis for UHD HDR video
CN112819699A (zh) 视频处理方法、装置及电子设备
CN114240767A (zh) 一种基于曝光融合的图像宽动态范围处理方法及装置
CN111696034B (zh) 图像处理方法、装置及电子设备
US20230368489A1 (en) Enhancing image data for different types of displays
WO2023110880A1 (en) Image processing methods and systems for low-light image enhancement using machine learning models
TWI235608B (en) Method and apparatus for transforming a high dynamic range image into a low dynamic range image
CN112887597A (zh) 图像处理方法及装置、计算机可读介质和电子设备
CN113365107B (zh) 视频处理方法、影视视频处理方法及装置
CN115278090B (zh) 一种基于行曝光的单帧四曝光wdr处理方法
EP4377879A1 (en) Neural networks for dynamic range conversion and display management of images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21793470

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21793470

Country of ref document: EP

Kind code of ref document: A1