WO2021077878A1 - Image processing method, apparatus, and electronic device - Google Patents

Image processing method, apparatus, and electronic device

Info

Publication number
WO2021077878A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
residual
super
mask
division
Application number
PCT/CN2020/110029
Other languages
English (en)
French (fr)
Other versions
WO2021077878A9 (zh)
Inventor
王恒铭
汪洋
季杰
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP20878773.9A (EP4036854A4)
Publication of WO2021077878A1
Publication of WO2021077878A9
Priority to US17/726,218 (US20220245765A1)

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/003 Details of a display terminal, the details relating to the control arrangement of the display terminal and to the interfaces thereto
    • G09G5/005 Adapting incoming signals to the display format of the display terminal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/39 Control of the bit-mapped memory
    • G09G5/391 Resolution modifying circuits, e.g. variable screen formats
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95 Computational photography systems, e.g. light-field imaging systems
    • H04N23/951 Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2340/00 Aspects of display data processing
    • G09G2340/04 Changes in size, position or resolution of an image
    • G09G2340/0407 Resolution change, inclusive of the use of different resolutions for different screen areas
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2370/00 Aspects of data communication
    • G09G2370/02 Networking aspects
    • G09G2370/027 Arrangements and methods specific for the display of internet documents

Definitions

  • This application relates to the field of image processing, and in particular to an image processing method, apparatus, and electronic device.
  • A super-resolution display device is a display device that can perform super-resolution (also called "super-division") processing on an image; super-resolution processing is a technology that reconstructs a low-resolution image into a high-resolution image.
  • Typically, the super-resolution display device inputs the decoded image into a super-resolution model, and the super-resolution model performs super-resolution processing on the image.
  • Such an image processing method requires a large amount of computation for the super-resolution processing, at a high computational cost.
  • The embodiments of the present application provide an image processing method, apparatus, and electronic device, which can reduce the amount of computation and the cost of current super-resolution processing.
  • The following introduces this application through different aspects; it should be understood that the implementations and beneficial effects of the different aspects below can be cross-referenced.
  • An embodiment of the present application provides an image processing method. The method includes: obtaining a residual block between a first image and its previous frame, where the residual block includes a plurality of residual points in one-to-one correspondence with the plurality of pixel positions of the first image, each residual point having a residual value; determining a target pixel area in the first image based on the residual block; performing super-resolution processing on the target pixel area in the first image to obtain a super-resolved target pixel area; and updating the other pixel areas in the first image with the corresponding pixel areas of the super-resolved previous frame, where the other pixel areas include the pixel areas of the first image other than the target pixel area. In this way, the other pixel areas need not be super-resolved, yet the effect is the same as if they had been.
  • The super-resolved first image consists of the super-resolved target pixel area and the updated other pixel areas (which are equivalent to super-resolved other pixel areas).
  • Thus, only the area in which the pixels of the first image differ from the previous frame is super-resolved, while the other pixel areas of the first image are updated from the super-resolved previous frame, achieving the same effect as super-resolving them directly. This makes full use of the temporal redundancy of video: by super-resolving only a partial area of the first image, the method achieves the effect of super-resolving the whole image, reducing the amount of computation and the computational cost of super-resolution processing, as the sketch below illustrates.
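  • To make the flow above concrete, here is a minimal Python sketch of the partial super-resolution pipeline. This is not the patented implementation: `superresolve` stands in for the super-resolution model, the target area is simplified to a single bounding rectangle, and all names are hypothetical.

```python
import numpy as np

def process_frame(frame, prev_frame, prev_sr, superresolve, scale=2):
    """Partial super-resolution of `frame`, given the previous frame and
    its already super-resolved result `prev_sr` (a sketch, not the patent's
    exact method)."""
    # Residual block: per-pixel absolute difference with the previous frame.
    residual = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    if residual.ndim == 3:                    # collapse color channels
        residual = residual.max(axis=-1)
    if not residual.any():
        # Identical content: reuse the super-resolved previous frame.
        return prev_sr.copy()
    # Target pixel area: here, the bounding box of the nonzero residuals.
    ys, xs = np.nonzero(residual)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    # Super-resolve only the target pixel area.
    sr_patch = superresolve(frame[y0:y1, x0:x1])
    # Other pixel areas come from the super-resolved previous frame.
    out = prev_sr.copy()
    out[y0 * scale:y1 * scale, x0 * scale:x1 * scale] = sr_patch
    return out
```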
  • Because the other pixel areas include the pixel areas of the first image other than the target pixel area, the sizes of the target pixel area and the other pixel areas may or may not match, and the way the super-resolved first image is assembled differs accordingly. The embodiments of this application take the following two cases as examples.
  • In the first case, the size of the target pixel area matches the size of the other pixel areas, that is, the other pixel areas are exactly the pixel areas of the first image excluding the target pixel area; correspondingly, the sizes of the super-resolved target pixel area and the updated other pixel areas also match. The super-resolved first image can then be formed by splicing the super-resolved target pixel area with the updated other pixel areas.
  • In the second case, the size of the target pixel area does not match the size of the other pixel areas, and the two overlap at their edges; that is, the other pixel areas include, in addition to the pixel areas of the first image excluding the target pixel area, part of the edge of the target pixel area.
  • Correspondingly, the sizes of the super-resolved target pixel area and the updated other pixel areas do not match, and the two overlap at their edges. Because the other pixel areas of the first image are updated from the super-resolved other pixel areas of the second image, the pixel data they contain is usually more accurate, so the pixel data of the overlapping area in the super-resolved first image is usually taken from the updated other pixel areas of the first image.
  • The super-resolved first image can then be spliced from an updated target pixel area and the updated other pixel areas, where the updated target pixel area is obtained by subtracting (also called removing) the area overlapping the other pixel areas from the super-resolved target pixel area. The updated target pixel area is thus shrunk relative to the target pixel area before the update, and its size matches the size of the updated other pixel areas.
  • Optionally, the target pixel area includes the area of the pixels in the first image whose positions correspond to first target residual points, where a first target residual point is a point in the residual block whose residual value is greater than a specified threshold. Optionally, the specified threshold is 0.
  • Optionally, the target pixel area is the area of the pixels in the first image whose positions correspond to the first target residual points and the second target residual points, where a second target residual point is a residual point around a first target residual point in the residual block.
  • The residual points around a first target residual point are the points arranged around it, that is, its peripheral points that meet a specified condition: for example, the residual points above, below, to the left of, and to the right of the first target residual point; or additionally those to its upper left, lower left, upper right, and lower right.
  • Optionally, the aforementioned specified condition is determined based on the requirements of the super-resolution processing, for example based on the receptive field in the super-resolution model (such as the receptive field of the last convolutional layer).
  • Optionally, a second target residual point is a residual point around a first target residual point whose residual value is not greater than the specified threshold; that is, among the peripheral points of a first target residual point that meet the specified condition, those whose residual value is not greater than the specified threshold.
  • When the residual values of all residual points in the residual block are 0, the content of the first image is unchanged from the previous frame, so the two super-resolved images should be identical; the super-resolved previous frame can be used to update the first image, achieving the same effect as super-resolving the first image. There is then no need to super-resolve the first image, that is, no need to perform the action of determining the target pixel area, which effectively reduces the computational cost.
  • Accordingly, determining the target pixel area in the first image may include: when the residual value of at least one residual point in the residual block is not 0, determining the target pixel area in the first image based on the residual block. That is, the action of determining the target pixel area is performed only when the residual block contains a residual point whose residual value is not 0.
  • In a first alternative, the determining of a target pixel area in the first image based on the residual block includes: generating a mask pattern based on the residual block, the mask pattern including a plurality of first mask points whose positions correspond one-to-one to the positions of the target residual points in the residual block; inputting the mask pattern and the first image into a super-resolution model; and determining, through the super-resolution model, the area of the pixels in the first image that match the positions of the first mask points as the target pixel area.
  • Correspondingly, performing super-resolution processing on the target pixel area in the first image to obtain the super-resolved target pixel area includes: performing the super-resolution processing on the target pixel area through the super-resolution model.
  • In a second alternative, the determining of a target pixel area in the first image based on the residual block includes: generating a mask pattern based on the residual block, the mask pattern including a plurality of first mask points whose positions correspond one-to-one to the target residual points in the residual block; and determining the area of the pixels in the first image that correspond to the positions of the first mask points as the target pixel area.
  • Correspondingly, performing super-resolution processing on the target pixel area includes: inputting the target pixels of the first image into a super-resolution model, and super-resolving the target pixel area through the super-resolution model to obtain the super-resolved target pixel area.
  • Optionally, generating a mask pattern based on the residual block includes: generating an initial mask pattern including a plurality of mask points in one-to-one correspondence with the pixel positions of the first image, the mask points including the plurality of first mask points and a plurality of second mask points; and assigning the mask value of each first mask point in the initial mask pattern a first value and the mask value of each second mask point a second value to obtain the mask pattern, the first value and the second value being different.
  • Correspondingly, determining the pixels in the first image corresponding to the positions of the first mask points as the target pixel area includes: traversing the mask points in the mask pattern, and determining the pixels corresponding to the mask points whose mask value is the first value as the target pixel area. A sketch of this is given below.
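  • For illustration only, a minimal sketch of this mask generation and lookup, assuming a 2-D residual array, a first value of 1, and a second value of 0 (the dilation described below would additionally turn the points around each first mask point into first mask points):

```python
import numpy as np

def make_mask(residual, specified_threshold=0):
    # Initial mask: every point starts as a second mask point (value 0);
    # points matching target residual points get the first value (1).
    mask = np.zeros(residual.shape, dtype=np.uint8)
    mask[residual > specified_threshold] = 1
    return mask

def target_pixels(mask):
    # Traverse the mask: pixels whose positions match mask points with
    # the first value form the target pixel area.
    return np.argwhere(mask == 1)
```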
  • Optionally, generating a mask pattern based on the residual block includes performing morphological processing on the residual block, where the morphological processing includes binarization of the residual block followed by dilation of the first mask points in the binarized residual block. The binarization and the dilation can be executed sequentially.
  • Binarization is a processing method that sets the pixel value of each pixel in an image to one of a first value and a second value, the two values being different; after binarization, an image contains pixels of only two values. Binarization can reduce the interference of miscellaneous elements in the image with subsequent processing.
  • The binarization in the embodiments of the present application may adopt any of the global threshold method, the local threshold method, the maximum between-class variance (Otsu) method, or the iterative threshold method; this is not limited here.
  • Dilation is a processing method for finding local maxima. When dilation is performed, the image to be processed is convolved with a preset kernel (also called a convolution kernel), and the maximum value within the kernel's coverage is assigned to a designated pixel, making bright regions brighter; the result is that the bright areas of the image to be processed expand. The kernel has a definable anchor point, usually the kernel's center point, and the designated pixel mentioned above is that anchor point.
  • Optionally, the super-resolution model includes at least one convolutional layer, and the kernel used for dilation has the same size as the receptive field of the model's last convolutional layer. The last convolutional layer is the output layer of the super-resolution model, from which the processed image is output, and its receptive field is the largest among the receptive fields of the model's convolutional layers.
  • A mask pattern obtained in this way is adapted to the size of the model's maximum receptive field and therefore provides good guidance: it avoids the situation in which the convolution area of the image input to the super-resolution model is too small to be convolved layer by layer, ensuring that images input to the model can be effectively super-resolved. For example, the kernel size may be 3×3 or 5×5 pixels.
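  • A minimal OpenCV sketch of this binarize-then-dilate processing, under two assumptions: the residual has already been reduced to a single-channel gray image, and the 3×3 kernel stands in for the receptive field of the model's last convolutional layer:

```python
import cv2
import numpy as np

def residual_to_mask(residual_gray, specified_threshold=0, kernel_size=3):
    # Binarization: residual values above the threshold become 1 (first
    # value); all others become 0 (second value).
    _, binary = cv2.threshold(residual_gray, specified_threshold, 1,
                              cv2.THRESH_BINARY)
    # Dilation: assign each anchor the maximum under the kernel, growing
    # the first mask points to cover the surrounding residual points.
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    return cv2.dilate(binary, kernel)
```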
  • Optionally, generating a mask pattern based on the residual block includes: dividing the residual block into a plurality of sub-residual blocks and performing block processing on each sub-residual block obtained by the division, where the block processing includes dividing the sub-residual block into a plurality of sub-residual blocks and performing block processing on each of them, until the residual values of all residual points in a divided sub-residual block are 0, or the total number of residual points in a divided sub-residual block is less than a point-count threshold, or the total number of times the residual block has been divided reaches a count threshold; and generating a sub-mask pattern corresponding to each target residual block, where a target residual block is a block in which the residual value of at least one residual point is not 0. The mask pattern consists of the generated sub-mask patterns.
  • Correspondingly, performing super-resolution processing on the target pixel area includes: for the target image block corresponding to each sub-mask pattern in the first image, super-resolving the sub-area of the target pixel area contained in that target image block; the super-resolved target pixel area is composed of the super-resolved sub-areas contained in the target image blocks.
  • In this way, multiple target image blocks can be filtered out of the first image and each target image block super-resolved separately. Because super-resolution is performed per target image block, and each target image block is smaller than the first image, the computational complexity and cost of super-resolution processing can be reduced. In particular, when the super-resolution is executed by a super-resolution model, the complexity of the model can be effectively reduced and the efficiency of the super-resolution operation improved.
  • Optionally, dividing the residual block into a plurality of sub-residual blocks includes dividing it in a quadtree manner; correspondingly, dividing a sub-residual block into a plurality of sub-residual blocks also proceeds in a quadtree manner.
  • Because the traditional video decoding process needs to divide an image into blocks, and that division usually adopts the quadtree method, dividing the residual block and sub-residual blocks by quadtree in the embodiments of the present application is compatible with traditional image processing. The residual block and sub-residual blocks can even be divided by the image division module used in the video decoding process, reusing modules and saving computational cost.
  • The quadtree method divides a residual block or sub-residual block into four sub-blocks of equal size at each division, so the blocks obtained are of uniform size, which is convenient for subsequent processing. A recursive sketch follows.
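  • As a rough illustration, the quadtree division with the three stopping conditions above might be sketched as follows; the point-count and depth thresholds are hypothetical values. Each yielded rectangle corresponds to a target residual block for which a sub-mask pattern would be generated.

```python
import numpy as np

def target_residual_blocks(res, y0=0, x0=0, min_points=64,
                           max_depth=4, depth=0):
    """Yield (y, x, h, w) rectangles of target residual blocks, i.e.
    blocks still containing a nonzero residual value when division stops."""
    h, w = res.shape
    if not res.any():          # all residual values are 0: drop this block
        return
    if h * w < min_points or depth >= max_depth or h < 2 or w < 2:
        yield (y0, x0, h, w)   # target residual block
        return
    hh, hw = h // 2, w // 2    # quadtree: four sub-blocks of (near) equal size
    for dy, dx, sh, sw in ((0, 0, hh, hw), (0, hw, hh, w - hw),
                           (hh, 0, h - hh, hw), (hh, hw, h - hh, w - hw)):
        yield from target_residual_blocks(res[dy:dy + sh, dx:dx + sw],
                                          y0 + dy, x0 + dx,
                                          min_points, max_depth, depth + 1)
```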
  • In a third alternative, before the other pixel areas in the first image are updated with the super-resolved other pixel areas of the previous frame, the method further includes: performing erosion on the plurality of first mask points in the mask pattern to obtain an updated mask pattern, where the kernel used for erosion has the same size as the receptive field of the last convolutional layer of the super-resolution model; determining the pixels in the first image corresponding to the positions of the eroded first mask points as auxiliary pixels; and determining the area of the pixels in the first image other than the auxiliary pixels as the other pixel areas.
  • Erosion can eliminate edge noise in an image. The other pixel areas determined from the updated mask pattern obtained by erosion have clearer edges and less noise than those obtained by the first and second alternatives, so the subsequent pixel update suffers less from blurred detail, dull edges, graininess, and noise amplification, ensuring the display effect of the finally reconstructed first image.
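  • For illustration, a sketch of the erosion step and the subsequent update of the other pixel areas, again with OpenCV. The kernel size matching the last layer's receptive field and the nearest-neighbour mask upscaling are assumptions of this sketch, and `sr_first` is assumed to be the super-resolved first image before the update:

```python
import cv2
import numpy as np

def update_other_areas(sr_first, sr_prev, mask, kernel_size=3, scale=2):
    # Erode the first mask points to clean up edge noise of the mask.
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    eroded = cv2.erode(mask, kernel)
    # Upscale the mask to the super-resolved grid (nearest keeps it binary).
    up = cv2.resize(eroded, None, fx=scale, fy=scale,
                    interpolation=cv2.INTER_NEAREST)
    # Other pixel areas (mask value 0) come from the super-resolved
    # previous frame; auxiliary pixels (mask value 1) are kept as computed.
    out = sr_first.copy()
    out[up == 0] = sr_prev[up == 0]
    return out
```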
  • Optionally, before the target pixel area is determined in the first image, the method includes: counting the first proportion, that is, the proportion of residual points in the residual block whose residual value is 0 relative to the total number of residual points in the residual block; when the first proportion is greater than a first super-resolution trigger proportion threshold, determining the target pixel area in the first image based on the residual block.
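  • This trigger condition reduces to a simple ratio check, for example (a sketch assuming a 2-D residual array; the threshold value is hypothetical):

```python
import numpy as np

def partial_sr_triggered(residual, trigger_ratio=0.5):
    # First proportion: share of residual points whose residual value is 0.
    first_proportion = np.count_nonzero(residual == 0) / residual.size
    return first_proportion > trigger_ratio
```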
  • Optionally, the super-resolution model is a CNN model, such as SRCNN or ESPCN; the super-resolution model may also be a GAN, such as SRGAN or ESRGAN.
  • An exemplary embodiment of the present application further provides an image processing apparatus, which includes one or more modules used to implement any one of the image processing methods of the first aspect.
  • An embodiment of the present application further provides an electronic device, such as a terminal. The electronic device includes a processor and a memory, where the processor usually includes a CPU and/or a GPU. The memory is used to store a computer program, and the processor is configured to implement any one of the image processing methods of the first aspect when executing the computer program stored in the memory. The CPU and GPU can be two chips or can be integrated on the same chip.
  • Embodiments of the present application further provide a storage medium, which may be non-volatile. A computer program is stored in the storage medium, and when the computer program is executed by a processor, the processor implements any one of the image processing methods of the first aspect.
  • Embodiments of the present application further provide a computer program or computer program product containing computer-readable instructions; when the computer program or computer program product runs on a computer, the computer can execute any one of the image processing methods of the first aspect. The computer program product may include one or more program units for implementing the foregoing method.
  • This application further provides a chip, such as a CPU or a GPU. In one form, the chip includes a logic circuit, which may be a programmable logic circuit; when the chip runs, it implements any one of the image processing methods of the first aspect. In another form, the chip includes one or more physical cores and a storage medium, and the one or more physical cores implement any one of the image processing methods of the first aspect after reading the computer instructions in the storage medium.
  • In summary, only the area in which the pixels of the first image differ from the previous frame is super-resolved, and the other pixel areas of the first image are updated with the super-resolved other pixel areas of the previous frame, achieving the same effect as super-resolving them directly. Temporal redundancy in the video is fully exploited: super-resolving a partial area of the first image achieves the effect of super-resolving the whole image, reducing the amount of computation and the computational cost.
  • A test video processed according to the embodiments of the present application saves about 45% of the super-resolution computation compared with directly super-resolving the entire video as in the traditional technique.
  • This significant reduction in super-resolution computation helps speed up video processing, ensuring that the video meets basic frame-rate requirements and remains real-time, with no playback lag; the reduced computation also means fewer processing tasks for, and less consumption by, the computing unit in the super-resolution display device, lowering overall power consumption and saving device power.
  • Note that the partial super-resolution algorithm proposed in the embodiments of the present application is not a method that super-resolves only part of the image and processes the rest by non-super-resolution means, sacrificing quality for efficiency. Rather, it avoids repeatedly super-resolving the temporally redundant information in areas of the video that do not change between frames; in essence, it is a method that maximizes information utilization.
  • The super-resolution model is guided to super-resolve with pixel-level precision. In the final processed video, essentially all pixel values of every frame come from super-resolution computation, so the display effect is the same as that of the traditional full super-resolution algorithm, avoiding any sacrifice of display quality.
  • In addition, the complexity of each super-resolution pass of the model is low, and the structural complexity required of the super-resolution model is correspondingly low, which simplifies the model, lowers the demands on processor performance, and improves super-resolution efficiency.
  • FIG. 1 is a schematic structural diagram of a super-resolution display device involved in an image processing method provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of an image processing method provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of pixel values of a first image provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of pixel values of a second image provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a residual block provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram illustrating the principle of the residual block shown in FIG. 5;
  • FIG. 7 is a schematic diagram of a process for determining a target pixel area in a first image according to an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a dilation processing principle provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of another dilation processing principle provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of another process for determining a target pixel area in a first image according to an embodiment of the present application;
  • FIG. 11 is a schematic diagram of an erosion processing principle provided by an embodiment of the present application;
  • FIG. 12 is a schematic diagram of the principle of updating the other pixel areas K1 in the first image with the super-resolved other pixel areas K1 in the second image, according to an embodiment of the present application;
  • FIG. 13 is a schematic diagram of the principle of an image processing method provided by an embodiment of the present application;
  • FIG. 14 is a block diagram of an image processing apparatus provided by an embodiment of the present application;
  • FIG. 15 is a block diagram of a first determining module provided by an embodiment of the present application;
  • FIG. 16 is a block diagram of another image processing apparatus provided by an embodiment of the present application;
  • FIG. 17 is a block diagram of an electronic device provided by an embodiment of the present application.
  • Image resolution: used to reflect the amount of information stored in an image, that is, the total number of pixels in the image. Image resolution is usually expressed as the number of horizontal pixels × the number of vertical pixels.
  • 1080p: a display format, where "p" stands for progressive scan. The image resolution of 1080p is usually 1920×1080.
  • 2K resolution: a display format; the corresponding image resolution is usually 2048×1152.
  • 4K resolution: a display format; the corresponding image resolution is usually 3840×2160.
  • 480p resolution: a display format; the corresponding image resolution is usually 640×480.
  • 360p resolution: a display format; the corresponding image resolution is usually 480×360.
  • Super-resolution (SR): super-resolution processing is a technology that reconstructs a low-resolution image into a high-resolution image, that is, the resolution of the reconstructed image is greater than that of the image before reconstruction. The reconstructed image is also called a super-resolved image. For example, super-resolution processing may reconstruct a 360p image into a 480p image, or a 2K image into a 4K image.
  • Color space: also called a color model or color system, used to describe the colors in an image; different color spaces correspond to different color coding formats. Two commonly used color spaces are the YUV color space and the RGB color space (each color in a color space is also called a color channel), with corresponding color coding formats YUV and RGB.
  • When the color coding format is YUV, the pixel value of a pixel includes the value of the luminance component Y and the values of the chrominance components U and V; when the color coding format is RGB, the pixel value of a pixel includes the value of the transparency component and the values of a plurality of color components, which may include a red component R, a green component G, and a blue component B.
  • CNN: convolutional neural network. Its neurons can respond to part of the surrounding units within a coverage area and can process an image according to image characteristics.
  • The basic structure of a convolutional neural network includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to the local receptive field of the previous layer, from which local features are extracted. The other is the feature mapping layer: each feature mapping layer of the network consists of multiple feature maps, and each feature map is a plane. The feature mapping layer is provided with an activation function, usually a non-linear mapping function such as the sigmoid function or the rectified linear unit (ReLU) function.
  • Convolutional neural networks are composed of a large number of interconnected nodes (also called "neurons" or "units"), each node representing a specific output function. Each connection between two nodes carries a weighted value, called a weight; different weights and activation functions lead to different network outputs.
  • A convolutional neural network includes at least one convolutional layer, and each convolutional layer includes a feature extraction layer and a feature mapping layer. When the convolutional neural network includes multiple convolutional layers connected in sequence, the receptive field is the size of the area of the original image (the image input to the convolutional neural network) onto which each pixel of the feature map output by a convolutional layer is mapped.
  • A convolutional neural network avoids complex image pre-processing (such as extracting hand-crafted features) and can take the original image directly as input for end-to-end learning. One advantage of convolutional neural networks over traditional neural networks is that traditional networks are fully connected, that is, every neuron of the input layer is connected to every neuron of the hidden layer, which leads to a huge number of parameters, making training time-consuming or even infeasible; convolutional neural networks avoid this problem through local connections and weight sharing.
  • FIG. 1 is a schematic structural diagram of a super-resolution display device 10 involved in an image processing method provided by an embodiment of the present application. The super-resolution display device 10 may be a smart TV, a smart screen, a smartphone, a tablet computer, electronic paper, a monitor, a notebook computer, a digital photo frame, a navigator, or another product or component with display and super-resolution processing functions.
  • The super-resolution display device 10 includes a processor 101, a display control module 102, and a memory 103. The processor 101 is used to process images in a video obtained from a video source and transmit the processed images to the display control module 102, the processed images being adapted to the format requirements of the display control module 102. The display control module 102 processes the received images to obtain a drive signal adapted to the display module (not marked in FIG. 1) and drives the display module to display images based on the drive signal. The memory 103 is used to store video data.
  • The processor 101 may include a central processing unit (CPU) and/or a graphics processing unit (GPU), and may be integrated on a graphics card. The display control module 102 may be a timing controller (TCON) or a microcontroller unit (MCU). The display module may be a display screen. The memory 103 may be a double data rate (DDR) dynamic random access memory. The memory 103 stores the super-resolution model 1031.
  • The processing of the images in the video obtained from the video source by the processor 101 may include: decoding the video obtained from the video source, preprocessing the decoded images (such as the subsequent step 201 or step 202), inputting the preprocessed images into the super-resolution model 1031, and super-resolving the preprocessed images through the super-resolution model 1031.
  • The super-resolution model can be a CNN model, such as a super-resolution convolutional neural network (SRCNN) or an efficient sub-pixel convolutional neural network (ESPCN); the super-resolution model can also be a generative adversarial network (GAN), such as a super-resolution generative adversarial network (SRGAN) or an enhanced super-resolution generative adversarial network (ESRGAN).
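  • For reference only, an ESPCN-style model, one of the CNN options named above, can be sketched in a few lines of PyTorch. This follows the published ESPCN layout in broad strokes and is not the model used in this application; layer widths are illustrative:

```python
import torch.nn as nn

class TinyESPCN(nn.Module):
    """Minimal ESPCN-style network: feature extraction followed by a
    sub-pixel (pixel-shuffle) layer that produces the upscaled image."""
    def __init__(self, channels=3, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=5, padding=2), nn.Tanh(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
            # Produce scale**2 sub-pixel channels per output channel.
            nn.Conv2d(32, channels * scale ** 2, kernel_size=3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)  # rearrange to the HR grid

    def forward(self, x):
        return self.shuffle(self.body(x))
```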
  • Currently, a super-resolution display device directly inputs the decoded image into the super-resolution model, and the super-resolution model super-resolves the image. Such an image processing method requires a large amount of computation at a high cost. The embodiment of the application therefore provides an image processing method with a partial super-resolution algorithm, which can reduce the computational complexity and cost of super-resolution processing.
  • The image processing method can be applied to the super-resolution display device shown in FIG. 1. A video may include multiple images; the embodiment of the present application uses a first image as an example to describe the method. The first image is a frame of the video and is a non-first frame (that is, not the first frame of the video); other non-first frames can be processed in the same way as the first image. Assume that the previous frame adjacent to the first image is the second image. As shown in FIG. 2, the method includes:
  • Step 201: The super-resolution display device obtains the inter-frame residual of the first image and the second image to obtain a residual block.
  • The inter-frame residual refers to the absolute value of the difference between the pixel values of two adjacent frames of a video; it reflects the content change (that is, the change in pixel values) between the two frames. The residual block is the result of obtaining the inter-frame residual, and its size is the same as the sizes of the first image and the second image. The residual block includes a plurality of residual points in one-to-one correspondence with the pixel positions of the first image; each residual point has a residual value, which is the absolute value of the difference between the pixel values at the corresponding positions of the first image and the second image.
  • The embodiment of the present application uses the following two methods as examples of obtaining the residual block.
  • In the first possible implementation, the inter-frame residual of the first image and the second image is calculated to obtain the residual block. The inter-frame residual is the absolute value of the difference between the pixel values of the pixels at corresponding positions of the first image and the second image. Assume that the first image is the t-th frame of the video, with t > 1; the second image is then the (t-1)-th frame.
  • In the first case, the color coding format involved is the RGB coding format. The value of the transparency component in the pixel values is usually ignored when computing the inter-frame residual, so the pixel values of the first and second images include the red component R, the green component G, and the blue component B. The inter-frame residual Residual includes Residual[R], Residual[G], and Residual[B], and satisfies:
  • Residual[R] = Absdiff(R_Frame(t-1), R_Frame(t)); (Formula 1)
  • Residual[G] = Absdiff(G_Frame(t-1), G_Frame(t)); (Formula 2)
  • Residual[B] = Absdiff(B_Frame(t-1), B_Frame(t)). (Formula 3)
  • Here Absdiff denotes calculating the absolute value of the difference between the pixel values (R, G, or B values) of the pixels at corresponding positions of the two images; Frame(t) is the first image and Frame(t-1) is the second image; R_Frame(t-1) and R_Frame(t) are the red component R values of the second and first images; G_Frame(t-1) and G_Frame(t) are the green component G values of the second and first images; and B_Frame(t-1) and B_Frame(t) are the blue component B values of the second and first images.
  • In the second case, the color coding format involved is the YUV coding format. In one implementation, the inter-frame residual Residual still includes Residual[R], Residual[G], and Residual[B], obtained as in Formulas 1 to 3 above. In another implementation, the inter-frame residual Residual is obtained from the inter-frame residuals of the three RGB color channels converted according to a certain ratio. For example, the inter-frame residual Residual satisfies:
  • Residual = Absdiff(Y1, Y2), (Formula 4)
  • where Y1 is the value of the luminance component of the first image, converted from the RGB values of the first image according to a certain ratio, and Y2 is the value of the luminance component of the second image, converted from the RGB values of the second image according to the same ratio. That ratio weights the R, G, and B values by 0.299, 0.587, and 0.114, respectively (that is, Y = 0.299R + 0.587G + 0.114B).
  • For example, suppose the first image and the second image each include 5×5 pixels and the pixel value is represented by the luminance component Y. The pixel values of the first image are shown in FIG. 3 and those of the second image in FIG. 4; the resulting residual block, shown in FIG. 5, includes 5×5 residual points, and the residual value of each residual point is the absolute value of the difference between the pixel values at the corresponding positions of the first image and the second image.
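  • Formulas 1 to 4 translate directly into array operations; a sketch with NumPy, assuming 8-bit RGB frames of shape (H, W, 3):

```python
import numpy as np

def absdiff(a, b):
    # Absolute value of the per-pixel difference (Absdiff in Formulas 1-4).
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)

def rgb_residual(frame_t, frame_t_minus_1):
    # Residual[R], Residual[G], Residual[B] per Formulas 1-3.
    return {c: absdiff(frame_t_minus_1[..., i], frame_t[..., i])
            for i, c in enumerate("RGB")}

def luma_residual(frame_t, frame_t_minus_1):
    # Formula 4: compare luminance Y = 0.299R + 0.587G + 0.114B.
    w = np.array([0.299, 0.587, 0.114])
    return np.abs(frame_t_minus_1 @ w - frame_t @ w)
```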
  • In the second possible implementation, a pre-stored inter-frame residual is obtained as the residual block. The first image acquired by the super-resolution display device is an image that has already been decoded, and the video decoding process itself involves calculating the residual between adjacent frames. Therefore, the inter-frame residuals of every two adjacent frames calculated during video decoding can be stored, and when the residual block is needed, the pre-stored inter-frame residual can be extracted directly to obtain it.
  • The standard used by the processor for video decoding may be any one of H.261 to H.265 and MPEG-4 V1 to MPEG-4 V3. H.264 is also known as Advanced Video Coding (AVC), and H.265 is also known as High Efficiency Video Coding (HEVC).
  • Take H.265 as an example. The decoding architecture of H.265 is roughly similar to that of H.264, mainly including an entropy decoding module, an intra-prediction module, an inter-prediction module, an inverse transform module, an inverse quantization module, and a loop filter module, where the loop filter module includes a deblocking module and sample adaptive offset (SAO).
  • The entropy decoding module is used to process the bitstream provided by the video source to obtain mode information and inter-frame residuals. After the entropy decoding module obtains the inter-frame residual, the residual can be stored so that it can be extracted when step 201 is executed.
  • Obtaining the residual block in this second implementation avoids recomputing the inter-frame residual, reduces the computational cost, and saves overall image processing time; especially when the video provided by the video source has a relatively high image resolution, the image processing delay can be effectively reduced.
  • Step 202: The super-resolution display device detects whether the residual values of the residual points in the residual block are 0. If the residual value of at least one residual point is not 0, step 203 is performed; if the residual values of all residual points are 0, step 207 is performed.
  • The super-resolution display device can traverse each residual point in the residual block and detect whether its residual value is 0. If at least one residual value is not 0, step 203 can be executed (the subsequent steps 203 to 206 correspond to the partial super-resolution algorithm); otherwise step 207 can be executed. For example, the super-resolution display device may traverse the residual points in a scanning order from left to right and top to bottom.
  • Step 203: The super-resolution display device determines the target pixel area in the first image based on the residual block.
  • The target pixel area is the area of the pixels in the first image whose positions correspond to the target residual points in the residual block; a point of the first image corresponding to the position of a target residual point is called a target pixel. The target residual points include two types: first target residual points and second target residual points.
  • In one optional manner, the target pixel area includes the area of the pixels in the first image corresponding to the positions of the first target residual points. In another optional manner, the target pixel area is the area of the pixels in the first image corresponding to the positions of the first target residual points and the second target residual points.
  • A first target residual point is a point in the residual block whose residual value is greater than a specified threshold; a second target residual point is a residual point around a first target residual point in the residual block. The residual points around a first target residual point are the points arranged around it, that is, its peripheral points that meet a specified condition: for example, the residual points above, below, to the left of, and to the right of the first target residual point; or additionally those to its upper left, lower left, upper right, and lower right. Optionally, the specified threshold is 0.
  • Further, a second target residual point may be a residual point around a first target residual point whose residual value is not greater than (that is, less than or equal to) the specified threshold; in other words, among the peripheral points of a first target residual point that meet the specified condition, those whose residual value is not greater than the specified threshold. When the specified threshold is 0, a second target residual point is a residual point around a first target residual point whose residual value is 0.
  • The area where the first target residual points are located is the area in which the content of the first image differs from the second image; based on this area, the area of the first image that actually needs super-resolution can be found. Because super-resolution processing usually needs to refer to the pixel values in the surrounding area of the region being processed, the peripheral area of the region that needs super-resolution must also be found in the first image. The area where the second target residual points are located is exactly that peripheral area, so determining the second target residual points determines the peripheral area of the region that actually needs super-resolution, adapting to the requirements of super-resolution processing and ensuring effective subsequent processing.
  • The aforementioned specified condition is determined based on the requirements of the super-resolution processing, for example based on the receptive field in the super-resolution model (such as the receptive field of the last convolutional layer).
  • For example, assume that a first target residual point is a point in the residual block whose residual value is greater than a specified threshold of 0, and a second target residual point is a residual point around a first target residual point whose residual value is not greater than that threshold. As shown in FIG. 6, the area where the target residual points of the residual block are located is composed of area K and area M: area K includes the first target residual point P1 and the surrounding second target residual points P7 to P14, and area M includes the first target residual points P2 to P6 and the surrounding second target residual points P15 to P19. The finally determined first target residual points are P1 to P6, the second target residual points are P7 to P19, and the target pixels determined in the first image are the pixels at the same positions as the residual points P1 to P19.
  • FIG. 6 illustrates the case in which the peripheral points meeting the specified condition are the residual points above, below, to the left, to the right, and to the upper left, lower left, upper right, and lower right of each first target residual point, but this is not a limitation.
  • Multiple methods can be used to determine the target pixel area; the embodiment of the present application uses the following two as examples. In the first method, the target pixel area is determined within the super-resolution model. The process of determining the target pixel area in the first image based on the residual block then includes:
  • Step 2031: The super-resolution display device generates a mask pattern based on the residual block.
  • The mask pattern is used to indicate the position of the target pixel area and guides the screening of the target pixel area in the first image. The mask pattern includes a plurality of first mask points whose positions correspond one-to-one to the positions of the target residual points in the residual block, where the target residual points include at least the first target residual points and optionally both the first and the second target residual points. That is, the first mask points identify the positions of the target residual points in the residual block; and because the target residual points correspond one-to-one to the target pixels of the target pixel area in the first image, the first mask points also identify the position of the target pixel area in the first image. The target pixel area containing the target pixels can thus be found through the first mask points.
  • Step 2031 is schematically illustrated by the following two optional implementations.
  • morphological transformations can be performed on the residual block to obtain a mask pattern.
  • the morphological change processing includes binarization processing and dilation processing.
  • the residual block is first binarized to obtain the binarized residual block; then the binarized residual block is expanded to obtain the expanded residual block, and the expanded residual block The residual block is used as a mask pattern.
• the binarization processing is processing in which the pixel value of each pixel in an image is set to one of a first value and a second value, where the first value and the second value are different; after binarization, the image includes only pixels with these two pixel values.
• the binarization processing can reduce the interference of various elements in the image with the subsequent image processing process.
  • the residual value of each residual point in the residual block is the absolute value of the difference between the pixel values of the two pixels, and the residual block is equivalent to the difference image of the two images. Therefore, the residual block can also be regarded as an image, the residual points contained in it are equivalent to the pixel points of the image, and the residual value of the residual point is equivalent to the pixel value of the pixel point.
• the residual value of each residual point includes the values of the three color components R, G, and B, that is, the aforementioned Residual[R], Residual[G], and Residual[B]; the residual value of a residual point can therefore be characterized by its gray value to simplify the calculation process.
• the gray value of a residual point reflects the brightness of the residual point and can be obtained by converting the R, G, and B values of the residual point, that is, its residual values such as the aforementioned Residual[R], Residual[G], and Residual[B]; the conversion process can refer to the traditional process of converting R, G, and B values into a gray value, which is not repeated in this application.
  • the gray value range of the residual point is generally 0 to 255, the gray value of the white residual point is 255, and the gray value of the black residual point is 0.
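• as an illustration only, one common such conversion is the luma formula of ITU-R BT.601; this application does not mandate a particular conversion, so the weights below are an assumption:

    # A common gray-value conversion (ITU-R BT.601 luma weights); one possible
    # choice, not one mandated by this application.
    def residual_gray(residual_r, residual_g, residual_b):
        return 0.299 * residual_r + 0.587 * residual_g + 0.114 * residual_b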
• for each residual point in the residual block, it can be judged whether the residual value of the residual point is greater than a binarization threshold (for example, the binarization threshold may be a fixed value or a variable value; when it is a variable value, a local adaptive binarization method can be used to determine the binarization threshold).
• when the residual value of a residual point is greater than the binarization threshold, the residual value of the residual point is set to the first value; when the residual value of a residual point is less than or equal to the binarization threshold, the residual value of the residual point is set to the second value.
  • the aforementioned first value is non-zero, and the second value is zero.
  • the aforementioned first value is 255 and the second value is 0; or the first value is 1 and the second value is 0.
• since the smaller the value, the smaller the storage space occupied, the first value is usually set to 1 and the second value to 0 to save storage space.
• the binarization threshold can be determined based on the aforementioned specified threshold. For example, in the RGB color space, since what is compared with the binarization threshold is the gray value of the residual point, which is converted from its residual value, the binarization threshold is converted from the specified threshold using the same conversion rule; in the YUV color space, since what is compared with the binarization threshold is the residual value of the residual point itself, the binarization threshold is equal to the specified threshold.
  • the specified threshold may be 0, and correspondingly, the binarization threshold may be 0.
  • the residual point corresponding to the first value is the aforementioned first target residual point.
• in this way, the first target residual points can be located quickly and simply.
• the binarization processing in the embodiment of the present application may adopt any one of the global binarization threshold method, the local binarization threshold method, the maximum between-class variance method, and the iterative binarization threshold method, which is not limited in the embodiment of the present application.
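• purely as an illustrative sketch (the embodiments do not prescribe an implementation language or library), the binarization of a residual block could be written with NumPy as follows, assuming the residual block is a 2-D array of gray values and using a first value of 1 and a second value of 0:

    import numpy as np

    def binarize_residual(residual, threshold=0, first_value=1, second_value=0):
        # Residual points above the binarization threshold get the first value;
        # the rest get the second value.
        return np.where(residual > threshold, first_value, second_value).astype(np.uint8)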
• dilation processing is processing for finding a local maximum.
• the image to be processed is convolved with a preset kernel, and the maximum value in the area covered by the kernel is assigned to a specified pixel, so that bright areas become brighter; the result is that the bright areas of the image to be processed are dilated.
• the kernel has a definable anchor point, which is usually the center point of the kernel, and the aforementioned specified pixel is the anchor point.
• Figure 8 assumes that the image to be processed is F1, which includes 5 × 5 pixels, where the shading represents bright spots; the kernel is the shaded part in F2, a total of 5 pixels, with the anchor point at the center point B of the 5 pixels; the final image after the dilation processing is F3.
• "*" in Figure 8 means convolution.
• in the embodiment of the present application, the image to be processed is a residual block, and performing dilation processing on the residual block refers to performing dilation processing on the first target residual points of the binarized residual block.
• when the first value is a non-zero value and the second value is 0, the first target residual points are bright spots and the other residual points (that is, those other than the first target residual points) are dark spots; otherwise, the first value can be updated to a non-zero value and the second value updated to 0 by a specified algorithm, so that the first target residual points are bright spots and the other residual points are dark spots, and the aforementioned dilation processing is then performed.
• assuming that the residual block is F1 and the dilated residual block is F3, the residual points corresponding to the diagonal shading in F1 and F3 are the first target residual points, and the residual points corresponding to the other type of shading in F3 are the second target residual points.
• assume that the kernel in FIG. 9 is F4, which is different from F2 in FIG. 8; accordingly, the final residual block F3 after the dilation processing is different, that is, the mask pattern is different.
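• as a non-authoritative sketch of the dilation step, OpenCV's morphological dilation could be applied to the binarized residual block; the kernel shapes below are assumptions chosen to mirror Fig. 8 (4-neighborhood cross) and Fig. 9 (8-neighborhood square):

    import cv2

    def dilate_mask(binary_residual, neighborhood=4):
        # A cross-shaped 3x3 kernel (5 points) gives the 4-neighborhood dilation
        # of Fig. 8; a full 3x3 kernel gives the 8-neighborhood of Fig. 9.
        shape = cv2.MORPH_CROSS if neighborhood == 4 else cv2.MORPH_RECT
        kernel = cv2.getStructuringElement(shape, (3, 3))
        # The anchor defaults to the kernel center, matching anchor point B.
        return cv2.dilate(binary_residual, kernel)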
• the super-division model includes at least one convolutional layer, and the kernel used for the dilation processing has the same size as the receptive field of the last convolutional layer of the super-division model.
• the last convolutional layer is the output layer of the super-division model, from which the image processed by the super-division model is output; the receptive field of this layer is the largest among the receptive fields of the convolutional layers in the super-division model.
• a mask pattern obtained in this way is adapted to the size of the maximum receptive field of the super-division model and thus plays a good guiding role: it avoids the situation where the convolution area of the image input to the super-division model is too small for the convolutional layers to convolve it, ensuring that the images input to the super-division model can be effectively super-divided.
• for example, the size of the kernel may be 3 × 3 or 5 × 5 pixels.
• the algorithm for finding the m-neighborhood refers to obtaining, centered on a target point, the open interval consisting of the m points adjacent to that target point.
• the dilation processing of Figure 8 is equivalent to finding the 4-neighborhood of each first target residual point in figure F1, and the dilation processing of Figure 9 is equivalent to finding the 8-neighborhood of each first target residual point in figure F1.
• in a second optional implementation manner, the above step 2031 includes: first performing binarization processing on the residual block to obtain a binarized residual block; then finding the m-neighborhood of each first target residual point in the binarized residual block, and filling the residual value of each first target residual point (that is, the aforementioned first value) into the corresponding m-neighborhood to obtain the mask pattern.
  • the process of binarization can refer to the foregoing implementation manner, and the method for obtaining the m-neighborhood can refer to related technologies, which will not be repeated in this embodiment of the application.
• optionally, the process of generating a mask pattern based on the residual block in the foregoing step 2031 can also be implemented in other ways, for example: generating, based on the residual block, an initial mask pattern including multiple mask points, where the multiple mask points correspond one-to-one with the positions of the multiple pixels of the first image, and the multiple mask points include multiple first mask points and multiple second mask points.
• the initial mask pattern is a pattern in which the first mask points and the second mask points have been determined; the mask value of each first mask point in the initial mask pattern is assigned the first value, and the mask value of each second mask point in the initial mask pattern is assigned the second value, to obtain the mask pattern, where the first value is different from the second value.
  • the mask image obtained in this way is a binary image.
• in the foregoing step 2031, the residual block can be processed directly as a whole to generate the aforementioned mask pattern; alternatively, the residual block can be divided into multiple sub-residual blocks, each sub-residual block is processed to obtain a sub-mask pattern, and the generated sub-mask patterns form the mask pattern. Since the mask pattern is generated in blocks, the computational complexity and the computational cost can be reduced.
• if the processor has strong computing power, it can execute the generation processes of multiple sub-mask patterns at the same time, saving image processing time.
  • the step of generating a mask pattern may include:
• Step A1: the residual block is divided into multiple sub-residual blocks, and block processing is performed on each sub-residual block obtained by the division.
• the block processing includes: when the residual value of at least one residual point included in a sub-residual block is not 0 (that is, the sub-residual block includes residual points whose residual values are not 0), dividing the sub-residual block into multiple sub-residual blocks, and performing the block processing on each sub-residual block obtained by the division, until the residual values of the residual points included in a divided sub-residual block are all 0 (that is, the sub-residual block includes no residual point whose residual value is not 0), or the total number of residual points of a divided sub-residual block is less than a point number threshold, or the total number of times the residual block has been divided reaches a times threshold.
• the aforementioned point number threshold and times threshold may be determined based on the image resolution of the first image; for example, each is positively correlated with the image resolution of the first image, that is, the larger the image resolution, the larger the point number threshold and the times threshold.
• for example, the times threshold can be 2 or 3.
• through step A1, cyclic division of the residual block can be realized, finally obtaining multiple sub-residual blocks.
• if the computing power of the processor is strong, then when multiple sub-residual blocks all need to be divided, the division processes of the multiple sub-residual blocks can be performed at the same time, saving image processing time.
  • both the residual block and the sub-residual block can be divided into blocks using binary tree division or quad tree division.
• a residual block or sub-residual block divided by a binary tree is divided each time into 2 sub-residual blocks of equal or unequal size; a residual block or sub-residual block divided by a quad tree is divided each time into 4 sub-residual blocks of equal or unequal size.
  • the residual block and the sub-residual block may also have other division methods, as long as it is guaranteed to be able to achieve effective block division, which is not limited in the embodiment of the present application.
• since the traditional video decoding process needs to divide the image into blocks, and this division usually adopts the quadtree division method, dividing the residual block and the sub-residual blocks by the quadtree division method in the embodiment of the present application is compatible with traditional image processing methods.
  • the residual block and the sub-residual block can also be divided by the image division module used in the aforementioned video decoding process, so as to realize the multiplexing of the modules and save the computational cost.
• generally, the image resolution of the images in a video is 360p, 480p, 720p, 1080p, 2k, 4k, or the like, all of which are integer multiples of 4, so the quadtree division method can divide the residual block or sub-residual block into four sub-residual blocks of equal size at each division, that is, realize the quartering of the residual block or sub-residual block; the sizes of the sub-residual blocks obtained by division are then uniform, which is convenient for subsequent processing.
• when a sub-residual block cannot be divided into four equal parts, the sub-residual block can be divided as evenly as possible, so that the size difference between any two of the four sub-residual blocks after each division is less than or equal to a specified gap threshold, and the function of the finally obtained sub-mask pattern will not be affected.
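• the recursive division of step A1 could be sketched as follows; this is only an illustration under assumed conventions (a 2-D NumPy residual block, a point number threshold, and the times threshold expressed as a maximum recursion depth), not a prescribed implementation:

    import numpy as np

    def split_residual(block, point_threshold, times_threshold, depth=0):
        # Stop conditions of step A1: all residual values are 0, the block has
        # fewer residual points than the point number threshold, or the number
        # of divisions has reached the times threshold.
        h, w = block.shape
        if (not block.any() or h * w < point_threshold
                or depth >= times_threshold or h < 2 or w < 2):
            return [(0, 0, block)]
        leaves = []
        mh, mw = h // 2, w // 2  # quartering; quarters are equal when h, w are even
        for r0, r1, c0, c1 in ((0, mh, 0, mw), (0, mh, mw, w),
                               (mh, h, 0, mw), (mh, h, mw, w)):
            for dr, dc, leaf in split_residual(block[r0:r1, c0:c1],
                                               point_threshold, times_threshold,
                                               depth + 1):
                leaves.append((r0 + dr, c0 + dc, leaf))
        return leaves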
• Step A2: a sub-mask pattern corresponding to each target sub-residual block is generated, and the aforementioned mask pattern includes the generated sub-mask patterns.
• the residual value of at least one residual point included in a target sub-residual block is not 0.
• generating the mask pattern in blocks avoids processing the sub-residual blocks other than the target sub-residual blocks: sub-mask patterns are generated only for the target sub-residual blocks, reducing the computational cost.
  • the manner of generating each sub-mask pattern can refer to the two optional implementation manners in the foregoing step 2031.
• in one optional implementation manner, morphological change processing is performed on each target sub-residual block to obtain the sub-mask pattern corresponding to each target sub-residual block: the target sub-residual block is first binarized to obtain a binarized target sub-residual block, the binarized target sub-residual block is then dilated, and the dilated target sub-residual block is used as the sub-mask pattern.
• in another optional implementation manner, the m-neighborhood of each first target residual point in each binarized target sub-residual block is found, and the residual value of each first target residual point is filled into the corresponding m-neighborhood to obtain the corresponding sub-mask pattern.
  • the size of each of the aforementioned sub-mask patterns is the same as the size of the corresponding target sub-residual block.
• Step 2032: the super-division display device inputs the mask pattern and the first image into the super-division model, and through the super-division model, the area where the pixels corresponding to the positions of the multiple first mask points are located in the first image is determined as the target pixel area.
  • the super-division display device determines the target pixel area through the super-division model.
• the traditional super-division model only performs super-division processing on the received image; code for determining the target pixel area can be added to the front end (i.e., the input end) of the traditional super-division model to realize the determination of the target pixel area within the model.
• in this way, the super-division display device only needs to input the mask pattern and the first image into the super-division model, which reduces the computational complexity of the modules other than the super-division model in the super-division display device.
  • the mask pattern includes a plurality of mask points, and the plurality of mask points correspond to the positions of the pixels of the first image one-to-one, and each mask point has a mask value.
• in one case, the mask pattern is a binary image: the multiple mask points include multiple first mask points and multiple second mask points, the mask value of each first mask point is the first value, the mask value of each second mask point is the second value, and the first value is different from the second value.
  • the first value and the second value may be one of a non-zero value and 0, respectively.
  • the first value is a non-zero value (such as 1)
  • the second value is 0.
• in another case, the mask pattern is a monochrome image, and the multiple mask points include only the multiple first mask points.
• in step 2032, the mask points in the mask pattern are traversed through the super-division model, and the area where the pixels (i.e., the target pixels) corresponding to the mask points whose mask value is the first value are located is determined as the target pixel area.
  • the target pixel area is determined outside the super-division model.
  • the process of determining the target pixel area in the first image includes:
  • Step 2033 The super-division display device generates a mask pattern based on the residual block.
  • the mask pattern includes a plurality of first mask points, and the plurality of first mask points are in one-to-one correspondence with the positions of the target residual points in the residual block.
• for the process of step 2033, reference may be made to the process of step 2031, which is not described in detail in the embodiment of the present application.
  • Step 2034 The super-division display device determines the area where the pixel points (ie target pixel points) corresponding to the positions of the multiple first mask points in the first image are located as the target pixel area.
• for example, the super-division display device can traverse the mask points in the mask pattern and determine, in the first image, the area where the pixels corresponding to the mask points whose mask value is the first value are located as the target pixel area.
• in the embodiment of the present application, the super-division display device is guided by the mask pattern to determine the target pixel area from the first image, shielding the pixels other than the target pixel area; this realizes rapid positioning of the target pixel area and thus effectively saves image processing time.
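• a minimal sketch of this mask-guided lookup, assuming a 2-D mask array aligned with the first image and a first value of 1:

    import numpy as np

    def target_pixels(first_image, mask, first_value=1):
        # Positions of the first mask points, i.e. of the target pixels.
        coords = np.argwhere(mask == first_value)
        # Pixel data of the target pixels; pixels outside the target pixel
        # area are shielded by the mask.
        values = first_image[mask == first_value]
        return coords, values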
  • the mask pattern can have various shapes.
• in a first example, the mask pattern includes only multiple first mask points, that is, the mask pattern is composed of multiple first mask points; a mask pattern obtained in this way is usually an irregular pattern.
• in a second example, the mask pattern includes only multiple sub-mask patterns, that is, the mask pattern is composed of multiple sub-mask patterns, and each sub-mask pattern has the same size as the corresponding target sub-residual block, so it includes both first mask points and second mask points; a mask pattern obtained in this way is usually an irregular pattern spliced from sub-mask patterns.
• in a third example, the size of the mask pattern is the same as that of the first image.
• generally, the memory in the super-division display device stores graphic data in a one-dimensional or multi-dimensional array. If the mask pattern is the mask pattern of the aforementioned first example, the data granularity that needs to be stored is pixel-level data granularity, and the storage complexity is relatively high; if the mask pattern is the mask pattern of the aforementioned second example, the data granularity that needs to be stored is pixel-block-level data granularity (the size of a pixel block being the size of the aforementioned sub-mask pattern), the stored graphics are more regular than in the first example, and the storage complexity is lower; if the mask pattern is the mask pattern of the third example, the stored graphic is rectangular, which is more regular than in the first example and has lower storage complexity. Therefore, the mask pattern usually takes the shape of the aforementioned second or third example.
  • Step 204 The super-division display device performs super-division processing on the target pixel area in the first image to obtain the target pixel area after the super-division processing.
  • the super-division display device may perform super-division processing on the target pixel area in the first image through the super-division model to obtain the target pixel area after the super-division processing.
• since multiple methods can be used to determine the target pixel area in the foregoing step 203, correspondingly, multiple methods can be used to perform the super-division processing.
  • the embodiments of this application take the following two processing methods as examples for description:
  • the first processing method corresponds to the first determination method in step 203.
• the process in which the super-division display device performs super-division processing on the target pixel area in the first image to obtain the target pixel area after the super-division processing can be: the super-division display device performs super-division processing on the target pixel area in the first image through the super-division model to obtain the target pixel area after the super-division processing.
• that is, after determining the target pixel area, the super-division model can continue to perform super-division processing on the target pixel area in the first image to obtain the target pixel area after the super-division processing.
  • the process of super-division processing performed by the super-division model may refer to related technologies, which will not be described in detail in the embodiment of the present application.
  • the second processing method corresponds to the second determination method in step 203.
• the process in which the super-division display device performs super-division processing on the target pixel area in the first image to obtain the target pixel area after the super-division processing can be: the super-division display device inputs the target pixel area in the first image into the super-division model, and performs super-division processing on the target pixel area through the super-division model to obtain the target pixel area after the super-division processing.
  • multiple target image blocks can be filtered out from the first image to perform super-division processing on each target image block. Since the super-division processing is performed on the target image block, and the size of each target image block is smaller than the first image, the computational complexity of the super-division processing can be reduced, and the computational cost can be reduced. Especially when the super-division processing is executed by the super-division model, the complexity of the super-division model can be effectively reduced, and the efficiency of the super-division calculation can be improved.
  • step 204 may include:
  • Step B1 The super-division display device acquires the target image block corresponding to each sub-mask pattern in the first image.
  • the image block corresponding to the position of each sub-mask graphic is determined as the target image block.
  • Step B2 The super-division display device performs super-division processing on the sub-regions of the target pixel area included in each target image block to obtain the target pixel area after the super-division processing.
  • the mask pattern is used to indicate the position of the target pixel area. Since the mask pattern is divided into multiple sub-mask patterns, the target pixel area can also be divided into multiple sub-areas corresponding to the multiple sub-mask patterns. In addition, since each sub-mask pattern corresponds to a target image block, each target image block includes a sub-region of the target pixel area, that is, the multiple target image blocks obtained above correspond to multiple sub-regions in a one-to-one manner. Therefore, the target pixel area after the super-division processing is composed of sub-areas of the target pixel area contained in each target image block after the super-division processing.
• the super-division processing can be performed by the super-division model.
• in one optional manner, the super-division model can first determine the areas where the target pixel area and the target image blocks correspond, that is, the sub-regions of the target pixel area contained in each target image block, and then perform the super-division processing. In this case, the foregoing step 2032 may specifically include: the super-division display device inputs each sub-mask pattern and the corresponding target image block into the super-division model (one target image block and its corresponding sub-mask pattern are input at a time), and the super-division model determines the area where the pixels corresponding to the positions of the multiple first mask points of the corresponding sub-mask pattern are located in the target image block as the sub-region of the target pixel area contained in that target image block.
• correspondingly, step B2 may specifically include: performing super-division processing, through the super-division model, on the sub-region of the target pixel area contained in each target image block to obtain the sub-region of the target pixel area contained in each target image block after the super-division processing; the target pixel area after the super-division processing is composed of the sub-regions of the target pixel area contained in the target image blocks after the super-division processing.
• in this way, the size of the image input to the super-division model each time is small, which can effectively reduce the computational complexity of the super-division model, so that a super-division model with a relatively simple structure can be used to realize the super-division operation, reducing the complexity of the super-division model and improving the efficiency of the super-division calculation.
• in another optional manner, step B2 may specifically include: the super-division display device inputs the sub-region of the target pixel area contained in each target image block into the super-division model, and the super-division model performs super-division processing on the sub-region of the target pixel area contained in each target image block to obtain the sub-region of the target pixel area contained in each target image block after the super-division processing.
• after the super-division display device receives the super-divided sub-regions of the target pixel area output by the super-division model, it can stitch (also referred to as combine) them according to the position of each target image block in the first image; the finally obtained target pixel area after the super-division processing is composed of the sub-regions of the target pixel area contained in the target image blocks after the super-division processing.
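• a sketch of this block-wise processing and stitching is shown below; `superdivide` stands in for the super-division model's inference call and `scale` for its integer upscaling factor, both of which are assumptions for illustration:

    import numpy as np

    def partial_superdivide(first_image, target_blocks, superdivide, scale):
        # target_blocks: list of (row, col, block) tuples cut from the first
        # image, one per sub-mask pattern. Each super-divided block is written
        # back at its position scaled by the upscaling factor.
        out_shape = (first_image.shape[0] * scale,
                     first_image.shape[1] * scale) + first_image.shape[2:]
        canvas = np.zeros(out_shape, dtype=first_image.dtype)
        for row, col, block in target_blocks:
            sr_block = superdivide(block)  # placeholder model call
            canvas[row * scale: row * scale + sr_block.shape[0],
                   col * scale: col * scale + sr_block.shape[1]] = sr_block
        return canvas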
• the number of pixels in the target pixel area after the super-division processing is greater than the number of pixels in the target pixel area before the super-division processing; that is, the super-division processing increases the pixel density of the target pixel area, thereby achieving the super-division effect.
• for example, the target pixel area before the super-division processing includes 5 × 5 pixels, and the target pixel area after the super-division processing includes 9 × 9 pixels.
  • Step 205 The super-division display device determines other pixel regions in the first image.
  • other pixel areas include pixel areas other than the target pixel area.
• in a first optional manner, the other pixel areas are the areas where the pixels of the first image other than the target pixel area are located (this can also be understood as directly performing step 206 without performing a separate "determining" step);
• in a second optional manner, the other pixel areas are the areas where the pixels of the first image other than the pixels corresponding to the first mask points are located;
• in a third optional manner, the other pixel areas are the areas where the pixels of the first image other than auxiliary pixels are located, where the area in which the auxiliary pixels are located includes the target pixel area, but the number of auxiliary pixels in the first image is greater than the number of pixels (that is, target pixels) in the target pixel area; that is, the area where the auxiliary pixels are located is larger than the target pixel area.
• in the third optional manner, step 205 may include: performing erosion processing on the multiple first mask points in the mask pattern to obtain an updated mask pattern; determining, in the first image, the pixels corresponding to the positions of the multiple first mask points after the erosion processing as the auxiliary pixels; and determining the area where the pixels of the first image other than the auxiliary pixels are located as the other pixel areas.
• the erosion processing is processing for finding a local minimum.
• the image to be processed is convolved with a preset kernel, and the minimum value in the area covered by the kernel is assigned to a specified pixel; the result is that the bright areas of the image to be processed shrink.
• the kernel has a definable anchor point, which is usually the center point of the kernel, and the aforementioned specified pixel is the anchor point. It should be noted that the aforementioned dilation processing and erosion processing are not mutually inverse operations.
• Figure 11 assumes that the image to be processed is F5, which includes 5 × 5 pixels, where the shaded part represents bright spots; the kernel is the shaded part in F6, a total of 5 pixels, with the anchor point at the center point D of the 5 pixels; the final image after the erosion processing is F7.
• "*" in Figure 11 means convolution.
• the mask points corresponding to the diagonal dashed shading in F5 and F7 are the first mask points.
• the super-division model includes at least one convolutional layer, and the kernel used for the erosion processing has the same size as the receptive field of the last convolutional layer of the super-division model; that is, the kernel for the erosion processing and the kernel for the aforementioned dilation processing have the same size.
• the area corresponding to the first mask points in the updated mask pattern obtained in this way is, in effect, the set of first mask points of the mask pattern obtained by the dilation processing with the outermost layer of first mask points removed.
• the edges of the updated mask pattern are smoother and less noisy.
• that is, edge noise can be eliminated through the erosion processing; compared with the other pixel areas obtained by the aforementioned first and second optional manners, the other pixel areas determined from the updated mask pattern obtained by the erosion processing have clearer edges and less noise.
• in this way, the subsequent pixel update process of step 206 can reduce the negative effects of detail blur, edge dullness, graininess, and noise enhancement, ensuring the display effect of the final super-divided first image.
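• as an illustrative sketch only, the erosion step could reuse a kernel of the same size as the dilation kernel (assumed 3 × 3 here, matching the last-layer receptive field):

    import cv2

    def erode_mask(mask, kernel_size=3):
        # Eroding the dilated mask removes its outermost layer of first mask
        # points, smoothing the mask edges.
        kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,
                                           (kernel_size, kernel_size))
        return cv2.erode(mask, kernel)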
  • Step 206 The super-division display device uses the other pixel areas in the second image after the super-division processing to update other pixel areas in the first image.
• the number of pixels in the other pixel areas of the second image after super-division processing is greater than the number of pixels in the other pixel areas of the first image, so the number of pixels in the other pixel areas after the update is greater than the number of pixels in the other pixel areas before the update; that is, the update processing increases the pixel density of the other pixel areas.
• the display effect of the other pixel areas after the update is the same as the display effect of other pixel areas after super-division processing; therefore, the other pixel areas after the update are equivalent to other pixel areas after super-division processing.
• each square represents one pixel; the other pixel areas K1 and the other pixel areas K2 have the same size and occupy the same position in their respective figures.
• using the other pixel areas K1 of the second image after super-division processing to update the other pixel areas K2 of the first image refers to updating pixel data, such as the pixel values and pixel positions of the pixels in the other pixel areas K1, into the pixel data, such as pixel values and pixel positions, of the corresponding pixels in the other pixel areas K2.
• after the update, the number of pixels, the pixel values, and the pixel positions of the other pixel areas K2 in the first image are in correspondence with, and the same as, those of the other pixel areas K1 in the second image.
• the first image after the super-division processing, that is, the reconstructed first image, includes the target pixel area after the super-division processing acquired in step 204 and the updated other pixel areas acquired in step 206; the display effect of the first image after this super-division processing is the same as that of the first image after traditional full super-division processing.
• since the other pixel areas include the pixel areas of the first image other than the target pixel area, the sizes of the target pixel area and the other pixel areas may or may not match; accordingly, the manner of obtaining the first image after the super-division processing also differs. The embodiments of this application take the following optional manners as examples for description:
• in one optional manner, the size of the target pixel area matches the size of the other pixel areas, that is, the other pixel areas are exactly the pixel areas of the first image excluding the target pixel area; correspondingly, the sizes of the target pixel area after the super-division processing and the updated other pixel areas also match, and the first image after the super-division processing can be spliced from the target pixel area after the super-division processing and the updated other pixel areas.
• in another optional manner, the sizes of the target pixel area and the other pixel areas do not match, and their edges overlap, that is, the other pixel areas include areas in addition to the pixel areas of the first image other than the target pixel area. Correspondingly, the sizes of the target pixel area after the super-division processing and the updated other pixel areas do not match, and their edges overlap. Since the other pixel areas of the first image are updated from the other pixel areas of the second image after super-division processing, the pixel data of the pixels they include is usually more accurate; therefore, the pixel data of the overlapping area of the first image after the super-division processing is based on the pixel data of the updated other pixel areas of the first image. The first image after the super-division processing can then be spliced from the updated target pixel area and the updated other pixel areas, where the updated target pixel area is the target pixel area after the super-division processing minus (also called removing) the area overlapping the other pixel areas.
• in this case, the updated target pixel area is shrunk relative to the target pixel area before the update, and the size of the updated target pixel area matches the size of the updated other pixel areas.
• in practice, in one optional manner, the second image after the super-division processing can be used as the background image, and the target pixel area after the super-division processing obtained in step 204 is used to cover the corresponding area of the second image to obtain the first image after the super-division processing; in another optional manner, the first image or a blank image can be used as the background image, the target pixel area after the super-division processing obtained in step 204 is used to cover the corresponding area of the first image, and the other pixel areas of the second image after the super-division processing are used to cover the corresponding areas of the first image, obtaining the first image after the super-division processing.
  • the process of generating the first image after the super-division processing based on the acquired target pixel area after the super-division processing and the acquired other pixel areas after the update can also be implemented by the aforementioned super-division model.
• the process of generating the first image after the super-division processing can satisfy:

    R(F(t))(w, h) = SR(F(t))(w, h),    if Mask(w, h) = R1
    R(F(t))(w, h) = R(F(t-1))(w, h),   if Mask(w, h) = R2

• where R(F(t)) represents the first image after the super-division processing, F(t) represents the first image, F(t-1) represents the second image (the adjacent previous frame), (w, h) represents any point in the image, Mask(w, h) represents the value of the mask image at that point (for example, of the updated mask pattern in step 205), SR represents the super-division processing, R1 represents the first value, and R2 represents the second value; for example, the first value is 1 and the second value is 0, or the first value is 255 and the second value is 0.
• the points where Mask(w, h) = R1 constitute the pixel area taken from the super-division of the current frame, that is, all or part of the target pixel area after the aforementioned super-division processing: if the mask pattern update action described in step 205 is not performed, it is the entire target pixel area; if the mask pattern update action of step 205 is performed, it is a partial area. The latter situation is equivalent to the target pixel area determined in step 203 being updated along with the mask pattern; this partial area is the aforementioned updated target pixel area, which can be obtained by subtracting the area overlapping the other pixel areas from the target pixel area after the super-division processing determined in step 203.
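• a minimal sketch of this composition, assuming the mask has already been brought to the super-divided resolution and uses a first value of 1:

    import numpy as np

    def compose(sr_target, sr_previous, mask, first_value=1):
        # Where the mask holds the first value, take the super-divided pixels
        # of the current frame; elsewhere, take the super-divided previous
        # frame, per the piecewise rule above.
        cond = (mask == first_value)
        if sr_target.ndim == 3:  # broadcast the mask over color channels
            cond = cond[..., np.newaxis]
        return np.where(cond, sr_target, sr_previous)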
  • Step 207 The super-division display device uses the second image after super-division processing to update the first image.
• since the inter-frame residual corresponding to the residual block reflects the content change between the two adjacent frames, when the residual values of all residual points in the residual block obtained based on the first image and the second image are 0, the contents of the first image and the second image have not changed, and likewise the two super-divided images should not differ either; therefore, the first image is updated with the second image after the super-division processing to obtain the first image after the super-division processing.
• the second image after the super-division processing is an image determined by the image processing method provided in the embodiments of this application, by a traditional image processing method, or by another super-division processing method. Updating the first image with the second image after the super-division processing makes the number of pixels in the first image after the update greater than the number of pixels in the first image before the update; that is, the update processing increases the pixel density of the first image and thus achieves the super-division effect, so the updated first image is also a super-divided image, that is, the updated first image is equivalent to the first image after super-division processing.
• optionally, before the aforementioned step 203, the super-division display device can also count the first proportion: the proportion of the number of residual points whose residual value is 0 in the residual block to the total number of residual points in the residual block.
• when the first proportion is greater than a first super-division trigger proportion threshold, the target pixel area is determined in the first image based on the residual block; when the first proportion is not greater than the first super-division trigger proportion threshold, another method is used to perform super-division processing on the first image as a whole, for example, a traditional method is used to perform the super-division processing of the first image.
• by judging whether the first proportion (the proportion of residual points whose residual value is 0 to the total number of residual points in the residual block) is greater than the first super-division trigger proportion threshold, it can be detected whether the contents of the two adjacent frames differ greatly. When the first proportion is not greater than the first super-division trigger proportion threshold, the contents of the two frames differ considerably and the correlation in the time domain is not strong; in this case, super-division processing can be performed directly on the entire first image, that is, the first image is fully super-divided. When the first proportion is greater than the first super-division trigger proportion threshold, the difference between the contents of the two frames is small, and the computational cost of directly super-dividing the entire first image is greater than the computational cost of using the aforementioned steps 203 to 206, so the aforementioned steps 203 to 206 can be performed. In this way, whether to execute the partial super-division algorithm can be decided based on the content difference between the first image and the second image, improving the flexibility of image processing.
• optionally, the super-division display device can also count the second proportion: the proportion of the number of residual points whose residual value is not 0 in the residual block to the total number of residual points in the residual block.
• when the second proportion is not greater than a second super-division trigger proportion threshold, the target pixel area is determined in the first image based on the residual block; when the second proportion is greater than the second super-division trigger proportion threshold, another method is used to perform super-division processing on the first image as a whole, for example, a traditional method is used to perform the super-division processing of the first image (or, when the second proportion is greater than the second super-division trigger proportion threshold, the target pixel area is determined in the first image based on the residual block; when the second proportion is not greater than the second super-division trigger proportion threshold, another method is used to perform super-division processing on the first image as a whole).
• by judging whether the second proportion is greater than the second super-division trigger proportion threshold, it can be detected whether the contents of the two adjacent frames differ greatly. When the second proportion is greater than the second super-division trigger proportion threshold, the contents of the two frames differ considerably and the correlation in the time domain is not strong; in this case, the computational cost of directly super-dividing the entire first image is less than or equal to the computational cost of the aforementioned steps 203 to 206, and super-division processing can be performed directly on the entire first image. When the second proportion is not greater than the second super-division trigger proportion threshold, the computational cost of super-dividing the entire first image is greater than the computational cost of using the aforementioned steps 203 to 206, and the aforementioned steps 203 to 206 can be performed. In this way, whether to execute the partial super-division algorithm can be decided based on the content difference between the first image and the second image, improving the flexibility of image processing.
• the aforementioned first super-division trigger proportion threshold and second super-division trigger proportion threshold may be the same or different; for example, both may be 50%.
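• a sketch of this trigger decision, with the 50% threshold as the example value:

    import numpy as np

    def choose_strategy(residual, trigger_threshold=0.5):
        # First proportion: share of residual points whose residual value is 0.
        first_proportion = np.mean(residual == 0)
        # Greater than the threshold: the frames are similar, so run the
        # partial super-division of steps 203 to 206; otherwise super-divide
        # the whole first image by other means.
        return "partial" if first_proportion > trigger_threshold else "full"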
  • the image resolution of the image in the video can be 360p, 480p, 720p, 1080p, 2k, 4k, etc.
  • the image resolutions exemplified in the foregoing embodiments are all relatively small.
• for example, the first image and the second image each include 5 × 5 pixels; this is only for the convenience of readers' understanding, and the actual image resolution is not limited to the resolutions in the foregoing examples.
  • inputting an image or a region of the image into the super-division model refers to inputting pixel data of pixels in the image or the region of the image into the super-division model.
• in summary, the partial super-division method provided by the embodiments of the present application actually divides the first image into the target pixel area H1 and the other pixel areas H2 for processing (in actual implementation, the boundary of the target pixel area H1 determined in step 203 may overlap the boundary of the other pixel areas H2 determined in step 205; Figure 13 takes the case where the two match in shape with no overlapping area as an example). By determining the target pixel area H1 in the first image and super-dividing the target pixel area, super-division processing is applied to the area where the pixels of the first image differ from those of the previous frame of image; and by using the other pixel areas of the previous frame of image after super-division processing to update the other pixel areas H2 of the first image, the same effect as super-dividing the other pixel areas is achieved, making full use of the temporal redundancy of video. Therefore, by super-dividing only part of the first image, the effect of fully super-dividing the first image is achieved, which reduces the amount of actual super-division processing and the computational cost.
• a test video processed by the method of the embodiment of the present application can save about 45% of the super-division calculation amount compared with directly performing full super-division processing on the video as in the traditional technology.
  • the significant reduction in the amount of super-division calculations is conducive to speeding up the video processing speed, ensuring that the video can meet the basic frame rate requirements, thereby ensuring the real-time nature of the video, and there will be no delays in playback;
  • the reduction in the amount of calculation means less processing tasks and consumption of the computing unit in the super-division display device, which brings about a decrease in the overall power consumption and saves the power consumption of the device.
• in addition, the partial super-division algorithm proposed in some embodiments of the present application does not merely super-divide part of the image area while processing the remaining parts with non-super-division methods, sacrificing effect for efficiency; rather, it avoids repeatedly super-dividing the redundant time-domain information of the areas of the video that do not change between adjacent frames, and is essentially a method of pursuing the maximization of information utilization.
• through the guidance of the mask pattern, the super-division model is instructed to perform super-division accurate to the pixel level.
• in the final processed video, essentially all the pixel values of each frame of image come from super-division calculation results, so the video has the same display effect as one produced by the traditional full super-division algorithm, avoiding any sacrifice of the display effect.
• moreover, the complexity of each super-division processing pass of the super-division model is low, and the required structural complexity of the super-division model is correspondingly low, which can simplify the super-division model, reduce the requirements on processor performance, and improve the efficiency of the super-division processing.
  • FIG. 14 is a block diagram of an image processing apparatus 300, which includes:
• the acquiring module 301 is configured to acquire the inter-frame residual between the first image and the adjacent previous frame of image to obtain a residual block, the residual block including multiple residual points in one-to-one correspondence with the positions of multiple pixels of the first image, each residual point having a residual value;
  • the first determining module 302 is configured to determine a target pixel area in the first image based on the residual block;
  • the partial super-division module 303 is configured to perform super-division processing on the target pixel area in the first image to obtain a target pixel area after the super-division processing;
• the update module 304 is configured to update the other pixel areas in the first image by using the other pixel areas in the previous frame of image after super-division processing, the other pixel areas including the pixel areas of the first image other than the target pixel area;
• the first image after the super-division processing includes the target pixel area after the super-division processing and the updated other pixel areas.
• optionally, the target pixel area is the area where the pixels corresponding to the positions of the first target residual points and the second target residual points are located in the first image; the first target residual point is a residual point in the residual block whose residual value is greater than a specified threshold, and the second target residual point is a residual point around the first target residual point in the residual block.
• in the embodiment of the present application, the first determining module determines the target pixel area in the first image, and the partial super-division module performs super-division processing on the target pixel area, realizing super-division processing of the area where the first image differs from the previous frame of image; the update module uses the other pixel areas of the previous frame of image after super-division processing to update the other pixel areas of the first image, making full use of the temporal redundancy of video. Therefore, by performing super-division processing on a partial area of the first image, the effect of performing full super-division processing on the first image is achieved, which reduces the amount of calculation in the super-division processing and reduces the computational cost.
  • the first determining module 302 includes:
• the generation sub-module 3021 is configured to generate a mask pattern based on the residual block, the mask pattern including multiple first mask points, the multiple first mask points corresponding one-to-one with the positions of multiple target residual points in the residual block;
• the determining sub-module 3022 is configured to input the mask pattern and the first image into a super-division model, and determine, through the super-division model, the area where the pixel corresponding to the position of each of the multiple first mask points is located in the first image as the target pixel area.
  • the partial super-division module 303 is configured to perform super-division processing on the target pixel area in the first image through the super-division model to obtain a target pixel area after the super-division processing.
  • the first determining module 302 includes:
• the generation sub-module 3021 is configured to generate a mask pattern based on the residual block, the mask pattern including multiple first mask points, the multiple first mask points corresponding one-to-one with the positions of multiple target residual points in the residual block;
  • the determining sub-module 3022 is configured to determine the area of the pixel point corresponding to the position of each of the plurality of first mask points in the first image as the target pixel area.
  • the partial super-division module 303 is used for:
  • the target pixel points in the first image are input into a super-division model, and the target pixel region in the first image is super-division processed by the super-division model to obtain a target pixel region after the super-division processing.
• the mask pattern includes multiple mask points, the multiple mask points correspond one-to-one with the positions of multiple pixels of the first image, and each mask point has a mask value; the multiple mask points include the multiple first mask points and multiple second mask points, the mask value of a first mask point is a first value, the mask value of a second mask point is a second value, and the first value is different from the second value;
• the determining sub-module 3022 can be used to: traverse the mask points in the mask pattern, and determine, in the first image, the area where the pixels corresponding to the mask points whose mask value is the first value are located as the target pixel area.
  • the generating submodule 3021 is configured to:
• perform morphological change processing on the residual block to obtain the mask pattern, the morphological change processing including binarization processing and dilation processing performed on the first target residual points in the binarized residual block;
• the super-division model includes at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-division model.
  • the generating submodule 3021 is configured to:
• divide the residual block into multiple sub-residual blocks, and perform block processing on each sub-residual block obtained by division, the block processing including: when the residual value of at least one residual point included in a sub-residual block is not 0, dividing the sub-residual block into multiple sub-residual blocks and performing the block processing on each sub-residual block obtained by division, until the residual values of the residual points included in a divided sub-residual block are all 0, or the total number of residual points of a divided sub-residual block is less than the point number threshold, or the total number of times the residual block has been divided reaches the times threshold;
• generate a sub-mask pattern corresponding to each target sub-residual block, the residual value of at least one residual point included in a target sub-residual block being not 0;
• the mask pattern includes the generated sub-mask patterns.
• correspondingly, the partial super-division module 303 is used for: acquiring, in the first image, the target image block corresponding to each sub-mask pattern, each target image block including a sub-region of the target pixel area; and performing super-division processing on the sub-region of the target pixel area included in each target image block to obtain the target pixel area after the super-division processing.
  • both the residual block and the sub-residual block are divided into blocks in a quadtree division manner.
  • the apparatus 300 further includes:
• the erosion module 305 is configured to, before the pixel values of the pixels at corresponding positions in the previous frame of image after super-division processing are used to update the pixel values of the other pixel areas in the first image, perform erosion processing on the multiple first mask points in the mask pattern to obtain an updated mask pattern, the kernel of the erosion processing having the same size as the receptive field of the last convolutional layer of the super-division model;
• the second determining module 306 is configured to determine the pixels corresponding to the positions of the multiple first mask points after the erosion processing in the first image as auxiliary pixels;
• the third determining module 307 is configured to determine the area where the pixels of the first image other than the auxiliary pixels are located as the other pixel areas.
• optionally, the first determining module 302 is configured to: when the first proportion of the number of residual points whose residual value is 0 in the residual block to the total number of residual points in the residual block is greater than the first super-division trigger proportion threshold, determine the target pixel area in the first image based on the residual block.
• optionally, the super-division model may be a CNN model, such as SRCNN or ESPCN; the super-division model may also be a GAN, such as SRGAN or ESRGAN.
• in the embodiment of the present application, the first determining module determines the target pixel area in the first image, and the partial super-division module performs super-division processing on the target pixel area, realizing super-division processing of the area where the first image differs from the previous frame of image; the update module uses the other pixel areas of the previous frame of image after super-division processing to update the other pixel areas of the first image, making full use of the temporal redundancy of video. Therefore, by performing super-division processing on a partial area of the first image, the effect of performing full super-division processing on the first image is achieved, which reduces the amount of calculation in the super-division processing and reduces the computational cost.
  • each module in the above device can be implemented by software or a combination of software and hardware.
  • the hardware may be a logic integrated circuit module, which may specifically include a transistor, a logic gate array, or an arithmetic logic circuit.
  • the software exists in the form of a computer program product and is stored in a computer-readable storage medium.
  • the software can be executed by a processor; therefore, alternatively, the image processing apparatus may be implemented by a processor executing a software program, which is not limited in this embodiment.
  • An embodiment of the present application provides an electronic device, including: a processor and a memory;
  • the memory is used to store a computer program
  • the processor is configured to implement any one of the image processing methods described in this application when executing the computer program stored in the memory.
  • FIG. 17 shows a schematic structural diagram of an electronic device 400 involved in the image processing method.
  • the electronic device 400 can be, but is not limited to, a laptop computer, a desktop computer, a mobile phone, a smart phone, a tablet computer, a multimedia player, an e-reader, a smart car device, a smart home appliance (such as a smart TV), an artificial intelligence device, Wearable devices, IoT devices, or virtual reality/augmented reality/mixed reality devices, etc.
  • the electronic device 400 may include the structure of the super-division display apparatus 100 shown in FIG. 1.
  • the electronic device 400 may include a processor 410, an external memory interface 420, an internal memory 421, a universal serial bus (USB) interface 430, a charging management module 440, a power management module 441, a battery 442, an antenna 1, an antenna 2,
  • a mobile communication module 450, a wireless communication module 460, an audio module 470, a speaker 470A, a receiver 470B, a microphone 470C, an earphone jack 470D, a sensor module 480, buttons 490, a motor 491, an indicator 492, a camera 493, a display 494, a subscriber identification module (SIM) card interface 495, and so on.
  • the sensor module 480 can include one or more of a pressure sensor 480A, a gyroscope sensor 480B, an air pressure sensor 480C, a magnetic sensor 480D, an acceleration sensor 480E, a distance sensor 480F, a proximity light sensor 480G, a fingerprint sensor 480H, a temperature sensor 480J, a touch sensor 480K, an ambient light sensor 480L, a bone conduction sensor 480M, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 400.
  • the electronic device 400 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the interface connection relationship between the modules illustrated in the embodiment of the present application is merely a schematic description, and does not constitute a structural limitation of the electronic device 400.
  • the electronic device 400 may also adopt an interface connection mode different from that in the above embodiment (for example, a bus connection mode), or a combination of multiple interface connection modes.
  • the processor 410 may include one or more processing units, such as a central processing unit (CPU) (for example, an application processor (AP)) and a graphics processing unit (GPU); it may further include a modem processor, an image signal processor (ISP), an MCU, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the different processing units may be independent devices or integrated in one or more processors.
  • a memory may also be provided in the processor 410 for storing instructions and data.
  • the memory in the processor 410 is a cache memory.
  • the memory can store instructions or data that have just been used or recycled by the processor 410. If the processor 410 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 410 is reduced, and the efficiency of the system is improved.
  • the processor 410 may include one or more interfaces.
  • the interface may include an integrated circuit (inter-integrated circuit, I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
  • the I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 410 may include multiple sets of I2C buses.
  • the processor 410 may be coupled to the touch sensor 480K, charger, flash, camera 493, etc. through different I2C bus interfaces.
  • the processor 410 may couple the touch sensor 480K through an I2C interface, so that the processor 410 and the touch sensor 480K communicate through the I2C bus interface to realize the touch function of the electronic device 400.
  • the I2S interface can be used for audio communication.
  • the processor 410 may include multiple sets of I2S buses.
  • the processor 410 may be coupled with the audio module 470 through an I2S bus to implement communication between the processor 410 and the audio module 470.
  • the audio module 470 may transmit audio signals to the wireless communication module 460 through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
  • the PCM interface can also be used for audio communication to sample, quantize and encode analog signals.
  • the audio module 470 and the wireless communication module 460 may be coupled through a PCM bus interface.
  • the audio module 470 may also transmit audio signals to the wireless communication module 460 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • the UART interface is generally used to connect the processor 410 and the wireless communication module 460.
  • the processor 410 communicates with the Bluetooth module in the wireless communication module 460 through the UART interface to realize the Bluetooth function.
  • the audio module 470 may transmit audio signals to the wireless communication module 460 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
  • the MIPI interface can be used to connect the processor 410 with the display screen 494, the camera 493 and other peripheral devices.
  • the MIPI interface includes a camera serial interface (camera serial interface, CSI), a display serial interface (display serial interface, DSI), and so on.
  • the processor 410 and the camera 493 communicate through a CSI interface to implement the shooting function of the electronic device 400.
  • the processor 410 and the display screen 494 communicate through a DSI interface to realize the display function of the electronic device 400.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 410 with the camera 493, the display screen 494, the wireless communication module 460, the audio module 470, the sensor module 480, and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 430 is an interface that complies with the USB standard specification, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
  • the USB interface 430 can be used to connect a charger to charge the electronic device 400, and can also be used to transfer data between the electronic device 400 and peripheral devices. It can also be used to connect earphones and play audio through earphones. This interface can also be used to connect to other electronic devices, such as AR devices.
  • the charging management module 440 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 440 may receive the charging input of the wired charger through the USB interface 430.
  • the charging management module 440 may receive the wireless charging input through the wireless charging coil of the electronic device 400. While the charging management module 440 charges the battery 442, it can also supply power to the electronic device through the power management module 441.
  • the power management module 441 is used to connect the battery 442, the charging management module 440 and the processor 410.
  • the power management module 441 receives input from the battery 442 and/or the charging management module 440, and supplies power to the processor 410, the internal memory 421, the display screen 494, the camera 493, and the wireless communication module 460.
  • the power management module 441 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 441 may also be provided in the processor 410.
  • the power management module 441 and the charging management module 440 may also be provided in the same device.
  • the wireless communication function of the electronic device 400 may be implemented by the antenna 1, the antenna 2, the mobile communication module 450, the wireless communication module 460, the modem processor, and the baseband processor.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • each antenna in the electronic device 400 can be used to cover a single or multiple communication frequency bands, and different antennas can also be multiplexed to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 450 may provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 400.
  • the mobile communication module 450 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like.
  • the mobile communication module 450 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 450 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves radiated through the antenna 1.
  • at least part of the functional modules of the mobile communication module 450 may be provided in the processor 410.
  • at least part of the functional modules of the mobile communication module 450 and at least part of the modules of the processor 410 may be provided in the same device.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor.
  • the application processor outputs a sound signal through an audio device (not limited to the speaker 470A, the receiver 470B, etc.), or displays an image or video through the display screen 494.
  • the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 410 and be provided in the same device as the mobile communication module 450 or other functional modules.
  • the wireless communication module 460 can provide applications on the electronic device 400 including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), bluetooth (BT), and global navigation satellites. System (global navigation satellite system, GNSS), frequency modulation (FM), near field communication (NFC), infrared technology (infrared, IR) and other wireless communication solutions.
  • the wireless communication module 460 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 460 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 410.
  • the wireless communication module 460 may also receive a signal to be sent from the processor 410, perform frequency modulation, amplify, and convert it into electromagnetic waves to radiate through the antenna 2.
  • the antenna 1 of the electronic device 400 is coupled with the mobile communication module 450, and the antenna 2 is coupled with the wireless communication module 460, so that the electronic device 400 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), broadband Code division multiple access (wideband code division multiple access, WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC , FM, and/or IR technology, etc.
  • the GNSS may include global positioning system (GPS), global navigation satellite system (GLONASS), Beidou navigation satellite system (BDS), quasi-zenith satellite system (quasi -zenith satellite system, QZSS) and/or satellite-based augmentation systems (SBAS).
  • the electronic device 400 implements a display function through a GPU, a display screen 494, an application processor, and the like.
  • the GPU is a microprocessor for image processing, connected to the display screen 494 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 410 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 494 is used to display images, videos, and the like.
  • the display screen 494 includes a display panel.
  • the display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like.
  • the electronic device 400 may include 1 or N display screens 494, where N is a positive integer greater than 1.
  • the electronic device 400 can realize a shooting function through an ISP, a camera 493, a video codec, a GPU, a display screen 494, and an application processor.
  • the ISP is used to process the data fed back from the camera 493. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing and is converted into an image visible to the naked eye.
  • ISP can also optimize the image noise, brightness, and skin color. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 493.
  • the camera 493 is used to capture still images or videos.
  • the object generates an optical image through the lens and is projected to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 400 may include 1 or N cameras 493, where N is a positive integer greater than 1.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 400 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 400 may support one or more video codecs. In this way, the electronic device 400 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • NPU is a neural-network (NN) computing processor.
  • through the NPU, applications such as intelligent cognition of the electronic device 400 can be realized, for example image recognition, face recognition, voice recognition, and text understanding.
  • the external memory interface 420 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 400.
  • the external memory card communicates with the processor 410 through the external memory interface 420 to realize the data storage function. For example, save music, video and other files in an external memory card.
  • the internal memory 421 may be used to store computer executable program code, where the executable program code includes instructions.
  • the internal memory 421 may include a program storage area and a data storage area.
  • the storage program area can store an operating system, at least one application program (such as a sound playback function, an image playback function, etc.) required by at least one function.
  • the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 400.
  • the internal memory 421 may include a high-speed random access memory, such as a double data rate synchronous dynamic random access memory (DDR), and may also include a non-volatile memory, such as at least one disk storage device, Flash memory devices, universal flash storage (UFS), etc.
  • the processor 410 executes various functional applications and data processing of the electronic device 400 by running instructions stored in the internal memory 421 and/or instructions stored in a memory provided in the processor.
  • the electronic device 400 can implement audio functions through an audio module 470, a speaker 470A, a receiver 470B, a microphone 470C, a headphone interface 470D, and an application processor. For example, music playback, recording, etc.
  • the audio module 470 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal.
  • the audio module 470 can also be used to encode and decode audio signals.
  • the audio module 470 may be disposed in the processor 410, or part of the functional modules of the audio module 470 may be disposed in the processor 410.
  • the speaker 470A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the electronic device 400 can listen to music through the speaker 470A, or listen to a hands-free call.
  • the receiver 470B also called “earpiece” is used to convert audio electrical signals into sound signals.
  • the electronic device 400 answers a call or voice message, it can receive the voice by bringing the receiver 470B close to the human ear.
  • the microphone 470C, also called a "mic", is used to convert sound signals into electrical signals.
  • the user can make a sound with the mouth close to the microphone 470C to input the sound signal into the microphone 470C.
  • the electronic device 400 may be provided with at least one microphone 470C.
  • the electronic device 400 may be provided with two microphones 470C, which can implement noise reduction functions in addition to collecting sound signals.
  • the electronic device 400 may also be provided with three, four or more microphones 470C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
  • the earphone interface 470D is used to connect wired earphones.
  • the earphone interface 470D may be a USB interface 430, or a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association (cellular telecommunications industry association of the USA, CTIA) standard interface.
  • the pressure sensor 480A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 480A may be provided on the display screen 494.
  • for example, the pressure sensor 480A may be a capacitive pressure sensor, which may include at least two parallel plates with conductive material; when force acts on the pressure sensor, the capacitance between the plates changes, and the electronic device 400 determines the intensity of the pressure based on the change in capacitance.
  • the electronic device 400 may also calculate the touched position according to the detection signal of the pressure sensor 480A.
  • touch operations that act on the same touch position but have different touch operation strengths may correspond to different operation instructions. For example, when a touch operation whose intensity of the touch operation is less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
  • the gyro sensor 480B may be used to determine the movement posture of the electronic device 400.
  • the angular velocity of the electronic device 400 around three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 480B.
  • the gyro sensor 480B can be used for image stabilization.
  • the gyro sensor 480B detects the shake angle of the electronic device 400, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 400 through reverse movement to achieve anti-shake.
  • the gyroscope sensor 480B can also be used for navigation and somatosensory game scenes.
  • the air pressure sensor 480C is used to measure air pressure.
  • the electronic device 400 calculates the altitude based on the air pressure value measured by the air pressure sensor 480C to assist positioning and navigation.
  • the magnetic sensor 480D includes a Hall sensor.
  • the electronic device 400 can use the magnetic sensor 480D to detect the opening and closing of a flip holster.
  • when the electronic device 400 is a flip phone, the electronic device 400 can detect the opening and closing of the flip cover according to the magnetic sensor 480D.
  • features such as automatic unlocking upon opening the flip cover can then be set according to the detected opening or closing state of the holster or flip cover.
  • the acceleration sensor 480E can detect the magnitude of the acceleration of the electronic device 400 in various directions (generally three axes). When the electronic device 400 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and is applied in applications such as switching between landscape and portrait modes, pedometers, and so on.
  • the distance sensor 480F is used to measure distance. The electronic device 400 can measure distance by infrared or laser. In some embodiments, in a shooting scene, the electronic device 400 may use the distance sensor 480F to measure distance to achieve fast focusing.
  • the proximity light sensor 480G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 400 emits infrared light to the outside through the light emitting diode.
  • the electronic device 400 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 400. When insufficient reflected light is detected, the electronic device 400 may determine that there is no object near the electronic device 400.
  • the electronic device 400 can use the proximity light sensor 480G to detect that the user holds the electronic device 400 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 480G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 480L is used to sense the brightness of the ambient light.
  • the electronic device 400 can adaptively adjust the brightness of the display screen 494 according to the perceived brightness of the ambient light.
  • the ambient light sensor 480L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 480L can also cooperate with the proximity light sensor 480G to detect whether the electronic device 400 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 480H is used to collect fingerprints.
  • the electronic device 400 can use the collected fingerprint characteristics to realize fingerprint unlocking, access application locks, fingerprint photographs, fingerprint answering calls, and so on.
  • the temperature sensor 480J is used to detect temperature.
  • the electronic device 400 uses the temperature detected by the temperature sensor 480J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 480J exceeds a threshold value, the electronic device 400 executes to reduce the performance of the processor located near the temperature sensor 480J, so as to reduce power consumption and implement thermal protection.
  • in some other embodiments, when the temperature is lower than another threshold, the electronic device 400 heats the battery 442 to avoid abnormal shutdown of the electronic device 400 due to low temperature.
  • in some other embodiments, when the temperature is lower than still another threshold, the electronic device 400 boosts the output voltage of the battery 442 to avoid abnormal shutdown caused by low temperature.
  • the touch sensor 480K is also called a "touch device".
  • the touch sensor 480K can be arranged on the display screen 494; the touch sensor 480K and the display screen 494 together form a touchscreen, also called a "touch screen".
  • the touch sensor 480K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 494.
  • the touch sensor 480K may also be disposed on the surface of the electronic device 400, which is different from the position of the display screen 494.
  • the bone conduction sensor 480M can acquire vibration signals.
  • the bone conduction sensor 480M can acquire the vibration signal of the bone mass vibrated by the human vocal part.
  • the bone conduction sensor 480M can also contact the human pulse and receive the blood pressure pulse signal.
  • the bone conduction sensor 480M may also be provided in the earphone, combined with the bone conduction earphone.
  • the audio module 470 can parse the voice signal based on the vibration signal of the vocal-part bone mass obtained by the bone conduction sensor 480M, to realize the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beating signal obtained by the bone conduction sensor 480M, and realize the heart rate detection function.
  • the electronic device 400 may also adopt different interface connection modes in the above embodiments; for example, some or all of the above sensors are connected to the MCU, and the MCU is then connected to the AP.
  • the button 490 includes a power-on button, a volume button, and so on.
  • the button 490 may be a mechanical button. It can also be a touch button.
  • the electronic device 400 may receive key input, and generate key signal input related to user settings and function control of the electronic device 400.
  • the motor 491 can generate vibration prompts.
  • the motor 491 can be used for incoming call vibration notification, and can also be used for touch vibration feedback.
  • touch operations that act on different applications can correspond to different vibration feedback effects.
  • for touch operations acting on different areas of the display screen 494, the motor 491 can also produce different vibration feedback effects.
  • different application scenarios (for example, time reminders, receiving messages, alarm clocks, and games) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 492 can be an indicator light, which can be used to indicate the charging status, power change, and can also be used to indicate messages, missed calls, notifications, and so on.
  • the SIM card interface 495 is used to connect to the SIM card.
  • the SIM card can be connected to and separated from the electronic device 400 by inserting into the SIM card interface 495 or pulling out from the SIM card interface 495.
  • the electronic device 400 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the SIM card interface 495 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
  • the same SIM card interface 495 can insert multiple cards at the same time. The types of the multiple cards can be the same or different.
  • the SIM card interface 495 can also be compatible with different types of SIM cards.
  • the SIM card interface 495 may also be compatible with external memory cards.
  • the electronic device 400 interacts with the network through the SIM card to implement functions such as call and data communication.
  • the electronic device 400 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 400 and cannot be separated from the electronic device 400.
  • the software system of the electronic device 400 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present application takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 400 by way of example.
  • the embodiment of the present application also provides an image processing device, including a processor and a memory; when the processor executes the computer program stored in the memory, the image processing device executes the image processing method provided in the embodiment of the present application.
  • the image processing device can be deployed in a smart TV.
  • the embodiment of the present application also provides a storage medium, which may be a non-volatile computer-readable storage medium; a computer program is stored in the storage medium, and the computer program instructs a terminal to execute any one of the image processing methods provided in the embodiments of the present application.
  • the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, or other media that can store program code.
  • the embodiments of the present application also provide a computer program product containing instructions.
  • when the computer program product runs on a computer, the computer executes the image processing method provided in the embodiments of the present application.
  • the computer program product may include one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the embodiments of the present application also provide a chip, such as a CPU chip, which includes one or more physical cores and a storage medium.
  • the one or more physical cores implement the aforementioned image processing method after reading the computer instructions in the storage medium.
  • in other embodiments, the chip may implement the aforementioned image processing method in pure hardware or in a combination of software and hardware; that is, the chip includes a logic circuit, and when the chip runs, the logic circuit is used to implement any one of the image processing methods of the foregoing first aspect.
  • the logic circuit may be a programmable logic circuit.
  • the GPU may likewise be implemented in the same manner as the CPU.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.
  • "A refers to B" means that A and B are the same, or that A is obtained by simple modification on the basis of B.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Image Processing (AREA)

Abstract

This application discloses an image processing method, apparatus, and electronic device, belonging to the field of image processing. The method includes: obtaining the inter-frame residual between two adjacent frames to obtain a residual block; determining, based on the residual block, a target pixel region on which super-division needs to be performed; and performing super-division processing on only the target pixel region to obtain a super-divided target pixel region. For the other pixel areas, the super-division result of the frame of the two that has already undergone super-division processing is used directly. This application can solve the problem of the high computational cost of super-division processing.

Description

Image processing method and apparatus, and electronic device
This application claims priority to Chinese Patent Application No. 201911008119.2, filed on October 22, 2019 and entitled "Image processing method, apparatus and electronic device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of image processing, and in particular to an image processing method and apparatus, and an electronic device.
Background
With the development of technology, super-division display apparatuses such as smart TVs are used more and more widely. "Super-division" means super-resolution. A super-division display apparatus is a display apparatus capable of performing super-division processing on images, and super-division processing is a technique for reconstructing a low-resolution image into a high-resolution image.
At present, a super-division display apparatus inputs a decoded image into a super-division model, and the super-division model performs super-division processing on the image; with such an image processing method, however, the amount of calculation of the super-division processing is large and the computational cost is high.
Summary
Embodiments of this application provide an image processing method and apparatus, and an electronic device, which can reduce the amount of calculation of current super-division processing and lower the computational cost. This application is introduced below from different aspects; it should be understood that the implementations and beneficial effects of the following aspects can be referred to each other.
"First" and "second" in this application are used only to distinguish two objects and do not imply any order.
An embodiment of this application provides an image processing method, the method including:
obtaining the inter-frame residual between a first image and the adjacent previous frame image to obtain a residual block, where the residual block includes a plurality of residual points in one-to-one correspondence with the positions of a plurality of pixels of the first image, and each residual point has one residual value; determining a target pixel region in the first image based on the residual block; performing super-division processing on the target pixel region in the first image to obtain a super-divided target pixel region; and updating the other pixel area in the first image by using the other pixel area in the super-divided previous frame image, where the other pixel area includes the pixel area in the first image other than the target pixel region. In this way, the same effect as performing super-division processing on the other pixel area is achieved without actually super-dividing it.
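Read end to end, the method of this aspect can be summarized by the following runnable Python sketch. It makes simplifying assumptions that the source does not: frames are grayscale uint8 arrays, bicubic upscaling stands in for the super-division model, and the whole frame is super-divided before masking (the actual method crops and super-divides only the target blocks).

```python
import cv2
import numpy as np

def partial_super_divide(curr, prev, prev_sr, sr_fn, scale=2, ksize=3):
    """Sketch of the claimed pipeline: super-divide only where frames differ,
    and reuse the previous super-divided frame everywhere else."""
    residual = cv2.absdiff(curr, prev)          # inter-frame residual block
    if not residual.any():                      # identical frames: reuse previous result
        return prev_sr.copy()
    # Mask generation: binarize, then dilate with a kernel matched to the
    # receptive field of the super-division model's last convolutional layer.
    mask = (residual > 0).astype(np.uint8)
    mask = cv2.dilate(mask, np.ones((ksize, ksize), np.uint8))
    sr_curr = sr_fn(curr)                       # stand-in for the super-division model
    mask_up = cv2.resize(mask, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_NEAREST)
    # Target pixel region from the new SR result; other pixel area from prev_sr.
    return np.where(mask_up.astype(bool), sr_curr, prev_sr)

# Bicubic upscaling as a placeholder "super-division model":
sr_fn = lambda img: cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
```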
The super-divided first image includes the super-divided target pixel region and the updated other pixel area (equivalent to a super-divided other pixel area).
In the embodiments of this application, by determining the target pixel region in the first image and performing super-division processing on the target pixel region, super-division processing is achieved for the region where the pixels that differ between the first image and the previous frame image are located; the other pixel area in the first image is updated using the other pixel area of the super-divided previous frame image, achieving the same effect as super-dividing the other pixel area and making full use of the temporal redundancy of video. Therefore, by super-dividing only part of the first image, the effect of fully super-dividing the first image is achieved, which reduces the amount of calculation of the super-division processing and lowers the computational cost.
Since the other pixel area includes the pixel area in the first image other than the target pixel region, the sizes of the target pixel region and the other pixel area may or may not match; accordingly, the way the super-divided first image is obtained also differs. The embodiments of this application are described by taking the following two examples:
In one example, the sizes of the target pixel region and the other pixel area match, that is, the other pixel area is exactly the pixel area in the first image other than the target pixel region; accordingly, the sizes of the super-divided target pixel region and the updated other pixel area also match. The super-divided first image can then be obtained by stitching together the super-divided target pixel region and the updated other pixel area.
In another example, the sizes of the target pixel region and the other pixel area do not match, and their edges overlap; that is, in addition to the pixel area in the first image other than the target pixel region, the other pixel area includes further pixel areas. Accordingly, the sizes of the super-divided target pixel region and the updated other pixel area do not match either, and their edges overlap. Since the other pixel area of the first image is updated from the other pixel area of the super-divided second image, the pixel data of the pixels it contains is usually more accurate; therefore, the pixel data of the overlapping region of the super-divided first image is usually taken from the updated other pixel area of the first image. The super-divided first image can then be obtained by stitching together the updated target pixel region and the updated other pixel area, where the updated target pixel region is obtained by subtracting (also called removing) from the super-divided target pixel region its overlap with the other pixel area. The updated target pixel region shrinks relative to the target pixel region before the update, and its size matches the size of the updated other pixel area.
Optionally, the target pixel region includes the region in the first image where the pixels corresponding to the positions of first target residual points are located, where a first target residual point is a point in the residual block whose residual value is greater than a specified threshold. Optionally, the specified threshold is 0.
For example, the target pixel region is the region in the first image where the pixels corresponding to the positions of the first target residual points and second target residual points are located, where a second target residual point is a residual point in the residual block surrounding a first target residual point. A residual point surrounding a first target residual point refers to a residual point arranged around the first target residual point that satisfies a specified condition, for example the residual points above, below, to the left of, and to the right of the first target residual point; or the residual points above, below, left, right, upper-left, lower-left, upper-right, and lower-right of it. The specified condition is determined based on the requirements of the super-division processing, for example set based on a receptive field in the super-division model (such as the receptive field of the last convolutional layer).
For example, the second target residual point is a residual point surrounding the first target residual point in the residual block whose residual value is not greater than the specified threshold; that is, the second target residual point is a surrounding point of the first target residual point that satisfies the specified condition and whose residual value is not greater than the specified threshold.
When the residual values of all residual points in the residual block are 0, the content of the first image and of the previous frame image has not changed, and the two super-divided images should not differ either; the super-divided previous frame image can be used to update the first image, achieving the same effect as super-dividing the first image, without performing super-division on the first image, that is, without performing the action of determining the target pixel region. This effectively reduces the computational cost. Accordingly, determining the target pixel region in the first image based on the residual block may include: when the residual value of at least one residual point included in the residual block is not 0, determining the target pixel region in the first image based on the residual block. That is, the action of determining the target pixel region is performed only when the residual block includes a residual point with a nonzero residual value.
In some implementations, determining the target pixel region in the first image based on the residual block includes:
generating a mask pattern based on the residual block, where the mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of a plurality of target residual points in the residual block; and inputting the mask pattern and the first image into the super-division model, and determining, by the super-division model, the region in the first image where the pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
Accordingly, performing super-division processing on the target pixels in the first image to obtain super-divided target pixels includes:
performing super-division processing on the target pixel region in the first image by the super-division model to obtain the super-divided target pixel region.
In other implementations, determining the target pixel region in the first image based on the residual block includes:
generating a mask pattern based on the residual block, where the mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of a plurality of target residual points in the residual block; and determining the region in the first image where the pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
Accordingly, performing super-division processing on the target pixels in the first image to obtain super-divided target pixels includes:
inputting the target pixels in the first image into the super-division model, and performing super-division processing on the target pixel region in the first image by the super-division model to obtain the super-divided target pixel region.
In some implementations, generating the mask pattern based on the residual block includes:
generating, based on the residual block, an initial mask pattern including a plurality of mask points in one-to-one correspondence with the positions of the plurality of pixels of the first image, where the plurality of mask points include the plurality of first mask points and a plurality of second mask points; and assigning the mask values of the first mask points in the initial mask pattern a first value and the mask values of the second mask points a second value to obtain the mask pattern, where the first value and the second value are different;
determining the pixels in the first image corresponding to the positions of the plurality of first mask points as the target pixel region then includes: traversing the mask points in the mask pattern, and determining, in the first image, the pixels corresponding to the mask points whose mask value is the first value as the target pixel region.
In some implementations, generating the mask pattern based on the residual block includes:
performing morphological transformation processing on the residual block to obtain the mask pattern, where the morphological transformation processing includes binarization processing and dilation processing of the first mask points in the binarized residual block. For example, the binarization processing and the dilation processing may be performed in sequence.
Binarization processing is a processing method that sets the pixel value of each pixel in an image to one of two pixel values, a first value and a second value, which are different. After binarization, an image only includes pixels of the two pixel values. Binarization can reduce the interference of various elements in the image with subsequent image processing.
The binarization processing in the embodiments of this application may adopt any of the global threshold method, the local threshold method, the maximum between-class variance (Otsu) method, and the iterative threshold method, which is not limited in the embodiments of this application.
Dilation processing is a local-maximum operation. The image to be processed is convolved with a preset kernel; in each convolution step, the maximum value in the region covered by the kernel is assigned to a specified pixel, making bright points brighter; the resulting effect is that the bright regions of the image to be processed dilate. The kernel has a definable anchor point, usually the center point of the kernel, and the specified pixel is that anchor point.
In the embodiments of this application, the super-division model includes at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-division model. The last convolutional layer is the output layer of the super-division model, from which the super-divided image is output; the receptive field of this layer is the largest of the receptive fields of the convolutional layers in the super-division model. A mask pattern obtained in this way is adapted to the size of the largest receptive field of the super-division model and therefore provides good guidance, avoiding the situation where the convolvable region of an image subsequently input into the super-division model is too small to be convolved by a convolutional layer, and ensuring that images subsequently input into the super-division model can be effectively super-divided. For example, the size of the kernel may be 3×3 or 5×5 pixels.
In some implementations, generating the mask pattern based on the residual block includes:
dividing the residual block into a plurality of sub-residual blocks and performing block processing on each sub-residual block obtained by the division, where the block processing includes:
when the residual value of at least one residual point included in a sub-residual block is not 0, dividing the sub-residual block into a plurality of sub-residual blocks and performing the block processing on each sub-residual block obtained by the division, until the residual values of the residual points included in a divided sub-residual block are all 0, or the total number of residual points of a divided sub-residual block is less than a point-number threshold, or the total number of times the residual block has been divided reaches a times threshold; and generating a sub-mask pattern corresponding to each target residual block, where the residual value of at least one residual point included in a target residual block is not 0; the mask pattern includes the generated sub-mask patterns.
In some implementations, performing super-division processing on the target pixels in the first image to obtain super-divided target pixels includes:
obtaining, in the first image, a target image block corresponding to each sub-mask pattern; and performing super-division processing separately on the sub-region of the target pixel region contained in each target image block to obtain the super-divided target pixel region, where the super-divided target pixel region is composed of the super-divided sub-regions of the target pixel region contained in the target image blocks.
In the embodiments of this application, a plurality of target image blocks can be selected from the first image, and super-division processing can be performed on each target image block. Since the super-division processing is performed on target image blocks, and each target image block is smaller than the first image, the computational complexity of the super-division processing can be reduced, lowering the computational cost. In particular, when the super-division processing is performed by a super-division model, the complexity of the super-division model can be effectively reduced and the efficiency of the super-division operation improved.
In some implementations, dividing the residual block into a plurality of sub-residual blocks includes: dividing the residual block into a plurality of sub-residual blocks in a quadtree division manner;
and dividing a sub-residual block into a plurality of sub-residual blocks includes: dividing the sub-residual block into a plurality of sub-residual blocks in a quadtree division manner.
Since the traditional video decoding process requires image blocking, which usually adopts quadtree division, dividing the residual block and sub-residual blocks in a quadtree manner in the embodiments of this application is compatible with traditional image processing methods. For example, in practical applications, the image division module used in the video decoding process can also be used to divide the residual block and sub-residual blocks, reusing the module and saving computational cost. Moreover, with quadtree division, the residual block or sub-residual block can be divided into four equally sized sub-residual blocks at each division, so that the divided residual blocks have uniform sizes, which facilitates subsequent processing.
In some implementations, before the other pixel area in the first image is updated by using the other pixel area in the super-divided previous frame image, the method further includes:
performing erosion processing on the plurality of first mask points in the mask pattern to obtain an updated mask pattern, where the kernel of the erosion processing has the same size as the receptive field of the last convolutional layer of the super-division model; determining the pixels in the first image corresponding to the positions of the plurality of eroded first mask points as auxiliary pixels; and determining the region where the pixels in the first image other than the auxiliary pixels are located as the other pixel area.
Erosion processing can eliminate edge noise in an image. Compared with the other pixel areas obtained in the aforementioned first and second optional ways, the other pixel area determined from the updated mask pattern obtained by erosion has clearer edges and less noise; in the subsequent pixel update process, negative effects such as blurred details, dulled edges, graininess, and noise amplification can be reduced, ensuring the display quality of the finally reconstructed first image.
In some implementations, determining the target pixel region in the first image based on the residual block includes: counting a first proportion, namely the proportion of the number of residual points with a residual value of 0 in the residual block to the total number of residual points in the residual block; and when the first proportion is greater than a first super-division trigger proportion threshold, determining the target pixel region in the first image based on the residual block.
In this way, whether to execute the partial super-division algorithm can be decided based on the content difference between the first image and the previous frame image, improving the flexibility of image processing.
In some implementations, the super-division model is a CNN model, such as SRCNN or ESPCN; the super-division model may also be a GAN, such as SRGAN or ESRGAN.
In a second aspect, an exemplary embodiment of this application provides an image processing apparatus, the apparatus including one or more modules for implementing any one of the image processing methods of the first aspect.
In a third aspect, an embodiment of this application provides an electronic device, such as a terminal. The electronic device includes a processor and a memory. The processor usually includes a CPU and/or a GPU. The memory is used to store a computer program; the processor is used to implement any one of the image processing methods of the first aspect when executing the computer program stored in the memory. The CPU and the GPU may be two chips, or may be integrated on the same chip.
In a fourth aspect, an embodiment of this application provides a storage medium, which may be non-volatile. A computer program is stored in the storage medium, and when executed by a processor, the computer program causes the processor to implement any one of the image processing methods of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program or computer program product containing computer-readable instructions; when the computer program or computer program product runs on a computer, the computer is caused to execute any one of the image processing methods of the first aspect. The computer program product may include one or more program units for implementing the foregoing method.
In a sixth aspect, this application provides a chip, such as a CPU. The chip includes a logic circuit, which may be a programmable logic circuit; when the chip runs, it is used to implement any one of the image processing methods of the first aspect.
In a seventh aspect, this application provides a chip, such as a CPU. The chip includes one or more physical cores and a storage medium; the one or more physical cores implement any one of the image processing methods of the first aspect after reading computer instructions from the storage medium.
In an eighth aspect, this application provides a chip, such as a GPU. The chip includes one or more physical cores and a storage medium; the one or more physical cores implement any one of the image processing methods of the first aspect after reading computer instructions from the storage medium.
In a ninth aspect, this application provides a chip, such as a GPU. The chip includes a logic circuit, which may be a programmable logic circuit; when the chip runs, it is used to implement any one of the image processing methods of the first aspect.
In summary, in the embodiments of this application, by determining the target pixel region in the first image and performing super-division processing on the target pixel region, super-division processing is achieved for the region where the pixels that differ between the first image and the previous frame image are located, and the other pixel area in the first image is updated using the other pixel area of the super-divided previous frame image, achieving the same effect as super-dividing the other pixel area and making full use of the temporal redundancy of video; therefore, by super-dividing only part of the first image, the effect of fully super-dividing the first image is achieved, reducing the amount of calculation of the super-division processing and lowering the computational cost.
Compared with the traditional technique of directly performing full super-division on the video, test videos processed with the embodiments of this application can save about 45% of the super-division computation. This significant reduction in computation, on the one hand, helps speed up video processing so that the video can meet basic frame-rate requirements, guaranteeing real-time playback without delays or stuttering; on the other hand, less computation means fewer processing tasks and less consumption for the computing units in the super-division display apparatus, bringing down overall power consumption and saving electricity.
Moreover, the partial super-division algorithm proposed in the embodiments of this application is not a method that super-divides only part of the image and processes the other parts by non-super-division means, trading quality for efficiency; rather, it avoids repeated super-division of the unchanged regions and redundant temporal information of consecutive video frames, and is essentially a method that pursues maximal information utilization. For a first image processed with the partial super-division algorithm, a mask pattern is set to guide the super-division model to super-divide with pixel-level precision. In the finally processed video, essentially all pixel values of every frame come from super-division results, giving the same display quality as the traditional full super-division algorithm and avoiding any sacrifice of display quality.
Furthermore, if sub-mask patterns and target image blocks are input and the super-division model performs the super-division processing, the complexity of each super-division pass is low and the structural complexity required of the super-division model is low, so the super-division model can be simplified, the performance requirements on the processor reduced, and the super-division efficiency improved.
Brief Description of Drawings
FIG. 1 is a schematic structural diagram of a super-division display apparatus involved in an image processing method provided by an embodiment of this application;
FIG. 2 is a flowchart of an image processing method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the pixel values of a first image provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the pixel values of a second image provided by an embodiment of this application;
FIG. 5 is a schematic diagram of a residual block provided by an embodiment of this application;
FIG. 6 is a schematic diagram explaining the principle of the residual block shown in FIG. 5;
FIG. 7 is a schematic flowchart of determining a target pixel region in a first image provided by an embodiment of this application;
FIG. 8 is a schematic diagram of a dilation processing principle provided by an embodiment of this application;
FIG. 9 is a schematic diagram of another dilation processing principle provided by an embodiment of this application;
FIG. 10 is a schematic flowchart of another way of determining a target pixel region in a first image provided by an embodiment of this application;
FIG. 11 is a schematic diagram of an erosion processing principle provided by an embodiment of this application;
FIG. 12 is a schematic diagram of the principle of updating the other pixel area K2 in a first image by using the other pixel area K1 in a super-divided second image, provided by an embodiment of this application;
FIG. 13 is a schematic diagram of the principle of an image processing method provided by an embodiment of this application;
FIG. 14 is a block diagram of an image processing apparatus provided by an embodiment of this application;
FIG. 15 is a block diagram of a first determining module provided by an embodiment of this application;
FIG. 16 is a block diagram of another image processing apparatus provided by an embodiment of this application;
FIG. 17 is a block diagram of an electronic device provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
For ease of understanding, the terms involved in the embodiments of this application are explained first.
"A plurality of" in this application means "two or more" or "at least two" unless otherwise specified. "A and/or B" in this application covers at least the three cases "A", "B", and "A and B".
Image resolution: reflects the amount of information stored in an image, and refers to the total number of pixels in the image. Image resolution is usually expressed as horizontal pixel count × vertical pixel count.
1080p: a display format, where "p" stands for progressive scan; the image resolution of 1080p is usually 1920×1080.
2K resolution: a display format, with a corresponding image resolution usually of 2048×1152.
4K resolution: a display format, with a corresponding image resolution usually of 3840×2160.
480p resolution: a display format, with a corresponding image resolution usually of 640×480.
360p resolution: a display format, with a corresponding image resolution usually of 480×360.
Super-division, i.e., super-resolution: super-division processing is a technique for reconstructing a low-resolution image into a high-resolution image; that is, the image resolution of the reconstructed image is greater than that of the image before reconstruction, and the reconstructed image is also called a super-divided image. For example, super-division processing may reconstruct a 360p image into a 480p image, or a 2K image into a 4K image.
Color space, also called color model or color system: reflects the colors involved in an image. Different color spaces correspond to different color encoding formats. The two most commonly used color spaces at present are the YUV color space and the RGB color space (each color in a color space is also called a color channel), with corresponding color encoding formats YUV and RGB.
When the color encoding format is YUV, the pixel value of a pixel includes the value of the luminance component Y, the value of the chrominance component U, and the value of the chrominance component V; when the color encoding format is RGB, the pixel value of a pixel includes the value of a transparency component and the values of multiple color components, which may include a red component R, a green component G, and a blue component B.
A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons can respond to surrounding units within part of their coverage, and which can process images according to image features.
Generally, the basic structure of a convolutional neural network includes two layers. The first is a feature extraction layer: the input of each neuron is connected to a local receptive region of the previous layer, and the features of that local region are extracted. The second is a feature mapping layer: each feature mapping layer of the network consists of multiple feature maps, and each feature map is a plane. The feature mapping layer is provided with an activation function, usually a nonlinear mapping function such as the sigmoid function or the rectified linear unit (ReLU) function. A convolutional neural network consists of a large number of interconnected nodes (also called "neurons" or "units"), each representing a specific output function. The connection between every two nodes carries a weighted value, called a weight. Different weights and activation functions lead to different outputs of the convolutional neural network.
Usually, a convolutional neural network includes at least one convolutional layer, and each convolutional layer includes a feature extraction layer and a feature mapping layer; when the convolutional neural network includes multiple convolutional layers, they are connected in sequence. The receptive field is the size of the region on the original image (i.e., the image input into the convolutional neural network) that is mapped by each pixel on the feature map output by each convolutional layer.
One advantage of convolutional neural networks over traditional image processing algorithms is that they avoid the complex pre-processing of images (extraction of hand-crafted features, etc.) and can directly take the original image as input for end-to-end learning. One advantage of convolutional neural networks over traditional neural networks is that traditional neural networks are fully connected, i.e., the neurons from the input layer to the hidden layer are all connected, which results in a huge number of parameters, making network training time-consuming or even intractable, whereas convolutional neural networks avoid this problem through local connections and weight sharing.
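As a concrete note on the receptive-field concept used throughout this document: for a stack of stride-1 convolutional layers, the receptive field with respect to the input grows by k - 1 per layer of kernel size k, which is why the last convolutional layer has the largest receptive field in the model. The following small helper is an illustration under that stride-1 assumption, not part of the source:

```python
def receptive_field(kernel_sizes):
    """Receptive field (w.r.t. the input) of stacked stride-1 conv layers:
    r = 1 + sum(k_i - 1)."""
    r = 1
    for k in kernel_sizes:
        r += k - 1
    return r

# A 3-layer model with 5x5, 3x3, 3x3 kernels:
print(receptive_field([5, 3, 3]))   # -> 9: a 9x9 input region affects one output pixel
```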
Please refer to FIG. 1, which is a schematic structural diagram of a super-division display apparatus 10 involved in an image processing method provided by an embodiment of this application. The super-division display apparatus 10 may be a product or component with display and super-division processing functions, such as a smart TV, a smart screen, a smartphone, a tablet computer, electronic paper, a monitor, a laptop computer, a digital photo frame, or a navigator. The super-division display apparatus 10 includes: a processor 101, a display control module 102, and a memory 103. The processor 101 is used to process images in the video obtained from a video source and transmit the processed images to the display control module 102, the processed images being adapted to the format requirements of the display control module 102; the display control module 102 is used to process the received images to obtain a drive signal adapted to the display module (not marked in FIG. 1) and drive the display module to display images based on the drive signal; the memory 103 is used to store video data.
For example, the processor 101 may include a central processing unit (CPU) and/or a graphics processing unit (GPU), and the processor 101 may be integrated on a graphics card; the display control module 102 may be a timing controller (TCON) or a microcontroller unit (MCU); the display module may be a display screen; the memory 103 may be a double data rate (DDR) dynamic random access memory. In the embodiments of this application, a super-division model 1031 is stored in the memory 103, and the processing by the processor 101 of images in the video obtained from the video source may include: decoding the video obtained from the video source, pre-processing the decoded images (such as subsequent step 201 or step 202), inputting the pre-processed images into the super-division model 1031, and performing super-division processing on the pre-processed images by the super-division model 1031. For example, the super-division model may be a CNN model, such as the super-resolution convolutional neural network (SRCNN) or the efficient sub-pixel convolutional neural network (ESPCN); the super-division model may also be a generative adversarial network (GAN), such as the super-resolution generative adversarial network (SRGAN) or the enhanced super-resolution generative adversarial network (ESRGAN).
At present, a super-division display apparatus inputs the decoded image directly into the super-division model, and the super-division model performs super-division processing on the image; with such an image processing method, however, the amount of calculation of the super-division processing is large and the computational cost is high.
An embodiment of this application provides an image processing method that proposes a partial super-division algorithm, which can reduce the amount of calculation of super-division processing and lower the computational cost. The image processing method can be applied to the super-division display apparatus shown in FIG. 1. Since a video may include multiple images, the embodiments of this application describe the image processing method by taking a first image as an example; the first image is a frame in the video and is not the first frame of the video. The processing of other non-first frames can refer to the processing of the first image. Assume the frame immediately preceding the first image is a second image. As shown in FIG. 2, the method includes:
Step 201: The super-division display apparatus obtains the inter-frame residual between the first image and the second image to obtain a residual block.
The inter-frame residual is the absolute difference of the pixel values of two adjacent frames of a video, and reflects the content change (i.e., the change of pixel values) between the two adjacent frames. The residual block is the result of computing the inter-frame residual; the size of the residual block is the same as the size of the first image and of the second image. The residual block includes a plurality of residual points in one-to-one correspondence with the positions of the plurality of pixels of the first image; each residual point has one residual value, and each residual value is the absolute value of the difference between the pixel values of the pixels at corresponding positions in the first image and the second image.
In the embodiments of this application, there are multiple ways to obtain the residual block. The embodiments of this application describe the following two ways as examples:
In a first implementable way, the inter-frame residual between the first image and the second image is computed to obtain the residual block. The inter-frame residual is the absolute value of the difference between the pixel values of the pixels at corresponding positions in the first image and the second image. Assume the first image is the t-th frame of the video, t > 1; the second image is then the (t-1)-th frame.
In a first example, when the color space of the first image and the second image is the RGB color space, the color encoding format involved is the RGB encoding format. In this case, the value of the transparency component in the pixel values of the first and second images is usually ignored in the inter-frame residual; the pixel values of the first and second images include the values of the red component R, the green component G, and the blue component B. The inter-frame residual Residual then includes Residual[R], Residual[G], and Residual[B], which satisfy:
Residual[R] = Absdiff(R_Frame(t-1), R_Frame(t)); (Formula 1)
Residual[G] = Absdiff(G_Frame(t-1), G_Frame(t)); (Formula 2)
Residual[B] = Absdiff(B_Frame(t-1), B_Frame(t)). (Formula 3)
Here, Absdiff denotes computing the absolute value of the difference between the pixel values (R, G, or B values) of the pixels at corresponding positions in two images; Frame(t) is the first image and Frame(t-1) the second image; R_Frame(t-1) denotes the red component R value of the second image and R_Frame(t) that of the first image; G_Frame(t-1) denotes the green component G value of the second image and G_Frame(t) that of the first image; B_Frame(t-1) denotes the blue component B value of the second image and B_Frame(t) that of the first image.
In a second example, when the color space of the first image and the second image is the YUV color space, the color encoding format involved is the YUV encoding format. In this case, the inter-frame residual of the luminance component Y of the first and second images is used to characterize the inter-frame residual of the two images, i.e., Residual = Residual[Y].
In one optional way, the inter-frame residual Residual satisfies:
Residual = Residual[Y] = 0.299 × Residual[R] + 0.587 × Residual[G] + 0.114 × Residual[B]. (Formula 4)
Here, Residual[R], Residual[G], and Residual[B] are obtained as in Formulas 1 to 3; the inter-frame residual Residual is obtained by converting the inter-frame residuals of the three RGB color channels in a certain proportion.
In another optional way, the inter-frame residual Residual satisfies:
Residual = |Residual[Y1] - Residual[Y2]|; (Formula 5)
Here, Residual[Y1] is the value of the luminance component of the first image, converted from the RGB values of the first image in a certain proportion; Residual[Y2] is the value of the luminance component of the second image, converted from the RGB values of the second image in a certain proportion. The proportion may be that of Formula 4, i.e., the proportions of the R, G, and B values are 0.299, 0.587, and 0.114, respectively.
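As a worked illustration of Formulas 1 to 5, the following numpy sketch computes the per-channel residuals and the luma-domain residual; the 0.299/0.587/0.114 weights are the standard luma coefficients referenced above.

```python
import numpy as np

def rgb_residual(frame_t, frame_t1):
    """Formulas 1-3: per-channel absolute difference of two RGB frames (H, W, 3),
    combined into a luma-domain residual via Formula 4."""
    diff = np.abs(frame_t.astype(np.int32) - frame_t1.astype(np.int32))
    res_r, res_g, res_b = diff[..., 0], diff[..., 1], diff[..., 2]
    return 0.299 * res_r + 0.587 * res_g + 0.114 * res_b

def luma_residual(y_t, y_t1):
    """Formula 5: absolute difference of the luminance components of two frames."""
    return np.abs(y_t.astype(np.int32) - y_t1.astype(np.int32))
```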
For example, assume the first and second images each include 5×5 pixels, with pixel values characterized by the luminance component Y; the pixel values of the first image are shown in FIG. 3 and those of the second image in FIG. 4. The finally obtained residual block is shown in FIG. 5 and includes 5×5 residual points, where the residual value of each residual point is the absolute value of the difference between the pixel values of the pixels at corresponding positions in the first and second images.
It should be noted that the scope of protection of the embodiments of this application is not limited thereto. When the color encoding format of the images is another format, anyone skilled in the art can, within the technical scope disclosed in the embodiments of this application, easily conceive of variations or replacements of the inter-frame residual calculation method provided here; such easily conceived variations or replacements are also covered by the scope of protection of the embodiments of this application.
In a second implementable way, a pre-stored inter-frame residual is obtained to get the residual block.
As mentioned above, the first image obtained by the super-division display apparatus is an already decoded image, and the video decoding process involves computing the inter-frame residual of adjacent frames; therefore, during video decoding, the computed inter-frame residual of every two adjacent frames can be stored, and when the residual block is needed, the pre-stored inter-frame residual can be directly retrieved.
In the embodiments of this application, the standard used by the processor for video decoding may be any of H.261 to H.265, MPEG-4 V1 to MPEG-4 V3, and so on. H.264 is also known as Advanced Video Coding (AVC), and H.265 as High Efficiency Video Coding (HEVC); both use motion-compensated hybrid coding algorithms.
Taking H.265 as an example, the coding architecture of H.265 is roughly similar to that of H.264, mainly including an entropy decoding module, an intra prediction module, an inter prediction module, an inverse transform module, an inverse quantization module, and an in-loop filtering module. The in-loop filtering module includes a deblocking module, sample adaptive offset (SAO), and so on. The entropy decoding module processes the bitstream provided by the video source to obtain mode information and the inter-frame residual. After the entropy decoding module obtains the inter-frame residual, the inter-frame residual can be stored so that it can be retrieved when step 201 is executed.
Obtaining the residual block in this second way can reduce repeated calculation of the inter-frame residual, lower the computational cost, and save the overall image processing time. Especially in scenarios where the video provided by the video source has a high image resolution, the image processing delay can be effectively reduced.
Step 202: The super-division display apparatus checks whether the residual values of the residual points included in the residual block are 0. If the residual value of at least one residual point is not 0, step 203 is executed; if the residual values of all residual points are 0, step 207 is executed.
The super-division display apparatus may traverse each residual point in the residual block and check whether its residual value is 0. When at least one residual point has a nonzero residual value, step 203 can be executed; subsequent steps 203 to 206 correspond to the partial super-division algorithm. When all residual points have a residual value of 0, step 207 can be executed. For example, the super-division display apparatus may traverse the residual points in a left-to-right, top-to-bottom scanning order.
Step 203: The super-division display apparatus determines the target pixel region in the first image based on the residual block.
In the embodiments of this application, the target pixel region is the region in the first image where the pixels corresponding to the positions of the target residual points in the residual block are located; a pixel in the first image corresponding to the position of a target residual point is called a target pixel. Usually, the target residual points include two types: first target residual points and second target residual points. Optionally, the target pixel region includes the region in the first image where the pixels corresponding to the positions of the first target residual points are located. For example, the target pixel region is the region in the first image where the pixels corresponding to the positions of the first and second target residual points are located. A first target residual point is a point in the residual block whose residual value is greater than a specified threshold; a second target residual point is a residual point in the residual block surrounding a first target residual point. A residual point surrounding a first target residual point refers to a residual point arranged around it that satisfies a specified condition, for example the residual points above, below, to the left of, and to the right of the first target residual point; or the residual points above, below, left, right, upper-left, lower-left, upper-right, and lower-right of it. Optionally, the specified threshold is 0.
For example, the second target residual point is a residual point surrounding the first target residual point in the residual block whose residual value is not greater than (i.e., less than or equal to) the specified threshold; that is, it is a surrounding point of the first target residual point that satisfies the specified condition and whose residual value is not greater than the specified threshold. When the specified threshold is 0, the second target residual point is a residual point surrounding the first target residual point whose residual value is 0.
The region where the first target residual points are located is the region where the contents of the first and second images differ; based on this region, the region in the first image that actually needs super-division processing can be found. Since super-dividing a region of the first image usually requires referring to the pixel values of the surrounding region, the surrounding region of the region to be super-divided must also be found; the region where the second target residual points are located is the surrounding region of the region where the first target residual points are located. By determining the second target residual points, the surrounding region of the region in the first image that actually needs super-division can be determined, thereby meeting the requirements of super-division processing and ensuring effective super-division later. The specified condition is determined based on the requirements of super-division processing, for example set based on a receptive field in the super-division model (such as the receptive field of the last convolutional layer).
Assume the first target residual point is a point in the residual block whose residual value is greater than a specified threshold, the second target residual point is a surrounding residual point whose residual value is not greater than the specified threshold, and the specified threshold is 0. Taking the residual block shown in FIG. 6 as an example (FIG. 6 is a schematic diagram explaining the principle of the residual block shown in FIG. 5), the region of the target residual points of the residual block is composed of region K and region M, where region K includes first target residual point P1 and its surrounding second target residual points P7 to P14, and region M includes first target residual points P2 to P6 and their surrounding second target residual points P15 to P19. Therefore, the finally determined first target residual points are P1 to P6, and the second target residual points include P7 to P19. The target pixels determined in the first image are the pixels at the same positions as residual points P1 to P19. FIG. 6 illustrates the case where the surrounding points satisfying the specified condition are the residual points above, below, left, right, upper-left, lower-left, upper-right, and lower-right of each first target residual point, but this is not a limitation.
In the embodiments of this application, the target pixel region can be determined in multiple ways. The embodiments of this application describe the following two determination ways as examples:
In a first determination way, the target pixel region is determined inside the super-division model. As shown in FIG. 7, the process of determining the target pixel region in the first image based on the residual block includes:
Step 2031: The super-division display apparatus generates a mask pattern based on the residual block.
The mask pattern is used to indicate the position of the target pixel region and guides the selection of the target pixel region in the first image. The mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of a plurality of target residual points in the residual block. The plurality of target residual points include at least the first target residual points; usually, they include the first and second target residual points. That is, the plurality of first mask points identify the positions of the plurality of target residual points in the residual block; since the target residual points of the residual block correspond one-to-one in position to the target pixels of the target pixel region in the first image, the first mask points identify the position of the target pixel region in the first image. Through the first mask points, the target pixel region where the target pixels are located can be found.
The embodiments of this application illustrate step 2031 with the following two optional implementations:
In a first optional implementation, morphological transformation processing may be performed on the residual block to obtain the mask pattern. The morphological transformation processing includes binarization processing and dilation processing. For example, binarization is first performed on the residual block to obtain a binarized residual block; then dilation is performed on the binarized residual block to obtain a dilated residual block, which serves as the mask pattern.
Binarization processing is a processing method that sets the pixel value of each pixel in an image to one of two values, a first value and a second value, which are different. After binarization, the image only contains pixels of the two values. Binarization can reduce the interference of various elements in the image with subsequent image processing.
The residual value of each residual point in the residual block is the absolute value of the difference between the pixel values of two pixels; the residual block is equivalent to the difference image of two images. Therefore, the residual block can also be regarded as an image: the residual points it contains correspond to the pixels of an image, and the residual values to pixel values.
For example, in the RGB color space, since the residual value of each residual point includes the values of the three color components R, G, and B, namely the aforementioned Residual[R], Residual[G], and Residual[B], the residual value of a residual point can be characterized by its gray value to simplify calculation. The gray value of a residual point reflects its brightness and can be converted from its R, G, and B values, i.e., from Residual[R], Residual[G], and Residual[B]; the conversion can refer to the traditional conversion of R, G, and B values into a gray value, which is not repeated here. The gray value of a residual point generally ranges from 0 to 255, with white residual points at 255 and black at 0. When binarizing the residual block, it can be determined whether the residual value of each residual point is greater than a binarization threshold (for example, the binarization threshold may be a fixed value or a variable value; when variable, it may be determined by local adaptive binarization). When the residual value of a residual point is greater than the binarization threshold, it is set to the first value; when less than or equal to the threshold, it is set to the second value.
For example, in the YUV color space, since the residual value of each residual point is a single value, namely the aforementioned Residual[Y], binarization of the residual block can determine whether the residual value of each residual point is greater than a binarization threshold (again fixed or variable, with local adaptive binarization in the variable case). When the residual value is greater than the threshold, it is set to the first value; otherwise, it is set to the second value.
For example, the first value is nonzero and the second value is 0; e.g., the first value is 255 and the second value 0, or the first value 1 and the second value 0. In practice, smaller values occupy less storage space, so the first value is usually set to 1 and the second to 0 to save storage.
The binarization threshold can be determined based on the aforementioned specified threshold. For example, in the RGB color space, since what is compared with the binarization threshold is the gray value of a residual point, converted from its residual value, the binarization threshold is obtained by converting the specified threshold with the same conversion rule; in the YUV color space, what is compared is the residual value itself, so the binarization threshold equals the specified threshold. For example, the specified threshold may be 0, and accordingly the binarization threshold may be 0.
The binarized residual block obtained in this way has only two values, reducing subsequent computational complexity; moreover, the residual points with the first value are exactly the aforementioned first target residual points, so the first target residual points can be located simply and quickly in subsequent processing.
The binarization in the embodiments of this application may adopt any of the global threshold method, the local threshold method, the maximum between-class variance (Otsu) method, and the iterative threshold method, which is not limited in the embodiments of this application.
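A minimal binarization sketch, assuming a fixed global threshold and the value convention discussed above (first value 1, second value 0 by default):

```python
import numpy as np

def binarize(residual_block, threshold=0, first_value=1, second_value=0):
    """Set residual points above `threshold` to the first value, all others to the second value."""
    out = np.full(residual_block.shape, second_value, dtype=np.uint8)
    out[residual_block > threshold] = first_value
    return out
```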
Dilation processing is a local-maximum operation. The image to be processed is convolved with a preset kernel; in each convolution step, the maximum value in the region covered by the kernel is assigned to a specified pixel, making bright points brighter; the resulting effect is that the bright regions of the image dilate. The kernel has a definable anchor point, usually its center point; the specified pixel is that anchor point.
As shown in FIG. 8, assume the image to be processed is F1, which includes 5×5 pixels, with the shading indicating bright points; the kernel is the shaded part of F2, 5 pixels in total, with the anchor at their center point B; the final dilated image is F3. In FIG. 8, "*" denotes convolution.
In the embodiments of this application, the image to be processed is the residual block; dilating the residual block means dilating the first target residual points of the binarized residual block. To meet the requirements of the aforementioned dilation processing: if the first value is nonzero and the second value is 0, the first target residual points are bright points and the other residual points (those other than the first target residual points) are dark points, so the dilation can be performed directly to dilate the region of the first target residual points; if the first value is 0 and the second value is nonzero, the first value can be updated to a nonzero value and the second value to 0 by a specified algorithm, making the first target residual points bright points, and then the dilation is performed. It should be noted that, referring to FIGS. 8 and 9, if the residual block is F1 and the dilated residual block is F3, the residual points corresponding to the diagonal hatching in F1 and F3 are the first target residual points, and those corresponding to the "×"-shaped hatching in F3 are the second target residual points. The kernel in FIG. 9 is F4, different from F2 in FIG. 8, so the resulting dilated residual block F3 is different, i.e., the mask pattern is different.
In the embodiments of this application, the super-division model includes at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-division model. The last convolutional layer is the output layer of the super-division model, from which the super-divided image is output; its receptive field is the largest of the receptive fields of the convolutional layers in the super-division model. A mask pattern obtained in this way is adapted to the size of the largest receptive field of the super-division model, thereby providing good guidance, avoiding the situation where the convolvable region of an image subsequently input into the super-division model is too small to be convolved by a convolutional layer, and ensuring effective super-division of subsequently input images. For example, the kernel size may be 3×3 or 5×5 pixels.
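A dilation sketch using OpenCV, assuming a 3×3 kernel to match a 3×3 last-layer receptive field:

```python
import cv2
import numpy as np

binary = np.zeros((5, 5), np.uint8)
binary[2, 2] = 1                      # a single first target residual point (bright point)
kernel = np.ones((3, 3), np.uint8)    # sized to the last conv layer's receptive field
mask = cv2.dilate(binary, kernel)     # the anchor defaults to the kernel center
# `mask` now also marks the 8-neighborhood of the point, i.e. the second target residual points.
```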
In a second optional implementation: the principle of the aforementioned dilation processing is similar to an m-neighborhood algorithm. A neighborhood is an open region centered on a target point, and the m-neighborhood algorithm obtains, centered on a target point, the m points adjacent to it. Here m is the size (in points) of the receptive field of the last convolutional layer of the super-division model minus 1; for example, if the receptive field of the last convolutional layer is 3×3 pixels, then m = 8. As shown in FIGS. 8 and 9, the dilation in FIG. 8 is equivalent to taking the 4-neighborhood of each first target residual point in F1, and the dilation in FIG. 9 to taking the 8-neighborhood. In the embodiments of this application, since the receptive field of the last convolutional layer of the super-division model is usually 3×3 or 5×5 pixels, m = 8 or 24. When m = 8, the 8-neighborhood refers to the 8 points above, below, left, right, upper-left, lower-left, upper-right, and lower-right of the target point; when m = 24, the 24-neighborhood refers to those 8 points plus the 16 points in the ring around them, 24 points in total.
Step 2031 then includes: first binarizing the residual block to obtain a binarized residual block; then taking the m-neighborhood of each first target residual point in the binarized residual block, and filling the residual value of each target residual point (i.e., the aforementioned first value) into the corresponding m-neighborhood to obtain the mask pattern. The binarization process can refer to the foregoing implementation, and the m-neighborhood computation to related techniques, which are not repeated here.
It should be noted that in the embodiments of this application, the process of generating the mask pattern based on the residual block in step 2031 can also be implemented in other ways, for example: generating, based on the residual block, an initial mask pattern including a plurality of mask points in one-to-one correspondence with the positions of the plurality of pixels of the first image, the plurality of mask points including a plurality of first mask points and a plurality of second mask points; that is, the initial mask pattern is a pattern in which the first and second mask points have been determined; then assigning the mask values of the first mask points the first value and those of the second mask points the second value to obtain the mask pattern, the first and second values being different. A mask pattern obtained in this way is a binarized image; when locating target pixels later, the first mask points can be located by searching for the first value and then indexed to the target pixels, enabling fast determination of the target pixels.
In the embodiments of this application, the residual block can be processed as a whole to generate the mask pattern, or the residual block can be divided into multiple sub-residual blocks, each processed to obtain a sub-mask pattern, with the generated sub-mask patterns composing the mask pattern. Since the mask pattern is then generated block by block, computational complexity and cost can be reduced. When the processor has strong computing power, multiple sub-mask pattern generation processes can be executed simultaneously, saving image processing time.
For example, the step of generating the mask pattern based on the residual block may include:
Step A1: Divide the residual block into a plurality of sub-residual blocks and perform block processing on each sub-residual block obtained by the division, the block processing including: when the residual value of at least one residual point included in a sub-residual block is not 0 (i.e., the sub-residual block includes a residual point with a nonzero residual value), dividing the sub-residual block into a plurality of sub-residual blocks and performing block processing on each, until the residual values of the residual points of a divided sub-residual block are all 0 (i.e., it contains no nonzero residual point), or the total number of residual points of a divided sub-residual block is less than a point-number threshold, or the total number of divisions of the residual block reaches a times threshold.
When all residual values of a divided sub-residual block are 0, the regions of the first and second images corresponding to this sub-residual block have no content change, and no further blocking is needed. When the total number of residual points of a divided sub-residual block is less than the point-number threshold, the sub-residual block is already small enough; further division would, on the one hand, make it easy for the super-division model to fail to super-divide effectively if the corresponding target image block is later input into the super-division model (see subsequent step B2), affecting the super-division quality, and on the other hand, over-small sub-residual blocks easily lead to excessive computational cost, so further division is unnecessary. When the total number of divisions of the residual block reaches the times threshold, continuing to divide might, on the one hand, cause excessive computational cost due to too many divisions and, on the other hand, make the divided sub-residual blocks too small, affecting the super-division quality; so division stops, reducing computational cost and guaranteeing the super-division quality. For example, the point-number threshold and times threshold can be determined based on the image resolution of the first image, e.g., positively correlated with it: the larger the image resolution, the larger the thresholds. Usually, the times threshold may be 2 or 3.
Through step A1, cyclic division of the residual block is achieved, finally yielding multiple sub-residual blocks. When the processor has strong computing power and multiple sub-residual blocks all need division, their division processes can be executed simultaneously, saving image processing time.
Both the residual block and the sub-residual blocks can be blocked by binary-tree division or quadtree division. With binary-tree division, a residual block or sub-residual block is divided each time into 2 sub-residual blocks of equal or unequal size; with quadtree division, into 4 sub-residual blocks of equal or unequal size. It should be noted that the residual block and sub-residual blocks can also be divided in other ways, as long as effective blocking is achieved, which is not limited in the embodiments of this application.
Since the traditional video decoding process requires image blocking, usually by quadtree division, using quadtree division for the residual block and sub-residual blocks in the embodiments of this application is compatible with traditional image processing methods. For example, in practice, the image division module used in the video decoding process can also divide the residual block and sub-residual blocks, reusing the module and saving computational cost. As mentioned earlier, the image resolutions of video images are generally 360p, 480p, 720p, 1080p, 2K, 4K, etc., all integer multiples of 4; thus quadtree division can divide a residual block or sub-residual block into four equally sized sub-residual blocks at each division, i.e., exact quarters, making the divided blocks uniform in size and convenient for later processing. Of course, after multiple divisions, a sub-residual block may not be exactly divisible into four equal parts; division can then be kept as even as possible, so that the size difference between any two of the 4 sub-residual blocks of each division is less than or equal to a specified difference threshold, which will not affect the function of the resulting sub-mask patterns.
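A recursive quadtree-division sketch implementing the three stopping conditions of step A1; the threshold values are illustrative, and recursion depth is used as a stand-in for the total division count:

```python
import numpy as np

def quadtree_divide(block, y=0, x=0, depth=0, min_points=64, max_depth=3, out=None):
    """Recursively split a residual block into quadrants and collect the target
    sub-residual blocks (those containing a nonzero residual) with their positions."""
    if out is None:
        out = []
    h, w = block.shape
    if not block.any():          # all residual values are 0: no sub-mask pattern needed
        return out
    if h * w < min_points or depth >= max_depth:   # too small, or divided often enough
        out.append((y, x, block))
        return out
    mh, mw = h // 2, w // 2      # near-equal quadrants (exact quarters when divisible)
    for dy, dx, sub in ((0, 0, block[:mh, :mw]), (0, mw, block[:mh, mw:]),
                        (mh, 0, block[mh:, :mw]), (mh, mw, block[mh:, mw:])):
        quadtree_divide(sub, y + dy, x + dx, depth + 1, min_points, max_depth, out)
    return out
```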
Step A2: Generate a sub-mask pattern corresponding to each target residual block; the aforementioned mask pattern includes the generated sub-mask patterns. A target residual block is one in which the residual value of at least one residual point is not 0.
In the embodiments of this application, since the residual values of all residual points of residual blocks other than the target residual blocks are 0, the regions of the first and second images corresponding to those sub-residual blocks have no content change, there is no need to super-divide the corresponding regions of the first image, and hence no need to obtain sub-mask patterns to guide the target pixel region there. Thus, compared with generating the mask pattern as a whole, generating it block by block avoids processing the residual blocks other than the target residual blocks, generating sub-mask patterns only for the target residual blocks and reducing computational cost.
The generation of each sub-mask pattern can refer to the two optional implementations in step 2031. For example, morphological transformation is performed on each target sub-residual block to obtain the corresponding sub-mask pattern; specifically, the target sub-residual block is first binarized, then the binarized target sub-residual block is dilated, and the dilated target sub-residual block serves as the sub-mask pattern. As another example, each target sub-residual block is first binarized; then the m-neighborhood of each first target residual point in each binarized target sub-residual block is taken, and the residual value of each target residual point is filled into the corresponding m-neighborhood to obtain the corresponding sub-mask pattern. The processing of each target sub-residual block can refer to FIGS. 8 and 9 and is not repeated here.
Optionally, the size of each sub-mask pattern is the same as that of the corresponding target sub-residual block.
Step 2032: The super-division display apparatus inputs the mask pattern and the first image into the super-division model, and the super-division model determines the region in the first image where the pixels corresponding to the positions of the plurality of first mask points are located as the target pixel region.
In step 2032, the super-division display apparatus determines the target pixel region through the super-division model. A traditional super-division model only super-divides the received image; in the embodiments of this application, code for determining the target pixel region can be added at the front end (i.e., the input end) of a traditional super-division model, so that the super-division display apparatus only needs to input the mask pattern and the first image into the super-division model, reducing the computational complexity of the modules outside the super-division model in the super-division display apparatus.
As can be seen from the foregoing embodiments, the mask pattern includes a plurality of mask points in one-to-one correspondence with the positions of the plurality of pixels of the first image, each mask point having one mask value. In one optional example, the mask pattern is a binarized image, the plurality of mask points including a plurality of first mask points and a plurality of second mask points, the mask value of a first mask point being the first value and that of a second mask point the second value, the two values being different. As mentioned, the first and second values may respectively be one of a nonzero value and 0; the embodiments of this application assume the first value is nonzero (e.g., 1) and the second value is 0. In another optional example, the mask pattern is a monochrome image, and the plurality of mask points include only the plurality of first mask points.
In step 2032, the super-division model traverses the mask points in the mask pattern and determines, in the first image, the region where the pixels corresponding to the mask points whose mask value is the first value (i.e., the target pixels) are located as the target pixel region.
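A sketch of this traversal, assuming a binarized mask pattern whose first value is 1:

```python
import numpy as np

def target_pixel_coords(mask, first_value=1):
    """Traverse the mask pattern and return the coordinates of the target pixels,
    i.e. the pixels of the first image whose mask value equals the first value."""
    ys, xs = np.nonzero(mask == first_value)
    return list(zip(ys.tolist(), xs.tolist()))
```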
In the second determination manner, the target pixel region is determined outside the super-resolution model. As shown in FIG. 10, the process of determining the target pixel region in the first image includes:
Step 2033: The super-resolution display apparatus generates the mask pattern based on the residual block.
The mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of the plurality of target residual points in the residual block. For the process of step 2033, refer to the foregoing step 2031; details are not described in this embodiment of this application.
Step 2034: The super-resolution display apparatus determines the region of the first image in which the pixels corresponding to the positions of the plurality of first mask points (that is, the target pixels) are located as the target pixel region.
Referring to the foregoing step 2032, the super-resolution display apparatus may traverse the mask points in the mask pattern and determine, in the first image, the pixels corresponding to the mask points whose mask value is the first value as the target pixel region.
In both of the foregoing determination manners, the super-resolution display apparatus determines the target pixel region from the first image under the guidance of the mask pattern, masking out the pixels other than the target pixel region, so that the target pixel region is located quickly, effectively saving image processing time.
In this embodiment of this application, the mask pattern may have a plurality of shapes. In a first example, the mask pattern includes only the plurality of first mask points, that is, it consists of the first mask points; such a mask pattern is usually irregular. In a second example, the mask pattern includes only a plurality of sub-mask patterns, that is, it consists of the sub-mask patterns, each of which is the same size as its corresponding target residual block and includes both first mask points and second mask points; such a mask pattern is usually an irregular shape stitched from sub-mask patterns. In a third example, the mask pattern is the same size as the first image and includes both first mask points and second mask points. Generally, the memory in the super-resolution display apparatus stores graphic data as one-dimensional or multi-dimensional arrays. If the mask pattern is that of the first example, the data must be stored at pixel granularity, and the storage complexity is high. If the mask pattern is that of the second example, the data is stored at pixel-block granularity (the size of one pixel block being the size of one of the foregoing sub-mask patterns); the stored pattern is more regular than in the first example, and the storage complexity is lower. If the mask pattern is that of the third example, the stored pattern is a rectangle, more regular than in the first and second examples, with lower storage complexity. Therefore, the mask pattern usually takes the shape of the second or third example.
Step 204: The super-resolution display apparatus performs super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region.
In this embodiment of this application, the super-resolution display apparatus may perform super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region. Referring to the foregoing step 203, because the target pixel region can be determined in multiple manners, super-resolution processing can correspondingly be performed in multiple manners. This embodiment of this application uses the following two processing manners as examples:
In the first processing manner, corresponding to the first determination manner in the foregoing step 203, the process in which the super-resolution display apparatus super-resolves the target pixel region in the first image to obtain the super-resolved target pixel region may be: the super-resolution display apparatus performs super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
For example, referring to step 2032, because the super-resolution display apparatus has input the mask pattern and the first image into the super-resolution model and determined the target pixel region by using the model, it may continue to use the model to super-resolve the target pixel region in the first image to obtain the super-resolved target pixel region. For the super-resolution processing performed by the model, refer to the related art; details are not described in this embodiment of this application.
In the second processing manner, corresponding to the second determination manner in the foregoing step 203, the process may be: the super-resolution display apparatus inputs the target pixel region of the first image into the super-resolution model, and performs super-resolution processing on it by using the model, to obtain the super-resolved target pixel region.
In this embodiment of this application, a plurality of target image blocks can be filtered out of the first image, so that each target image block is super-resolved. Because the super-resolution processing is performed on the target image blocks, each of which is smaller than the first image, the computational complexity and cost of the super-resolution processing can be reduced. Especially when the super-resolution processing is performed by the super-resolution model, the complexity of the model can be effectively reduced and the efficiency of the super-resolution computation improved.
For example, step 204 may include:
Step B1: The super-resolution display apparatus obtains, from the first image, the target image block corresponding to each sub-mask pattern.
In the first image, the image blocks corresponding to the positions of the sub-mask patterns are determined as the target image blocks.
Step B2: The super-resolution display apparatus separately performs super-resolution processing on the sub-region of the target pixel region contained in each target image block, to obtain the super-resolved target pixel region.
As can be learned from the foregoing content, the mask pattern indicates the position of the target pixel region. Because the mask pattern is divided into a plurality of sub-mask patterns, the target pixel region can likewise be divided into a plurality of sub-regions corresponding to the sub-mask patterns. Further, because each sub-mask pattern corresponds to one target image block, each target image block contains one sub-region of the target pixel region; that is, the plurality of target image blocks obtained above are in one-to-one correspondence with the plurality of sub-regions. Therefore, the super-resolved target pixel region consists of the super-resolved sub-regions of the target pixel region contained in the target image blocks.
For example, the super-resolution processing may be performed by the super-resolution model. Corresponding to the first processing manner in step 204 and referring to the foregoing step 2032, the super-resolution model may first determine the region of the target pixel region corresponding to each target image block, that is, the sub-region of the target pixel region contained in each target image block, and then perform the super-resolution processing. The foregoing step 2032 may then specifically include: the super-resolution display apparatus separately inputs each sub-mask pattern and its corresponding target image block into the super-resolution model (one target image block and its corresponding sub-mask pattern per input), and determines, by using the model, the region of the target image block in which the pixels corresponding to the positions of the plurality of first mask points of the corresponding sub-mask pattern are located as the sub-region of the target pixel region contained in the target image block. Step B2 may specifically include: separately performing, by using the super-resolution model, super-resolution processing on the sub-region of the target pixel region contained in each target image block, to obtain the super-resolved sub-region contained in each target image block, where the super-resolved target pixel region consists of these super-resolved sub-regions. In this way, the image input into the super-resolution model each time is small, which effectively reduces the computational complexity of the model; a model with a relatively simple structure can thus perform the super-resolution computation, lowering model complexity and improving efficiency.
Corresponding to the second processing manner in step 204, step B2 may specifically include: the super-resolution display apparatus separately inputs the sub-region of the target pixel region contained in each target image block into the super-resolution model, and performs super-resolution processing on it by using the model, to obtain the super-resolved sub-region contained in each target image block, where the super-resolved target pixel region consists of these super-resolved sub-regions.
After receiving the super-resolved sub-regions of the target pixel region contained in the target image blocks output by the super-resolution model, the super-resolution display apparatus may stitch (also referred to as combine) them according to the positions of the target image blocks in the first image to obtain the super-resolved target pixel region, which thus consists of the super-resolved sub-regions of the target pixel region contained in the target image blocks.
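Purely as a sketch of steps B1 and B2, assuming an `sr_model` callable that super-resolves one target image block under the guidance of its sub-mask pattern, a fixed integer `scale`, and blocks described by `(y, x, h, w)` tuples; none of these names come from the patent.

```python
import numpy as np

def super_resolve_blocks(frame, blocks, sub_masks, sr_model, scale=2):
    """Super-resolve each target image block and stitch the results into a
    canvas at the scaled positions the blocks occupy in the first image."""
    H, W = frame.shape[:2]
    canvas = np.zeros((H * scale, W * scale) + frame.shape[2:], frame.dtype)
    for (y, x, h, w), mask in zip(blocks, sub_masks):
        block = frame[y:y + h, x:x + w]
        sr_block = sr_model(block, mask)   # one block + one sub-mask per call
        canvas[y * scale:(y + h) * scale, x * scale:(x + w) * scale] = sr_block
    return canvas
```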
It should be noted that the number of pixels in the super-resolved target pixel region is greater than the number of pixels in the target pixel region before super-resolution; that is, the super-resolution processing increases the pixel density of the target pixel region, achieving the super-resolution effect. For example, the target pixel region includes 5×5 pixels before super-resolution and 9×9 pixels afterward.
Step 205: The super-resolution display apparatus determines the other pixel region in the first image.
The other pixel region includes the pixel region other than the target pixel region. In a first optional manner, the other pixel region is the region in the first image in which the pixels other than the target pixel region are located (which can also be understood as directly performing step 206 without a "determination" step as such). In a second optional manner, the other pixel region is the region in the first image in which the pixels other than those corresponding to the first mask points are located. In a third optional manner, the other pixel region is the region in the first image in which the pixels other than auxiliary pixels are located, where the region of the auxiliary pixels includes the target pixel region, but the number of auxiliary pixels in the first image is greater than the number of pixels in the target pixel region (that is, the target pixels), meaning that the area of the auxiliary-pixel region is larger than that of the target pixel region.
When the third optional manner is used, for example, step 205 may include: performing erosion processing on the plurality of first mask points in the mask pattern to obtain an updated mask pattern; determining, in the first image, the pixels corresponding to the positions of the eroded plurality of first mask points as the auxiliary pixels; and determining the region in the first image in which the pixels other than the auxiliary pixels are located as the other pixel region.
Erosion processing is a processing that takes a local minimum. The image to be processed is convolved with a preset kernel (also called the inner kernel); during each convolution, the minimum value in the region covered by the kernel is assigned to a specified pixel, with the effect that the bright regions of the image shrink. The kernel has a definable anchor point, usually the center point of the kernel, and the foregoing specified pixel is this anchor point. It should be noted that the foregoing dilation processing and this erosion processing are not inverses of each other.
As shown in FIG. 11, assume that the image to be processed is F5, which includes 5×5 pixels, where the shaded part represents bright points; the kernel is the shaded part of F6, 5 pixels in total, and the anchor point is the center point D of those 5 pixels; the image after erosion is then F7. In FIG. 11, "*" denotes convolution.
It should be noted that, referring to FIG. 11, if the mask pattern is F5 and the eroded mask pattern is F7, the mask points corresponding to the diagonally hatched shading in F5 and F7 are the first mask points.
As described above, the super-resolution model includes at least one convolution kernel, and the kernel used for the erosion processing has the same size as the receptive field of the last convolutional layer of the super-resolution model, that is, the same size as the kernel used for the foregoing dilation processing. In the updated mask pattern obtained in this way, the region corresponding to the first mask points is in fact the region corresponding to the first mask points of the mask pattern obtained through the foregoing dilation, with the outermost layer of first mask points removed (that is, the region corresponding to the first mask points is shrunk inward). Compared with the mask pattern obtained after the foregoing binarization processing, this updated mask pattern has smoother edges and less noise.
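The inward shrinking described here can be sketched with OpenCV's erosion, assuming the same 3×3 kernel used for dilation; the function name is illustrative.

```python
import numpy as np
import cv2

def shrink_mask(mask: np.ndarray) -> np.ndarray:
    """Erode the first mask points with the kernel used for dilation, so the
    first-mask region loses its outermost layer of points."""
    kernel = np.ones((3, 3), np.uint8)   # same size as the dilation kernel
    return cv2.erode(mask, kernel, iterations=1)
```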
In this embodiment of this application, erosion processing can eliminate edge noise in the image. The other pixel region determined from the updated mask pattern obtained through erosion has clearer edges and less noise than the other pixel region obtained in the foregoing first and second optional manners; in the subsequent pixel update process of step 206, negative effects such as blurred details, dulled edges, graininess, and amplified noise can be reduced, ensuring the final display effect of the super-resolved first image.
In actual implementation, this embodiment of this application may also update the mask pattern in other manners, provided that the region corresponding to the first mask points in the updated mask pattern is the region corresponding to the first mask points of the dilated mask pattern with the outermost layer of first mask points removed, that is, the same effect as erosion can be achieved; this is not limited in this embodiment of this application.
Step 206: The super-resolution display apparatus updates the other pixel region in the first image by using the other pixel region in the super-resolved second image.
Because the number of pixels in the other pixel region of the super-resolved second image is greater than the number of pixels in the other pixel region of the first image, the number of pixels in the updated other pixel region is greater than before the update; that is, the update increases the pixel density of the other pixel region. The display effect of the updated other pixel region is the same as that of a super-resolved other pixel region; the updated other pixel region is therefore equivalent to the super-resolved other pixel region.
As shown in FIG. 12, assume that the other pixel region K1 in the super-resolved second image includes 12×12 pixels, the other pixel region K2 in the first image includes 6×6 pixels, each square represents one pixel, and the other pixel region K1 and the other pixel region K2 occupy the same relative size and position in their respective images. Updating the other pixel region K2 in the first image by using the other pixel region K1 in the super-resolved second image means updating pixel data such as the pixel values and pixel positions of the corresponding pixels in K2 with those of the pixels in K1. Then, referring to FIG. 12, the pixel count, pixel values, and pixel positions in the updated other pixel region K2 of the first image correspond to and are the same as those in the other pixel region K1 of the second image.
The super-resolved first image, that is, the reconstructed first image, includes the super-resolved target pixel region obtained through step 204 and the updated other pixel region obtained through step 206; its display effect is the same as that of a conventionally super-resolved first image.
Because the other pixel region includes the pixel region of the first image other than the target pixel region, the sizes of the target pixel region and the other pixel region may or may not match; correspondingly, the manner of obtaining the super-resolved first image differs. This embodiment of this application uses the following optional manners as examples:
In one optional manner, the sizes of the target pixel region and the other pixel region match, that is, the other pixel region is the pixel region of the first image other than the target pixel region; correspondingly, the sizes of the super-resolved target pixel region and the updated other pixel region also match. The super-resolved first image can then be stitched from the super-resolved target pixel region and the updated other pixel region. In another optional manner, the sizes of the target pixel region and the other pixel region do not match and their edges overlap, that is, the other pixel region includes further pixels beyond the pixel region of the first image other than the target pixel region. Correspondingly, the sizes of the super-resolved target pixel region and the updated other pixel region do not match either, and their edges overlap. Because the other pixel region of the first image is updated from the other pixel region of the super-resolved second image, its pixel data is usually more accurate; the pixel data of the overlapping region of the super-resolved first image is therefore taken from the updated other pixel region of the first image. The super-resolved first image can then be stitched from an updated target pixel region and the updated other pixel region, where, for example, the updated target pixel region is obtained by subtracting (also called removing) from the super-resolved target pixel region its overlap with the other pixel region. The updated target pixel region is shrunk inward relative to the target pixel region before the update, and its size matches that of the updated other pixel region.
It should be noted that, in practical applications, the super-resolved first image may also be obtained directly in the following manners, without considering whether the sizes of the target pixel region and the other pixel region match:
In one optional manner, the super-resolved second image may be used as the background image, and the super-resolved target pixel region obtained in step 204 covers the corresponding region of the second image, yielding the super-resolved first image. In another optional manner, the first image or a blank image may be used as the background image, the super-resolved target pixel region obtained in step 204 covers the corresponding region of the first image, and the other pixel region of the super-resolved second image covers the corresponding region of the first image, yielding the super-resolved first image. It suffices that the final super-resolved first image is identical to, or within an acceptable distance of, the image obtained by super-resolving the first image as a whole; this is not limited in this embodiment of this application.
In this embodiment of this application, the process of generating the super-resolved first image based on the obtained super-resolved target pixel region and the obtained updated other pixel region may also be implemented by the foregoing super-resolution model.
For example, the process of generating the super-resolved first image may satisfy:
R(F(t)) = R(F(t-1)(w,h))[Mask(w,h)=R2] + SR(F(t)(w,h))[Mask(w,h)=R1];
where R(F(t)) denotes the super-resolved first image, (w,h) denotes any point in the image concerned, Mask(w,h) denotes the mask point (w,h) in the mask pattern (for example, the updated mask pattern in the foregoing step 205), and SR denotes super-resolution processing. R(F(t-1)(w,h))[Mask(w,h)=R2] denotes the pixel region of the super-resolved second image in which the pixels (w,h) corresponding to the mask points (w,h) whose mask value is the second value are located, that is, the other pixel region determined in the foregoing step 205. SR(F(t)(w,h))[Mask(w,h)=R1] denotes, within the super-resolved region of the first image (that is, the target pixel region determined in step 203), the pixel region in which the pixels corresponding to the mask points (w,h) whose mask value is the first value are located, that is, all or part of the foregoing super-resolved target pixel region (for example, all of it if the mask-pattern update of step 205 is not performed, or part of it if that update is performed; in the latter case, the target pixel region determined in step 203 is updated along with the mask pattern, the partial region is the foregoing updated target pixel region, and it may be obtained by subtracting from the super-resolved target pixel region determined in step 203 its overlap with the other pixel region). R1 denotes the first value and R2 denotes the second value. For example, the first value is 1 and the second value is 0; or the first value is 255 and the second value is 0.
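A minimal sketch of this composition rule, assuming the mask has already been brought to the super-resolved resolution, R1 = 1, and R2 = 0; the array names are illustrative only.

```python
import numpy as np

def compose_frame(sr_prev, sr_target, mask_hr, r1=1):
    """R(F(t)): take SR(F(t)(w,h)) where Mask(w,h) = R1, and the
    super-resolved previous frame R(F(t-1)(w,h)) where Mask(w,h) = R2."""
    return np.where(mask_hr == r1, sr_target, sr_prev)
```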
Step 207: The super-resolution display apparatus updates the first image by using the super-resolved second image.
Because the inter-frame residual corresponding to the residual block reflects the content change between two adjacent frames, when the residual values of all residual points in the residual block obtained based on the first image and the second image are 0, the content of the first image and the second image has not changed; by the same reasoning, their super-resolved images should not differ either. Therefore, the first image is updated with the super-resolved second image to obtain the super-resolved first image.
It should be noted that the super-resolved second image is an image determined by the image processing method provided in the embodiments of this application, a conventional image processing method, or another super-resolution method. Updating the first image with the super-resolved second image makes the number of pixels in the updated first image greater than before the update; that is, the update increases the pixel density of the first image and achieves the super-resolution effect. The updated first image is therefore also a super-resolved image, that is, it is equivalent to the super-resolved first image.
It should be noted that, when performing step 203, the super-resolution display apparatus may also count a first ratio of the number of residual points whose residual value is 0 to the total number of residual points in the residual block. When the first ratio is greater than a first super-resolution trigger ratio threshold, the target pixel region is determined in the first image based on the residual block; when the first ratio is not greater than the first super-resolution trigger ratio threshold, the first image is super-resolved as a whole in another manner, for example, conventionally. (Alternatively, when the first ratio is greater than or equal to the first super-resolution trigger ratio threshold, the target pixel region is determined in the first image based on the residual block; when the first ratio is less than the threshold, the first image is super-resolved as a whole in another manner.)
Judging whether the first ratio of zero-valued residual points to the total number of residual points in the residual block exceeds the first super-resolution trigger ratio threshold detects whether the content of the two consecutive frames differs significantly. When the first ratio is not greater than the threshold, the two frames differ significantly and are weakly correlated in the time domain; the computation cost of super-resolving the first image directly as a whole (for example, inputting the first image directly into the super-resolution model) is less than or equal to the cost of the foregoing steps 203 to 206, so the first image can be super-resolved directly as a whole, that is, fully super-resolved. When the first ratio is greater than the threshold, the two frames differ little, the cost of super-resolving the first image as a whole exceeds that of steps 203 to 206, and steps 203 to 206 can be performed. In this way, whether to execute the partial super-resolution algorithm can be determined based on the content difference between the first image and the second image, improving the flexibility of image processing.
Similarly, when performing step 203, the super-resolution display apparatus may also count a second ratio of the number of residual points whose residual value is not 0 to the total number of residual points in the residual block. When the second ratio is not greater than a second super-resolution trigger ratio threshold, the target pixel region is determined in the first image based on the residual block; when the second ratio is greater than the threshold, the first image is super-resolved as a whole in another manner, for example, conventionally (alternatively, when the second ratio is greater than the threshold, the target pixel region is determined in the first image based on the residual block; when it is not greater, the first image is super-resolved as a whole in another manner). Judging whether the second ratio exceeds the second super-resolution trigger ratio threshold likewise detects whether the two consecutive frames differ significantly: when the second ratio is greater than the threshold, the frames differ significantly, are weakly correlated in the time domain, and the cost of whole-image super-resolution (for example, inputting the first image directly into the super-resolution model) is less than or equal to that of steps 203 to 206, so the first image can be fully super-resolved directly; when the second ratio is not greater than the threshold, the frames differ little, whole-image super-resolution costs more than steps 203 to 206, and steps 203 to 206 can be performed. In this way, whether to execute the partial super-resolution algorithm can be determined based on the content difference between the first image and the second image, improving the flexibility of image processing.
The foregoing first and second super-resolution trigger ratio thresholds may be the same or different; for example, both are 50%.
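As a sketch of this trigger test only, assuming the 50% threshold given as an example above: compute the first ratio from the residual block and fall back to whole-image super-resolution when the frames differ too much.

```python
import numpy as np

def use_partial_sr(residual_block: np.ndarray, trigger: float = 0.5) -> bool:
    """Return True when the share of zero-valued residual points exceeds the
    first super-resolution trigger ratio, i.e. the frames differ little."""
    first_ratio = np.mean(residual_block == 0)
    return first_ratio > trigger
```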
As described above, in this embodiment of this application, the image resolution of the images in the video may be 360p, 480p, 720p, 1080p, 2K, 4K, and the like. The image resolutions used as examples in the foregoing embodiments are all small, for example, assuming that the first image and the second image each include 5×5 pixels; this is only for ease of understanding and does not limit the actual image resolution to the resolutions in the foregoing examples.
In the foregoing embodiments, inputting an image or an image region into the super-resolution model means inputting the pixel data of the pixels of the image or image region into the model.
The order of the steps of the image processing method provided in the embodiments of this application can be appropriately adjusted, and steps can be added or removed as required. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application, and details are therefore not described. The foregoing steps 201 to 207 may all be executed under the control of the processor shown in FIG. 1.
Referring to FIG. 13, the partial super-resolution method provided in the embodiments of this application in fact divides the first image into a target pixel region H1 and an other pixel region H2 that are processed separately (in actual implementation, the boundaries of the target pixel region H1 determined in step 203 and the other pixel region H2 determined in step 205 may overlap; FIG. 13 uses the case where their shapes match and there is no overlap as an example). By determining the target pixel region H1 in the first image and super-resolving it, super-resolution is applied to the region of pixels in which the first image differs from the previous frame; and by updating the other pixel region H2 of the first image with the other pixel region of the super-resolved previous frame, the same effect as super-resolving the other pixel region is achieved, making full use of the temporal redundancy of video. Therefore, by super-resolving only part of the first image, the effect of fully super-resolving it is achieved, reducing the actual amount of super-resolution computation and lowering its cost.
On a test video processed with the embodiments of this application, about 45% of the super-resolution computation can be saved relative to the conventional technique of fully super-resolving the video directly. This significant reduction in computation, on one hand, helps speed up video processing and meet the basic frame-rate requirement, guaranteeing real-time playback without delays or stutters; on the other hand, less computation means fewer processing tasks and less consumption for the computing units in the super-resolution display apparatus, lowering overall power consumption and saving electricity.
Moreover, the partial super-resolution algorithm proposed in some embodiments of this application is not a method that trades quality for efficiency by super-resolving only part of the image and processing the rest by non-super-resolution means; rather, it avoids repeatedly super-resolving the unchanged regions and redundant temporal information of consecutive video frames, and is in essence a method that pursues maximum information utilization. For a first image processed with the partial super-resolution algorithm, the mask pattern is set to guide the super-resolution model to super-resolve with pixel-level precision. In the final processed video, all pixel values of every frame in fact come from super-resolution computation results; the display effect is the same as that of the conventional full super-resolution algorithm, and no display quality is sacrificed.
Furthermore, if the sub-mask patterns and target image blocks are input and the super-resolution model performs the super-resolution processing, the complexity of each super-resolution pass of the model is low, and the structural complexity required of the model is low; the model can therefore be simplified, the performance required of the processor reduced, and super-resolution efficiency improved.
The following are apparatus embodiments of this application, which can be used to perform the method embodiments of this application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of this application.
Referring to FIG. 14, FIG. 14 is a block diagram of an image processing apparatus 300. The apparatus includes:
an obtaining module 301, configured to obtain an inter-frame residual between a first image and an adjacent previous frame to obtain a residual block, where the residual block includes a plurality of residual points in one-to-one correspondence with the positions of a plurality of pixels of the first image, and each residual point has one residual value;
a first determining module 302, configured to determine a target pixel region in the first image based on the residual block;
a partial super-resolution module 303, configured to perform super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region; and
an updating module 304, configured to update an other pixel region in the first image by using the other pixel region in the super-resolved previous frame, where the other pixel region includes the pixel region of the first image other than the target pixel region;
where the super-resolved first image includes the super-resolved target pixel region and the updated other pixel region.
Optionally, the target pixel region is the region of the first image in which the pixels corresponding to the positions of first target residual points and second target residual points are located, the first target residual points are points in the residual block whose residual values are greater than a specified threshold, and the second target residual points are residual points in the residual block surrounding the first target residual points.
In this embodiment of this application, the first determining module determines the target pixels in the first image, and the partial super-resolution module super-resolves them, thereby super-resolving the region of pixels in which the first image differs from the previous frame; the updating module updates the other pixel region of the first image with the other pixel region of the super-resolved previous frame, making full use of the temporal redundancy of video. Therefore, by super-resolving only part of the first image, the effect of fully super-resolving it is achieved, reducing the amount and cost of super-resolution computation.
As shown in FIG. 15, in an optional manner, the first determining module 302 includes:
a generation submodule 3021, configured to generate a mask pattern based on the residual block, where the mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of a plurality of target residual points in the residual block; and
a determining submodule 3022, configured to input the mask pattern and the first image into a super-resolution model, and determine, by using the super-resolution model, the region of the first image in which the pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
Correspondingly, the partial super-resolution module 303 is configured to: perform super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
In another optional manner, as shown in FIG. 15, the first determining module 302 includes:
a generation submodule 3021, configured to generate a mask pattern based on the residual block, where the mask pattern includes a plurality of first mask points in one-to-one correspondence with the positions of a plurality of target residual points in the residual block; and
a determining submodule 3022, configured to determine the region of the first image in which the pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
Correspondingly, the partial super-resolution module 303 is configured to:
input the target pixels of the first image into a super-resolution model, and perform super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
Optionally, the mask pattern includes a plurality of mask points in one-to-one correspondence with the positions of the plurality of pixels of the first image, each mask point has one mask value, the plurality of mask points include the plurality of first mask points and a plurality of second mask points, the mask value of a first mask point is a first value, the mask value of a second mask point is a second value, and the first value and the second value are different.
In both of the foregoing optional manners, the determining submodule 3022 may be configured to:
traverse the mask points in the mask pattern, and determine, in the first image, the pixels corresponding to the mask points whose mask value is the first value as the target pixels.
Optionally, the generation submodule 3021 is configured to:
perform morphological transformation processing on the residual block to obtain the mask pattern, where the morphological transformation processing includes binarization processing and dilation processing of the first mask points in the binarized residual block, the super-resolution model includes at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-resolution model.
Optionally, the generation submodule 3021 is configured to:
divide the residual block into a plurality of sub-residual blocks, and perform blocking processing on each sub-residual block obtained through division, where the blocking processing includes:
when the residual value of at least one residual point included in a sub-residual block is not 0, dividing the sub-residual block into a plurality of sub-residual blocks, and performing the blocking processing on each sub-residual block obtained through division, until the residual values of the residual points included in a sub-residual block obtained through division are all 0, or the total number of residual points in a sub-residual block obtained through division is less than a point-count threshold, or the total number of divisions of the residual block reaches a division-count threshold; and
generate a sub-mask pattern corresponding to each target residual block, where a target residual block includes at least one residual point whose residual value is not 0;
where the mask pattern includes the generated sub-mask patterns.
Optionally, the partial super-resolution module 303 is configured to:
obtain, from the first image, the target image block corresponding to each of the sub-mask patterns; and
separately perform super-resolution processing on the sub-region of the target pixel region contained in each of the target image blocks, to obtain the super-resolved target pixel region, which consists of the super-resolved sub-regions of the target pixel region contained in the target image blocks.
Optionally, the residual block and the sub-residual blocks are both blocked by quadtree division.
Optionally, as shown in FIG. 16, the apparatus 300 further includes:
an erosion module 305, configured to: before the pixel values of the other pixel region in the first image are updated by using the pixel values of the pixels in the previous frame corresponding to the positions of the other pixel region, perform erosion processing on the plurality of first mask points in the mask pattern to obtain an updated mask pattern, where the kernel of the erosion processing has the same size as the receptive field of the last convolutional layer of the super-resolution model;
a second determining module 306, configured to determine, in the first image, the pixels corresponding to the positions of the eroded plurality of first mask points as auxiliary pixels; and
a third determining module 307, configured to determine the region of the first image in which the pixels other than the auxiliary pixels are located as the other pixel region.
Optionally, the first determining module 302 is configured to:
count a first ratio of the number of residual points whose residual value is 0 to the total number of residual points in the residual block; and
when the first ratio is greater than a first super-resolution trigger ratio threshold, determine the target pixel region in the first image based on the residual block.
Optionally, the super-resolution model may be a CNN model, such as SRCNN or ESPCN; it may also be a GAN, such as SRGAN or ESRGAN.
In this embodiment of this application, the first determining module determines the target pixels in the first image, and the partial super-resolution module super-resolves them, thereby super-resolving the region of pixels in which the first image differs from the previous frame; the updating module updates the other pixel region of the first image with the other pixel region of the super-resolved previous frame, making full use of the temporal redundancy of video. Therefore, by super-resolving only part of the first image, the effect of fully super-resolving it is achieved, reducing the amount and cost of super-resolution computation.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the apparatus and modules described above, refer to the corresponding processes in the foregoing method embodiments; details are not described again herein.
In addition, the modules in the foregoing apparatus may be implemented by software or by a combination of software and hardware. When at least one module is hardware, the hardware may be a logic integrated circuit module, which may specifically include transistors, logic gate arrays, arithmetic logic circuits, or the like. When at least one module is software, the software exists in the form of a computer program product, is stored in a computer-readable storage medium, and may be executed by a processor. Therefore, alternatively, the image processing apparatus may be implemented by a processor executing a software program; this embodiment does not limit this.
An embodiment of this application provides an electronic device, including a processor and a memory;
the memory is configured to store a computer program; and
the processor is configured to implement any one of the image processing methods of this application when executing the computer program stored in the memory.
FIG. 17 is a schematic structural diagram of an electronic device 400 involved in the image processing method. The electronic device 400 may be, but is not limited to, a laptop computer, a desktop computer, a mobile phone, a smartphone, a tablet computer, a multimedia player, an e-reader, a smart in-vehicle device, a smart home appliance (such as a smart TV), an artificial intelligence device, a wearable device, an Internet of Things device, or a virtual reality/augmented reality/mixed reality device. For example, the electronic device 400 may include the structure of the super-resolution display apparatus 100 shown in the foregoing FIG. 1.
The electronic device 400 may include a processor 410, an external memory interface 420, an internal memory 421, a universal serial bus (USB) interface 430, a charging management module 440, a power management module 441, a battery 442, an antenna 1, an antenna 2, a mobile communication module 450, a wireless communication module 460, an audio module 470, a speaker 470A, a receiver 470B, a microphone 470C, a headset jack 470D, a sensor module 480, a button 490, a motor 491, an indicator 492, a camera 493, a display 494, a subscriber identification module (SIM) card interface 495, and the like. The sensor module 480 may include one or more of a pressure sensor 480A, a gyroscope sensor 480B, a barometric pressure sensor 480C, a magnetic sensor 480D, an acceleration sensor 480E, a distance sensor 480F, an optical proximity sensor 480G, a fingerprint sensor 480H, a temperature sensor 480J, a touch sensor 480K, an ambient light sensor 480L, a bone conduction sensor 480M, and the like.
It can be understood that the structure illustrated in this embodiment of this application does not constitute a specific limitation on the electronic device 400. In some other embodiments of this application, the electronic device 400 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented by hardware, software, or a combination of software and hardware.
It can be understood that the interface connection relationships between the modules illustrated in this embodiment of this application are merely schematic and do not constitute a structural limitation on the electronic device 400. In some other embodiments of this application, the electronic device 400 may also use interface connection manners different from those in the foregoing embodiments (for example, a bus connection), or a combination of multiple interface connection manners.
The processor 410 may include one or more processing units, for example, a central processing unit CPU (such as an application processor (AP)) and a graphics processing unit (GPU), and may further include a modem processor, an image signal processor (ISP), an MCU, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), among others. Different processing units may be independent devices or may be integrated into one or more processors.
A memory may also be disposed in the processor 410 to store instructions and data. In some embodiments, the memory in the processor 410 is a cache, which may store instructions or data that the processor 410 has just used or uses cyclically. If the processor 410 needs to use the instructions or data again, it can call them directly from the memory, avoiding repeated access, reducing the waiting time of the processor 410, and thereby improving system efficiency.
In some embodiments, the processor 410 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, among others.
The I2C interface is a bidirectional synchronous serial bus including a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 410 may include multiple groups of I2C buses and may be separately coupled to the touch sensor 480K, a charger, a flash, the camera 493, and the like through different I2C bus interfaces. For example, the processor 410 may be coupled to the touch sensor 480K through the I2C interface, so that the processor 410 and the touch sensor 480K communicate through the I2C bus interface, implementing the touch function of the electronic device 400.
The I2S interface may be used for audio communication. In some embodiments, the processor 410 may include multiple groups of I2S buses and may be coupled to the audio module 470 through an I2S bus to implement communication between the processor 410 and the audio module 470. In some embodiments, the audio module 470 may transfer an audio signal to the wireless communication module 460 through the I2S interface, implementing the function of answering calls through a Bluetooth headset.
The PCM interface may also be used for audio communication, sampling, quantizing, and encoding analog signals. In some embodiments, the audio module 470 and the wireless communication module 460 may be coupled through a PCM bus interface. In some embodiments, the audio module 470 may also transfer an audio signal to the wireless communication module 460 through the PCM interface, implementing the function of answering calls through a Bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communication. The bus may be a bidirectional communication bus, converting the data to be transmitted between serial and parallel communication. In some embodiments, the UART interface is usually used to connect the processor 410 and the wireless communication module 460. For example, the processor 410 communicates with the Bluetooth module in the wireless communication module 460 through the UART interface to implement the Bluetooth function. In some embodiments, the audio module 470 may transfer an audio signal to the wireless communication module 460 through the UART interface, implementing the function of playing music through a Bluetooth headset.
The MIPI interface may be used to connect the processor 410 with peripheral devices such as the display 494 and the camera 493. The MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and the like. In some embodiments, the processor 410 communicates with the camera 493 through the CSI interface to implement the photographing function of the electronic device 400, and communicates with the display 494 through the DSI interface to implement the display function of the electronic device 400.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, the GPIO interface may be used to connect the processor 410 with the camera 493, the display 494, the wireless communication module 460, the audio module 470, the sensor module 480, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, or the like.
The USB interface 430 is an interface conforming to the USB standard specification and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 430 may be used to connect a charger to charge the electronic device 400, to transfer data between the electronic device 400 and peripheral devices, or to connect a headset and play audio through it. The interface may also be used to connect other electronic devices, such as AR devices.
The charging management module 440 is configured to receive charging input from a charger, which may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 440 may receive the charging input of a wired charger through the USB interface 430. In some wireless charging embodiments, the charging management module 440 may receive wireless charging input through the wireless charging coil of the electronic device 400. While charging the battery 442, the charging management module 440 may also supply power to the electronic device through the power management module 441.
The power management module 441 is configured to connect the battery 442 and the charging management module 440 to the processor 410. The power management module 441 receives input from the battery 442 and/or the charging management module 440 and supplies power to the processor 410, the internal memory 421, the display 494, the camera 493, the wireless communication module 460, and the like. The power management module 441 may also be used to monitor parameters such as battery capacity, battery cycle count, and battery state of health (leakage, impedance). In some other embodiments, the power management module 441 may also be disposed in the processor 410. In still other embodiments, the power management module 441 and the charging management module 440 may also be disposed in the same device.
Optionally, the wireless communication function of the electronic device 400 may be implemented through the antenna 1, the antenna 2, the mobile communication module 450, the wireless communication module 460, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 400 may be used to cover a single communication frequency band or multiple bands, and different antennas may also be multiplexed to improve antenna utilization. For example, the antenna 1 may be multiplexed as a diversity antenna of the wireless local area network. In some other embodiments, the antennas may be used in combination with tuning switches.
The mobile communication module 450 may provide wireless communication solutions applied on the electronic device 400, including 2G/3G/4G/5G. The mobile communication module 450 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 450 may receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 450 may also amplify the signal modulated by the modem processor and convert it into an electromagnetic wave for radiation through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 450 may be disposed in the processor 410. In some embodiments, at least some functional modules of the mobile communication module 450 may be disposed in the same device as at least some modules of the processor 410.
The modem processor may include a modulator and a demodulator. The modulator modulates the low-frequency baseband signal to be sent into a medium- or high-frequency signal; the demodulator demodulates the received electromagnetic wave signal into a low-frequency baseband signal and transmits it to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor, which outputs a sound signal through an audio device (not limited to the speaker 470A, the receiver 470B, and the like) or displays an image or video through the display 494. In some embodiments, the modem processor may be an independent device; in other embodiments, the modem processor may be independent of the processor 410 and disposed in the same device as the mobile communication module 450 or other functional modules.
The wireless communication module 460 may provide wireless communication solutions applied on the electronic device 400, including wireless local area networks (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), the global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and the like. The wireless communication module 460 may be one or more devices integrating at least one communication processing module. The wireless communication module 460 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 410. The wireless communication module 460 may also receive signals to be sent from the processor 410, frequency-modulate and amplify them, and convert them into electromagnetic waves for radiation through the antenna 2.
In some embodiments, the antenna 1 of the electronic device 400 is coupled to the mobile communication module 450 and the antenna 2 is coupled to the wireless communication module 460, so that the electronic device 400 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include the global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, among others. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The electronic device 400 implements the display function through the GPU, the display 494, the application processor, and the like. The GPU is a microprocessor for image processing that connects the display 494 and the application processor. The GPU performs mathematical and geometric computation for graphics rendering. The processor 410 may include one or more GPUs that execute program instructions to generate or change display information.
The display 494 is used to display images, videos, and the like. The display 494 includes a display panel, which may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light emitting diodes (QLED), or the like. In some embodiments, the electronic device 400 may include 1 or N displays 494, where N is a positive integer greater than 1.
The electronic device 400 may implement the photographing function through the ISP, the camera 493, the video codec, the GPU, the display 494, the application processor, and the like.
The ISP is used to process the data fed back by the camera 493. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the camera photosensitive element, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and conversion into an image visible to the naked eye. The ISP may also perform algorithmic optimization of the noise, brightness, and skin tone of the image, and may optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be disposed in the camera 493.
The camera 493 is used to capture still images or videos. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP for conversion into a digital image signal. The ISP outputs the digital image signal to the DSP for processing, and the DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 400 may include 1 or N cameras 493, where N is a positive integer greater than 1.
The digital signal processor is used to process digital signals; in addition to digital image signals, it can process other digital signals. For example, when the electronic device 400 selects a frequency point, the digital signal processor is used to perform a Fourier transform or the like on the frequency-point energy.
The video codec is used to compress or decompress digital video. The electronic device 400 may support one or more video codecs, so that the electronic device 400 can play or record videos in multiple encoding formats, for example, moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
The NPU is a neural-network (NN) computing processor that processes input information quickly by drawing on the structure of biological neural networks, for example the transfer pattern between human brain neurons, and can also learn continuously by itself. Applications such as intelligent cognition of the electronic device 400, for example image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU.
The external memory interface 420 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capability of the electronic device 400. The external memory card communicates with the processor 410 through the external memory interface 420 to implement the data storage function, for example, saving files such as music and videos on the external memory card.
The internal memory 421 may be used to store computer-executable program code, which includes instructions. The internal memory 421 may include a program storage area and a data storage area. The program storage area may store the operating system and the applications required by at least one function (such as a sound playback function and an image playback function); the data storage area may store data created during use of the electronic device 400 (such as audio data and a phone book). In addition, the internal memory 421 may include high-speed random access memory, such as double data rate synchronous dynamic random access memory (DDR), and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 410 executes various functional applications and data processing of the electronic device 400 by running the instructions stored in the internal memory 421 and/or the instructions stored in the memory disposed in the processor.
The electronic device 400 may implement audio functions, such as music playback and recording, through the audio module 470, the speaker 470A, the receiver 470B, the microphone 470C, the headset jack 470D, the application processor, and the like.
The audio module 470 is used to convert digital audio information into an analog audio signal for output and to convert analog audio input into a digital audio signal. The audio module 470 may also be used to encode and decode audio signals. In some embodiments, the audio module 470 may be disposed in the processor 410, or some functional modules of the audio module 470 may be disposed in the processor 410.
The speaker 470A, also called a "horn", is used to convert audio electrical signals into sound signals. The electronic device 400 can play music or hands-free calls through the speaker 470A.
The receiver 470B, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device 400 answers a call or a voice message, the voice can be heard by bringing the receiver 470B close to the ear.
The microphone 470C, also called a "mike" or "mic", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 470C to input the sound signal. The electronic device 400 may be provided with at least one microphone 470C. In other embodiments, the electronic device 400 may be provided with two microphones 470C, which can implement a noise reduction function in addition to collecting sound signals. In still other embodiments, the electronic device 400 may be provided with three, four, or more microphones 470C to collect sound signals, reduce noise, identify sound sources, implement directional recording, and the like.
The headset jack 470D is used to connect wired headsets. The headset jack 470D may be the USB interface 430, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The pressure sensor 480A is used to sense pressure signals and can convert pressure signals into electrical signals. In some embodiments, the pressure sensor 480A may be disposed on the display 494. There are many kinds of pressure sensors 480A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates with conductive material; when a force acts on the pressure sensor 480A, the capacitance between the electrodes changes, and the electronic device 400 determines the strength of the pressure according to the change in capacitance. When a touch operation acts on the display 494, the electronic device 400 detects the strength of the touch operation according to the pressure sensor 480A, and may also calculate the touch position according to the detection signal of the pressure sensor 480A. In some embodiments, touch operations acting on the same touch position but with different strengths may correspond to different operation instructions. For example, when a touch operation whose strength is less than a first pressure threshold acts on the SMS application icon, an instruction to view the SMS message is executed; when a touch operation whose strength is greater than or equal to the first pressure threshold acts on the SMS application icon, an instruction to create a new SMS message is executed.
The gyroscope sensor 480B may be used to determine the motion posture of the electronic device 400. In some embodiments, the angular velocities of the electronic device 400 around three axes (that is, the x, y, and z axes) may be determined through the gyroscope sensor 480B. The gyroscope sensor 480B may be used for image stabilization during shooting: for example, when the shutter is pressed, the gyroscope sensor 480B detects the angle at which the electronic device 400 shakes, calculates from the angle the distance the lens module needs to compensate, and lets the lens counteract the shake of the electronic device 400 through reverse motion, achieving stabilization. The gyroscope sensor 480B may also be used for navigation and motion-sensing game scenarios.
The barometric pressure sensor 480C is used to measure air pressure. In some embodiments, the electronic device 400 calculates the altitude from the pressure value measured by the barometric pressure sensor 480C to assist positioning and navigation.
The magnetic sensor 480D includes a Hall sensor. The electronic device 400 may use the magnetic sensor 480D to detect the opening and closing of a flip leather case. In some embodiments, when the electronic device 400 is a flip phone, the electronic device 400 may detect the opening and closing of the flip according to the magnetic sensor 480D, and set features such as automatic unlocking of the flip according to the detected open or closed state of the leather case or the flip.
The acceleration sensor 480E can detect the magnitude of the acceleration of the electronic device 400 in all directions (generally three axes). When the electronic device 400 is stationary, it can detect the magnitude and direction of gravity. It can also be used to identify the posture of the electronic device and is applied to landscape/portrait switching, pedometers, and the like.
The distance sensor 480F is used to measure distance. The electronic device 400 may measure distance by infrared or laser. In some embodiments, in a shooting scene, the electronic device 400 may use the distance sensor 480F to measure distance for fast focusing.
The optical proximity sensor 480G may include, for example, a light-emitting diode (LED) and a light detector such as a photodiode. The light-emitting diode may be an infrared light-emitting diode. The electronic device 400 emits infrared light outward through the light-emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 400; when insufficient reflected light is detected, the electronic device 400 can determine that there is no object nearby. The electronic device 400 may use the optical proximity sensor 480G to detect the user holding the electronic device 400 close to the ear for a call, so as to automatically turn off the screen to save power. The optical proximity sensor 480G may also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
The ambient light sensor 480L is used to sense the ambient light brightness. The electronic device 400 may adaptively adjust the brightness of the display 494 according to the sensed ambient light brightness. The ambient light sensor 480L may also be used to automatically adjust the white balance when taking photos, and may cooperate with the optical proximity sensor 480G to detect whether the electronic device 400 is in a pocket to prevent accidental touches.
The fingerprint sensor 480H is used to collect fingerprints. The electronic device 400 may use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint call answering, and the like.
The temperature sensor 480J is used to detect temperature. In some embodiments, the electronic device 400 executes a temperature processing policy using the temperature detected by the temperature sensor 480J. For example, when the temperature reported by the temperature sensor 480J exceeds a threshold, the electronic device 400 degrades the performance of a processor located near the temperature sensor 480J to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, the electronic device 400 heats the battery 442 to avoid abnormal shutdown of the electronic device 400 caused by low temperature. In still other embodiments, when the temperature is below yet another threshold, the electronic device 400 boosts the output voltage of the battery 442 to avoid abnormal shutdown caused by low temperature.
The touch sensor 480K is also called a "touch device". The touch sensor 480K may be disposed on the display 494; the touch sensor 480K and the display 494 form a touchscreen, also called a "touch screen". The touch sensor 480K is used to detect touch operations acting on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the type of the touch event, and visual output related to the touch operation may be provided through the display 494. In other embodiments, the touch sensor 480K may also be disposed on the surface of the electronic device 400 at a position different from that of the display 494.
The bone conduction sensor 480M can obtain vibration signals. In some embodiments, the bone conduction sensor 480M can obtain the vibration signal of the vibrating bone of the human vocal part. The bone conduction sensor 480M can also contact the human pulse and receive the blood pressure beat signal. In some embodiments, the bone conduction sensor 480M may also be disposed in a headset and combined into a bone conduction headset. The audio module 470 may parse out a speech signal based on the vibration signal of the vibrating bone of the vocal part obtained by the bone conduction sensor 480M, implementing the speech function; the application processor may parse heart rate information based on the blood pressure beat signal obtained by the bone conduction sensor 480M, implementing the heart rate detection function.
In some other embodiments of this application, the electronic device 400 may also use interface connection manners different from those in the foregoing embodiments; for example, some or all of the foregoing sensors are connected to the MCU, which in turn is connected to the AP.
The button 490 includes a power button, a volume button, and the like. The button 490 may be a mechanical button or a touch button. The electronic device 400 may receive button input and generate key signal input related to user settings and function control of the electronic device 400.
The motor 491 can generate vibration prompts. The motor 491 may be used for incoming-call vibration prompts as well as touch vibration feedback. For example, touch operations acting on different applications (such as photographing and audio playback) may correspond to different vibration feedback effects, and the motor 491 may also produce different vibration feedback effects for touch operations acting on different regions of the display 494. Different application scenarios (for example, time reminders, receiving messages, alarm clocks, and games) may also correspond to different vibration feedback effects. The touch vibration feedback effects may also be customized.
The indicator 492 may be an indicator light, which may be used to indicate the charging status and power changes, and may also be used to indicate messages, missed calls, notifications, and the like.
The SIM card interface 495 is used to connect a SIM card. A SIM card can be inserted into or removed from the SIM card interface 495 to make contact with or separate from the electronic device 400. The electronic device 400 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 495 may support Nano SIM cards, Micro SIM cards, standard SIM cards, and the like. Multiple cards can be inserted into the same SIM card interface 495 at the same time, and the multiple cards may be of the same or different types. The SIM card interface 495 may also be compatible with different types of SIM cards and with external memory cards. The electronic device 400 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 400 uses an eSIM, that is, an embedded SIM card; the eSIM card may be embedded in the electronic device 400 and cannot be separated from it.
The software system of the electronic device 400 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservices architecture, or a cloud architecture. The embodiments of this application use an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 400.
An embodiment of this application further provides an image processing apparatus, including a processor and a memory. When the processor executes the computer program stored in the memory, the image processing apparatus performs the image processing method provided in the embodiments of this application. Optionally, the image processing apparatus may be deployed in a smart TV.
An embodiment of this application further provides a storage medium, which may be a non-volatile computer-readable storage medium. The storage medium stores a computer program, and the computer program instructs a terminal to perform any one of the image processing methods provided in the embodiments of this application. The storage medium may include various media capable of storing program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of this application further provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the image processing method provided in the embodiments of this application. The computer program product may include one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer-readable storage medium may be any usable medium accessible by the computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
An embodiment of this application further provides a chip, for example, a CPU chip, including one or more physical cores and a storage medium. The one or more physical cores implement the foregoing image processing method after reading the computer instructions in the storage medium. In other embodiments, the chip may implement the foregoing image processing method in pure hardware or in a combination of software and hardware; that is, the chip includes a logic circuit, and when the chip runs, the logic circuit is used to implement any one of the image processing methods of the foregoing first aspect. The logic circuit may be a programmable logic circuit. Similarly, a GPU may be implemented in the same way as the CPU.
A person of ordinary skill in the art can understand that all or some of the steps for implementing the foregoing embodiments may be completed by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the embodiments of this application, "A refers to B" means that A is the same as B, or that A is a simple variant of B.
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (27)

  1. An image processing method, wherein the method comprises:
    obtaining an inter-frame residual between a first image and an adjacent previous frame to obtain a residual block, wherein the residual block comprises a plurality of residual points in one-to-one correspondence with positions of a plurality of pixels of the first image, and each residual point has one residual value;
    determining a target pixel region in the first image based on the residual block;
    performing super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region; and
    updating an other pixel region in the first image by using the other pixel region in the super-resolved previous frame, wherein the other pixel region comprises a pixel region of the first image other than the target pixel region;
    wherein the super-resolved first image comprises the super-resolved target pixel region and the updated other pixel region.
  2. The method according to claim 1, wherein the determining a target pixel region in the first image based on the residual block comprises:
    generating a mask pattern based on the residual block, wherein the mask pattern comprises a plurality of first mask points in one-to-one correspondence with positions of a plurality of target residual points in the residual block; and
    inputting the mask pattern and the first image into a super-resolution model, and determining, by using the super-resolution model, a region of the first image in which pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
  3. The method according to claim 2, wherein the performing super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region comprises:
    performing super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
  4. The method according to claim 1, wherein the determining a target pixel region in the first image based on the residual block comprises:
    generating a mask pattern based on the residual block, wherein the mask pattern comprises a plurality of first mask points in one-to-one correspondence with positions of a plurality of target residual points in the residual block; and
    determining a region of the first image in which pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
  5. The method according to claim 4, wherein the performing super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region comprises:
    inputting the target pixel region of the first image into a super-resolution model, and performing super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
  6. The method according to any one of claims 2 to 5, wherein the generating a mask pattern based on the residual block comprises:
    generating, based on the residual block, an initial mask pattern comprising a plurality of mask points, wherein the plurality of mask points are in one-to-one correspondence with the positions of the plurality of pixels of the first image, and comprise the plurality of first mask points and a plurality of second mask points; and
    assigning a first value to the mask values of the first mask points in the initial mask pattern and a second value to the mask values of the second mask points in the mask pattern, to obtain the mask pattern, wherein the first value and the second value are different; and
    the determining, in the first image, the pixels corresponding to the positions of the plurality of first mask points as the target pixel region comprises:
    traversing the mask points in the mask pattern, and determining, in the first image, the pixels corresponding to the mask points whose mask value is the first value as the target pixel region.
  7. The method according to any one of claims 2 to 5, wherein the generating a mask pattern based on the residual block comprises:
    performing morphological transformation processing on the residual block to obtain the mask pattern, wherein the morphological transformation processing comprises binarization processing and dilation processing of the first mask points in the binarized residual block, the super-resolution model comprises at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-resolution model.
  8. The method according to any one of claims 2 to 7, wherein the generating a mask pattern based on the residual block comprises:
    dividing the residual block into a plurality of sub-residual blocks, and performing blocking processing on each sub-residual block obtained through division, wherein the blocking processing comprises:
    when the residual value of at least one residual point comprised in a sub-residual block is not 0, dividing the sub-residual block into a plurality of sub-residual blocks, and performing the blocking processing on each sub-residual block obtained through division, until the residual values of the residual points comprised in a sub-residual block obtained through division are all 0, or the total number of residual points in a sub-residual block obtained through division is less than a point-count threshold, or the total number of divisions of the residual block reaches a division-count threshold; and
    generating a sub-mask pattern corresponding to each target residual block, wherein a target residual block comprises at least one residual point whose residual value is not 0;
    wherein the mask pattern comprises the generated sub-mask patterns.
  9. The method according to claim 8, wherein the performing super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region comprises:
    obtaining, from the first image, a target image block corresponding to each of the sub-mask patterns; and
    separately performing super-resolution processing on the sub-region of the target pixel region contained in each of the target image blocks, to obtain the super-resolved target pixel region, wherein the super-resolved target pixel region consists of the super-resolved sub-regions of the target pixel region contained in the target image blocks.
  10. The method according to claim 8 or 9, wherein the residual block and the sub-residual blocks are both blocked by quadtree division.
  11. The method according to any one of claims 7 to 10, wherein before the updating an other pixel region in the first image by using the other pixel region in the super-resolved previous frame, the method further comprises:
    performing erosion processing on the plurality of first mask points in the mask pattern to obtain an updated mask pattern, wherein the kernel of the erosion processing has the same size as the receptive field of the last convolutional layer of the super-resolution model;
    determining, in the first image, pixels corresponding to the positions of the eroded plurality of first mask points as auxiliary pixels; and
    determining a region of the first image in which pixels other than the auxiliary pixels are located as the other pixel region.
  12. The method according to any one of claims 1 to 11, wherein the determining a target pixel region in the first image based on the residual block comprises:
    counting a first ratio of the number of residual points whose residual value is 0 to the total number of residual points in the residual block; and
    when the first ratio is greater than a first super-resolution trigger ratio threshold, determining the target pixel region in the first image based on the residual block.
  13. The method according to any one of claims 1 to 12, wherein the target pixel region is a region of the first image in which pixels corresponding to positions of first target residual points and second target residual points are located, the first target residual points are points in the residual block whose residual values are greater than a specified threshold, and the second target residual points are residual points in the residual block surrounding the first target residual points.
  14. An image processing apparatus, wherein the apparatus comprises:
    an obtaining module, configured to obtain an inter-frame residual between a first image and an adjacent previous frame to obtain a residual block, wherein the residual block comprises a plurality of residual points in one-to-one correspondence with positions of a plurality of pixels of the first image, and each residual point has one residual value;
    a first determining module, configured to determine a target pixel region in the first image based on the residual block;
    a partial super-resolution module, configured to perform super-resolution processing on the target pixel region in the first image to obtain a super-resolved target pixel region; and
    an updating module, configured to update an other pixel region in the first image by using the other pixel region in the super-resolved previous frame, wherein the other pixel region comprises a pixel region of the first image other than the target pixel region;
    wherein the super-resolved first image comprises the super-resolved target pixel region and the updated other pixel region.
  15. The apparatus according to claim 14, wherein the first determining module comprises:
    a generation submodule, configured to generate a mask pattern based on the residual block, wherein the mask pattern comprises a plurality of first mask points in one-to-one correspondence with positions of a plurality of target residual points in the residual block; and
    a determining submodule, configured to input the mask pattern and the first image into a super-resolution model, and determine, by using the super-resolution model, a region of the first image in which pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
  16. The apparatus according to claim 15, wherein the partial super-resolution module is configured to:
    perform super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
  17. The apparatus according to claim 14, wherein the first determining module comprises:
    a generation submodule, configured to generate a mask pattern based on the residual block, wherein the mask pattern comprises a plurality of first mask points in one-to-one correspondence with positions of a plurality of target residual points in the residual block; and
    a determining submodule, configured to determine a region of the first image in which pixels corresponding to the position of each of the plurality of first mask points are located as the target pixel region.
  18. The apparatus according to claim 17, wherein the partial super-resolution module is configured to:
    input the target pixels of the first image into a super-resolution model, and perform super-resolution processing on the target pixel region in the first image by using the super-resolution model, to obtain the super-resolved target pixel region.
  19. The apparatus according to any one of claims 15 to 18, wherein the generation submodule is configured to:
    generate, based on the residual block, an initial mask pattern comprising a plurality of mask points, wherein the plurality of mask points are in one-to-one correspondence with the positions of the plurality of pixels of the first image, and comprise the plurality of first mask points and a plurality of second mask points; and
    assign a first value to the mask values of the first mask points in the initial mask pattern and a second value to the mask values of the second mask points in the mask pattern, to obtain the mask pattern, wherein the first value and the second value are different; and
    the determining submodule is configured to:
    traverse the mask points in the mask pattern, and determine, in the first image, the pixels corresponding to the mask points whose mask value is the first value as the target pixels.
  20. The apparatus according to any one of claims 15 to 18, wherein the generation submodule is configured to:
    perform morphological transformation processing on the residual block to obtain the mask pattern, wherein the morphological transformation processing comprises binarization processing and dilation processing of the first mask points in the binarized residual block, the super-resolution model comprises at least one convolutional layer, and the kernel of the dilation processing has the same size as the receptive field of the last convolutional layer of the super-resolution model.
  21. The apparatus according to any one of claims 15 to 20, wherein the generation submodule is configured to:
    divide the residual block into a plurality of sub-residual blocks, and perform blocking processing on each sub-residual block obtained through division, wherein the blocking processing comprises:
    when the residual value of at least one residual point comprised in a sub-residual block is not 0, dividing the sub-residual block into a plurality of sub-residual blocks, and performing the blocking processing on each sub-residual block obtained through division, until the residual values of the residual points comprised in a sub-residual block obtained through division are all 0, or the total number of residual points in a sub-residual block obtained through division is less than a point-count threshold, or the total number of divisions of the residual block reaches a division-count threshold; and
    generate a sub-mask pattern corresponding to each target residual block, wherein a target residual block comprises at least one residual point whose residual value is not 0;
    wherein the mask pattern comprises the generated sub-mask patterns.
  22. The apparatus according to claim 21, wherein the partial super-resolution module is configured to:
    obtain, from the first image, a target image block corresponding to each of the sub-mask patterns; and
    separately perform super-resolution processing on the sub-region of the target pixel region contained in each of the target image blocks, to obtain the super-resolved target pixel region, wherein the super-resolved target pixel region consists of the super-resolved sub-regions of the target pixel region contained in the target image blocks.
  23. The apparatus according to claim 21 or 22, wherein the residual block and the sub-residual blocks are both blocked by quadtree division.
  24. The apparatus according to any one of claims 20 to 23, wherein the apparatus further comprises:
    an erosion module, configured to: before the pixel values of the other pixel region in the first image are updated by using the pixel values of the pixels in the previous frame corresponding to the positions of the other pixel region, perform erosion processing on the plurality of first mask points in the mask pattern to obtain an updated mask pattern, wherein the kernel of the erosion processing has the same size as the receptive field of the last convolutional layer of the super-resolution model;
    a second determining module, configured to determine, in the first image, pixels corresponding to the positions of the eroded plurality of first mask points as auxiliary pixels; and
    a third determining module, configured to determine a region of the first image in which pixels other than the auxiliary pixels are located as the other pixel region.
  25. The apparatus according to any one of claims 14 to 24, wherein the first determining module is configured to:
    count a first ratio of the number of residual points whose residual value is 0 to the total number of residual points in the residual block; and
    when the first ratio is greater than a first super-resolution trigger ratio threshold, determine the target pixel region in the first image based on the residual block.
  26. The apparatus according to any one of claims 14 to 25, wherein the target pixel region is a region of the first image in which pixels corresponding to positions of first target residual points and second target residual points are located, the first target residual points are points in the residual block whose residual values are greater than a specified threshold, and the second target residual points are residual points in the residual block surrounding the first target residual points.
  27. An electronic device, comprising a processor and a memory, wherein
    the memory is configured to store a computer program; and
    the processor is configured to implement the image processing method according to any one of claims 1 to 13 when executing the computer program stored in the memory.
PCT/CN2020/110029 2019-10-22 2020-08-19 Image processing method and apparatus, and electronic device WO2021077878A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20878773.9A EP4036854A4 (en) 2019-10-22 2020-08-19 IMAGE PROCESSING METHOD AND EQUIPMENT AND ELECTRONIC DEVICE
US17/726,218 US20220245765A1 (en) 2019-10-22 2022-04-21 Image processing method and apparatus, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911008119.2 2019-10-22
CN201911008119.2A CN112700368A (zh) 2019-10-22 2019-10-22 Image processing method and apparatus, and electronic device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/726,218 Continuation US20220245765A1 (en) 2019-10-22 2022-04-21 Image processing method and apparatus, and electronic device

Publications (2)

Publication Number Publication Date
WO2021077878A1 true WO2021077878A1 (zh) 2021-04-29
WO2021077878A9 WO2021077878A9 (zh) 2022-04-14

Family

ID=75504786

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110029 WO2021077878A1 (zh) 2019-10-22 2020-08-19 Image processing method and apparatus, and electronic device

Country Status (4)

Country Link
US (1) US20220245765A1 (zh)
EP (1) EP4036854A4 (zh)
CN (1) CN112700368A (zh)
WO (1) WO2021077878A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11948274B1 (en) * 2021-01-05 2024-04-02 Pixar Deep learned super resolution for feature film production
CN115240042B (zh) * 2022-07-05 2023-05-16 抖音视界有限公司 Multi-modal image recognition method and apparatus, readable medium, and electronic device
CN115128570B (zh) * 2022-08-30 2022-11-25 北京海兰信数据科技股份有限公司 Radar image processing method, apparatus, and device
CN116935788B (zh) * 2023-09-15 2023-12-29 长春希达电子技术有限公司 Pixel-multiplexing-based color compensation method, storage medium, and system
CN117041669B (zh) * 2023-09-27 2023-12-08 湖南快乐阳光互动娱乐传媒有限公司 Super-resolution control method and apparatus for video streams, and electronic device


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101017573A (zh) * 2007-02-09 2007-08-15 南京大学 Moving object detection and recognition method based on video surveillance
JP2013222432A (ja) * 2012-04-19 2013-10-28 Nippon Hoso Kyokai <Nhk> Super-resolution parameter determination device, image reduction device, and program
CN104966269A (zh) * 2015-06-05 2015-10-07 华为技术有限公司 Multi-frame super-resolution imaging apparatus and method
CN110009650A (zh) * 2018-12-20 2019-07-12 浙江新再灵科技股份有限公司 Method and system for detecting boundary crossing in an escalator handrail boundary region
CN109951666A (zh) * 2019-02-27 2019-06-28 天津大学 Super-resolution restoration method based on surveillance video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4036854A4

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463222A (zh) * 2022-02-21 2022-05-10 广州联合丽拓生物科技有限公司 Automatic positioning and assembly method for an intradermal or subcutaneous injection micro-needle tip

Also Published As

Publication number Publication date
EP4036854A4 (en) 2023-02-01
WO2021077878A9 (zh) 2022-04-14
CN112700368A (zh) 2021-04-23
US20220245765A1 (en) 2022-08-04
EP4036854A1 (en) 2022-08-03

Similar Documents

Publication Publication Date Title
WO2021077878A1 (zh) Image processing method and apparatus, and electronic device
US20240037836A1 (en) Image rendering method and apparatus, and electronic device
CN111179282B (zh) Image processing method, image processing apparatus, storage medium, and electronic device
CN113810601B (zh) Image processing method and apparatus for a terminal, and terminal device
CN113810600B (zh) Image processing method and apparatus for a terminal, and terminal device
US20220210308A1 (en) Image processing method and electronic apparatus
US20220245778A1 (en) Image bloom processing method and apparatus, and storage medium
CN113810603B (zh) Point light source image detection method and electronic device
CN115526787B (zh) Video processing method and apparatus
CN113709464A (zh) Video encoding method and related device
WO2023125518A1 (zh) Image encoding method and apparatus
WO2023000745A1 (zh) Display control method and related apparatus
CN114079725B (zh) Video stabilization method, terminal device, and computer-readable storage medium
CN114332331A (zh) Image processing method and apparatus
WO2024082713A1 (zh) Image rendering method and apparatus
CN115705663B (zh) Image processing method and electronic device
WO2021057626A1 (zh) Image processing method, apparatus, and device, and computer storage medium
CN115631250B (zh) Image processing method and electronic device
WO2024066834A1 (zh) Vsync signal control method, electronic device, storage medium, and chip
CN115460343B (zh) Image processing method, device, and storage medium
CN116896626B (zh) Method and apparatus for detecting the degree of video motion blur
CN115802144B (zh) Video shooting method and related device
WO2024078275A1 (zh) Image processing method and apparatus, electronic device, and storage medium
CN112541861B (zh) Image processing method, apparatus, and device, and computer storage medium
WO2022166386A1 (zh) Image display method, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20878773

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020878773

Country of ref document: EP

Effective date: 20220428