WO2021114846A1 - A video noise reduction processing method, device, and storage medium - Google Patents

A video noise reduction processing method, device, and storage medium

Info

Publication number
WO2021114846A1
WO2021114846A1 · PCT/CN2020/119910 · CN2020119910W
Authority
WO
WIPO (PCT)
Prior art keywords
block
frame
noise reduction
current
current block
Prior art date
Application number
PCT/CN2020/119910
Other languages
English (en)
French (fr)
Inventor
段争志
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Publication of WO2021114846A1 publication Critical patent/WO2021114846A1/zh
Priority to US17/520,169 priority Critical patent/US20220058775A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/14Picture signal circuitry for video frequency region
    • H04N5/21Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration by the use of local operators
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • G06T2207/20028Bilateral filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20172Image enhancement details
    • G06T2207/20182Noise reduction or smoothing in the temporal domain; Spatio-temporal filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20216Image averaging

Definitions

  • This application relates to the field of image processing technology, and in particular to a video noise reduction processing method, device and storage medium.
  • a video noise reduction processing method including:
  • a video noise reduction processing device includes:
  • the acquiring unit is used to acquire the current frame of the target video
  • a judging unit configured to determine that the current frame is a P frame or a B frame in the target video
  • a determining unit configured to determine the reference frame of the current frame from the target video according to the time-domain reference relationship between the current frame and the reference frame established in advance by the encoder;
  • a selecting unit configured to determine a reference block corresponding to a current block from the reference frame; the current block is any block in the current frame;
  • the calculation unit is configured to perform noise reduction processing on the current block according to the reference block.
  • the selection unit is specifically used for:
  • a matching block of the current block is determined from each reference frame; the current frame has at least one reference frame, and a matching block of the current block exists in each reference frame;
  • the matching block whose matching degree is greater than the matching threshold is used as the reference block of the current block.
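The selection described above can be sketched as follows; the function name and the example matching-degree values are illustrative, not taken from the patent.

```python
def select_reference_blocks(matching_blocks, matching_degrees, match_threshold):
    """Keep only the matching blocks whose matching degree exceeds the
    threshold; the survivors serve as reference blocks for the current block."""
    return [blk for blk, deg in zip(matching_blocks, matching_degrees)
            if deg > match_threshold]

# One matching block per reference frame, with hypothetical matching degrees.
blocks = ["match_in_ref0", "match_in_ref1", "match_in_ref2"]
degrees = [0.9, 0.2, 0.7]
refs = select_reference_blocks(blocks, degrees, 0.5)
```

Blocks that match poorly (here, the one with degree 0.2) simply contribute nothing to the noise reduction of the current block.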
  • the calculation unit is further used for:
  • a two-dimensional noise reduction algorithm is used to perform noise reduction processing on the current block.
  • the calculation unit is specifically used for:
  • the weight matrix of each reference block is calculated according to the matching degree between the reference block and the current block;
  • the pixels of the current block, the pixels of each reference block, and the corresponding weight matrix are used to perform a weighted summation to obtain the output pixels of the current block after noise reduction.
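A minimal sketch of the weighted summation above, assuming one weight matrix per block (the current block included) and per-pixel normalization; the weight values here are illustrative, not the patent's actual weight computation.

```python
def denoise_block(current_block, reference_blocks, weights):
    """Weighted sum of the current block and its reference blocks.

    `weights` holds one weight matrix per input block (current block first);
    each output pixel is the weight-normalized sum across all blocks.
    """
    h, w = len(current_block), len(current_block[0])
    blocks = [current_block] + reference_blocks
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = sum(wt[i][j] for wt in weights)
            out[i][j] = sum(b[i][j] * wt[i][j]
                            for b, wt in zip(blocks, weights)) / total
    return out

cur = [[10.0, 20.0]]
ref = [[20.0, 40.0]]
w_cur = [[1.0, 1.0]]   # hypothetical equal weights
w_ref = [[1.0, 1.0]]
out = denoise_block(cur, [ref], [w_cur, w_ref])
```

With equal weights this reduces to plain averaging; in the described method a better-matching reference block would receive a larger weight matrix.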
  • the calculation unit is further used for:
  • a two-dimensional noise reduction algorithm is used to perform pre-noise reduction processing on the current block.
  • the judging unit is further configured to determine that the current frame is an I frame in the target video
  • the calculation unit is further configured to perform noise reduction processing on the I frame by using a 2D noise reduction algorithm.
  • the selection unit is used for:
  • the reference block of the current block is determined from the reconstructed frame serving as the updated reference frame.
  • the selection unit is also used for:
  • noise reduction processing is performed on the current block.
  • a computing device including at least one processor and at least one memory, wherein the memory stores a computer program, and when the program is executed by the processor, the computing device executes the steps of the video noise reduction processing method provided in the embodiments of the present application.
  • a storage medium stores computer instructions.
  • the computer instructions When the computer instructions are executed on a computer, the computer executes the steps of the video noise reduction processing method provided in the embodiments of the present application.
  • FIG. 1 is a system architecture diagram of a video noise reduction processing system in an embodiment of the application
  • FIG. 2A is a flowchart of a video noise reduction processing method in an embodiment of the application;
  • FIG. 2B is a flowchart of a method for determining a reference block corresponding to a current block from a reference frame in an embodiment of the application;
  • FIG. 2C is a flowchart of a method for performing noise reduction processing on the current block according to the reference block in an embodiment of the application;
  • FIG. 2D is a flowchart of a method for determining a reference block of a current block from a reference frame in an embodiment of the application;
  • FIG. 3A is a schematic diagram of a video sequence in an embodiment of the application;
  • FIG. 3B is a schematic diagram of a motion vector of a target between a current frame and a reference frame in an embodiment of the application;
  • FIG. 4A is a flowchart of a video noise reduction processing method in an embodiment of the present application;
  • FIG. 4B is a schematic diagram of a 3D filtering process in an embodiment of this application.
  • FIG. 5 is a comparison diagram of images before and after video noise reduction in an embodiment of the application.
  • FIG. 6 is a structural block diagram of a video noise reduction processing apparatus in an embodiment of this application.
  • FIG. 7 shows a structural block diagram of a device provided by an embodiment of the present application.
  • Video encoding: The original video data generated after image information is collected is extremely large. For applications where the image is played locally right after capture, compression need not be considered; but in reality, more applications involve video transmission and storage, and transmission networks and storage devices cannot tolerate the huge amount of original video data, so it must be encoded and compressed before transmission and storage.
  • Video coding includes intra-frame coding and inter-frame coding.
  • Intra-frame coding is spatial domain coding, which uses image spatial redundancy for image compression. What is processed is an independent image and does not span multiple images. Spatial coding relies on the similarity between adjacent pixels in an image and the main spatial frequency of the pattern area.
  • Inter-frame coding is time-domain coding, which uses the temporal redundancy between a set of consecutive images for image compression. If a certain frame of image is available to the decoder, the decoder only needs the difference between the two frames to obtain the next frame of image. For example, several frames with smooth motion have large similarities and small differences, while several images with violent motion have small similarities and large differences.
  • Time domain coding relies on the similarity between consecutive image frames, and uses the received and processed image information as much as possible to "predict" and generate the current image.
  • Frame: A single image frame in a video sequence.
  • I frame (Intra-coded picture, often referred to as a key frame): contains complete image information, is an intra-coded picture, contains no motion vectors, and does not need to reference other frames during encoding and decoding.
  • the I frame image is used to prevent the accumulation and spread of errors.
  • P frame (Predictive-coded picture): an inter-frame coded frame. During the encoding process, the encoder uses the previous I frame or P frame for predictive encoding.
  • B frame (Bi-directionally predicted picture) is an inter-frame coded frame.
  • It uses the I frames or P frames before and after it for bidirectional predictive coding.
  • Pre-analysis: Before formally encoding the frame data, a certain number of frames are analyzed in advance, and the pre-analysis data is applied to guide the subsequent encoding process.
  • Reconstructed frame: In the encoder, after the source image is compressed, the various coding information must on the one hand be packaged (entropy coded) to output the bitstream; on the other hand, the inverse of the coding process must be performed according to that coding information, i.e. the image is reconstructed and restored, to obtain a reference frame consistent with the decoder for use in encoding/decoding other frames.
  • The restored frame is not identical to the source image, so we call it a reconstructed frame.
  • Reference frame: In video coding and decoding, a reconstructed frame that serves as reference data for other frames, from which inter-frame reference data is obtained during their encoding/decoding.
  • Motion vector: A two-dimensional vector describing the positional shift of a coding block in the encoder as it moves from one position to another.
  • a group of continuous images record the movement of the target, and the motion vector is used to measure the degree of movement of the target between two frames of images.
  • the motion vector is composed of both the horizontal displacement amount and the vertical displacement amount.
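As a small illustration (function name and positions are hypothetical), applying a motion vector shifts a block position by its horizontal and vertical displacement components:

```python
def apply_motion_vector(block_pos, motion_vector):
    """Shift a block position (x, y) by a 2D motion vector (dx, dy)."""
    (x, y), (dx, dy) = block_pos, motion_vector
    return (x + dx, y + dy)

# A block at (16, 32) moved 4 pixels right and 2 pixels up in the reference frame.
ref_pos = apply_motion_vector((16, 32), (4, -2))
```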
  • Block: The basic unit of coding processing, including macroblock, sub-macroblock, block, and other types.
  • One block is 4×4 pixels, and macroblocks and sub-macroblocks are composed of multiple blocks.
  • The size of a macroblock is 16×16 pixels; macroblocks are divided into I, B, and P macroblocks.
  • BM3D (Block Matching 3D): An industry-recognized image noise reduction technique that searches for matching blocks within the image and applies transform or filtering techniques, with good results.
  • 3D video noise reduction: A video technique that reduces noise using a frame's own information (spatial information) together with reference information from the frames before and after it (temporal information), in order to improve image quality or definition.
  • 2D video noise reduction: Noise reduction performed within a frame by analyzing the relationships among spatially neighboring pixel data; also known as spatial noise reduction.
  • Bilateral filtering algorithm: A 2D neighborhood noise reduction algorithm.
  • the algorithm takes into account the distance and pixel difference between the pixel to be filtered and its surrounding pixels when taking the weight of adjacent pixels.
  • SSD (sum of squared differences): Often used to measure the matching degree of two blocks; the smaller the value, the smaller the difference between the two blocks and the higher the matching degree.
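SSD as defined above can be computed directly; the sample blocks are illustrative.

```python
def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized pixel blocks:
    a smaller value means a better match."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

a = [[1, 2], [3, 4]]
b = [[1, 2], [3, 6]]  # differs by 2 in one pixel -> SSD = 4
c = [[9, 9], [9, 9]]  # very different block
```

A perfect match yields SSD = 0, and candidate blocks can be ranked by ascending SSD.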
  • Video noise reduction technology based on the BM3D (Block Matching 3D) algorithm is generally used to reduce the noise of the video.
  • However, its video processing speed is very slow, which has a particularly large impact on bandwidth-constrained real-time streaming media services, mobile video calls, online video chat, and the like.
  • the embodiment of this application proposes a video noise reduction processing method, which is mainly applied to scenarios such as real-time video calls, audio-video communication, video conferencing, real-time video capture, and live transcoding, and which can reduce the amount of computation for video noise reduction and relieve network bandwidth pressure.
  • FIG. 1 shows an architecture diagram of a video noise reduction processing system provided by an embodiment of the present application, including a server 101, a first terminal 102, and a second terminal 103.
  • the first terminal 102 and/or the second terminal 103 may be an electronic device with wireless communication functions such as a mobile phone, a tablet computer, or a dedicated handheld device, or may be a device such as a personal computer (PC), a notebook computer, or a server that connects to the Internet through wired access.
  • the server 101 may be a network device such as a computer.
  • the server 101 may be an independent device or a server cluster formed by multiple servers.
  • the server 101 may use cloud computing technology for information processing.
  • the network in the system can be an INTERNET network, or a mobile communication system such as the Global System for Mobile Communications (GSM) and the Long Term Evolution (LTE) system.
  • the video noise reduction processing method in the embodiment of the present application may be executed by the server 101.
  • the video in the embodiment of this application may be a communication video stream between the first terminal 102 and the second terminal 103, such as video captured by its collection device that the first terminal 102 transmits to the second terminal 103, or video captured by its collection device that the second terminal 103 transmits to the first terminal 102.
  • the server 101 may be a server that provides video forwarding.
  • the video capture device can be integrated on the terminal, such as the front camera or the rear camera of a mobile phone terminal; the capture device can also exist independently of the terminal but be connected to it, such as a camera connected to a computer terminal.
  • the server 101 may perform noise reduction processing on the target video transmitted by the first terminal 102 to the second terminal 103, or the server 101 may perform noise reduction processing on the target video transmitted by the second terminal 103 to the first terminal 102.
  • Alternatively, the server 101 may simultaneously perform noise reduction processing on the target video stream transmitted by the first terminal 102 to the second terminal 103 and on the target video stream transmitted by the second terminal 103 to the first terminal 102.
  • the video noise reduction processing method may also be executed by the terminal.
  • the execution subject is changed from the server to the terminal.
  • For the first terminal 102, it can perform noise reduction processing on the target video it transmits to the second terminal 103, on the target video the second terminal 103 transmits to it, or on both target videos at the same time.
  • For the second terminal 103, the situation is similar to that of the first terminal 102.
  • On the terminal, the video noise reduction processing method of the embodiment of the present application may also be executed by the client installed on it.
  • the video noise reduction processing method may also be executed jointly by the server 101 and the terminal.
  • the main noise reduction logic is set on the user terminal, and the logic for how to configure the encoder is set on the server.
  • the terminal provides the target video to the server; the server generates corresponding encoding parameters according to the provided target video, including the reference frames corresponding to P frames or B frames, and transmits them to the terminal side; the terminal controls the encoder to determine the reference block according to the encoding parameters, and uses the reference block to reduce the noise of the current block.
  • An embodiment of the present application provides a video noise reduction processing method, which is executed by a computing device, and the computing device may be a server or a terminal. As shown in Figure 2A, the method includes:
  • Step S201: Obtain the current frame of the target video.
  • the target video can be a video collected by the terminal through its integrated collection device, or it can be a video collected by an independent collection device connected to the terminal through a wired or wireless network.
  • the terminal may perform noise reduction processing on the target video, or the terminal may send the target video to the server, and the server may perform noise reduction processing on the target video.
  • Step S202: Determine that the current frame is a P frame or a B frame in the target video.
  • each frame is pre-analyzed: specifically, on an image downscaled to 1/2 of each frame's width and height, image complexity analysis, intra-frame prediction, inter-frame prediction analysis, statistical intra/inter-frame cost comparison, scene detection, and the like are performed; combined with certain specified coding parameters, such as the number of B frames or the GOP (Group Of Pictures) size, the frame type of each frame is determined.
  • a GOP is a data stream composed of a section of images whose contents are not very different.
  • a GOP can be very long when there is little motion, because fewer motion changes mean that the image content changes very little, so one I frame can be set, followed by multiple P frames and B frames.
  • one GOP may be relatively short, for example, it only contains one I frame and 3 or 4 P frames.
  • the I frame is a compressed frame obtained by maximally removing spatially redundant image information. It has all the image information and can be decoded independently without referring to other frames; it is called an intra-frame coded frame.
  • All videos contain at least one I frame, which is the first frame of the video, and the other I frames in the video are used to improve the video quality.
  • a P frame uses the difference between the current frame and the previous frame (an I frame or P frame) to decode the data of the current frame;
  • a B frame uses both the difference between the current frame and the previous frame and the difference between the current frame and the subsequent frame to decode the frame data.
  • the 3D noise reduction algorithm in the embodiment of the present application mainly targets P frames and B frames in the target video.
  • Step S203: Determine the reference frame of the current frame from the target video according to the time-domain reference relationship between the current frame and the reference frame pre-established by the encoder.
  • the encoder will determine the time-domain reference relationship such as the reference frame and the motion vector of the current frame in the pre-analysis process for subsequent encoding and decoding of the current frame.
  • the embodiment of the present application multiplexes the time-domain reference relationship such as the reference frame and the motion vector used for encoding and decoding the current frame in the encoder, and uses the reference frame of the current frame to reduce the noise of the current frame.
  • the encoder will determine two forward reference frames for P-frames; for B-frames, two forward reference frames and two backward reference frames will be given, for a total of four reference frames.
  • For example, the corresponding reference frames of a P frame may be an I frame with sequence number 0 and a P frame with sequence number 3; for the B frame with sequence number 4, the corresponding reference frames are two forward frames with sequence numbers 0 and 3, and two backward frames with sequence numbers 7 and 9, respectively.
  • the reference frame and the current frame may or may not be adjacent; different reference frames corresponding to the same current frame may or may not be adjacent.
  • the number of reference frames and the positional relationship with the current frame can be set arbitrarily, which is not limited in the embodiment of the present application.
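The reference-frame selection described above (two forward references for P frames; two forward plus two backward for B frames) can be sketched as follows, with sequence numbers taken from the example; the function itself is illustrative, since the real selection is done inside the encoder's pre-analysis.

```python
def reference_frame_numbers(frame_type, forward, backward):
    """Example reference selection: P frames take the two most recent forward
    references; B frames take two forward and two backward (four total);
    I frames reference no other frames."""
    if frame_type == "P":
        return forward[-2:]
    if frame_type == "B":
        return forward[-2:] + backward[:2]
    return []

p_refs = reference_frame_numbers("P", [0, 3], [])
b_refs = reference_frame_numbers("B", [0, 3], [7, 9])
```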
  • Step S204: Determine the reference block corresponding to the current block from the reference frame; the current block is any block in the current frame.
  • each frame of image of the target video is processed in units of blocks, and the size of a reference block is the same as the size of the current block.
  • the size of the current block here may be 8×8 pixels, and the corresponding reference block size is also 8×8 pixels; or the size of the current block is 16×16 pixels, and the corresponding reference block size is also 16×16 pixels.
  • the embodiment of the present application does not limit the size of the block.
  • Step S205: Perform noise reduction processing on the current block according to the reference block.
  • the current frame of the target video is acquired inside the encoder, and after determining that the current frame is the P frame or the B frame in the target video, the encoder is used to determine the reference frame of the current frame from the target video. Determine the reference block corresponding to the current block from the reference frame, where the current block is any block in the current frame, and perform noise reduction processing on the current block according to the reference block. Since the reference relationship between the current frame and the reference frame needs to be established in the encoder for video encoding, the embodiment of the present application performs noise removal on the current frame by multiplexing the reference relationship between the current frame and the reference frame in the encoder. This can reduce the amount of calculation, shorten the calculation time, relieve network bandwidth pressure, and meet the requirements of real-time video noise reduction.
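The overall flow of steps S201-S205 can be sketched as follows; `Frame`, `StubEncoder`, and their methods are placeholders standing in for the encoder's pre-established reference relationships, not a real encoder API.

```python
class Frame:
    """Minimal stand-in for an encoder frame (illustrative only)."""
    def __init__(self, frame_type, blocks):
        self.frame_type = frame_type  # "I", "P", or "B"
        self.blocks = blocks

class StubEncoder:
    """Placeholder for the encoder whose reference relationships are reused."""
    def get_reference_frames(self, frame):
        # In the real method this is the pre-established temporal
        # reference relationship from pre-analysis; stubbed here.
        return ["ref_frame_0"]

    def find_reference_blocks(self, block, reference_frames):
        # Stub: pretend each reference frame yields one matching block.
        return [block]

def denoise_frame(frame, encoder, denoise_2d, denoise_3d):
    """Steps S201-S205: I frames get 2D denoising only; P/B frames reuse
    the encoder's reference frames and blocks for 3D denoising."""
    if frame.frame_type == "I":
        return denoise_2d(frame)
    reference_frames = encoder.get_reference_frames(frame)
    output = []
    for block in frame.blocks:  # the current block is any block in the frame
        refs = encoder.find_reference_blocks(block, reference_frames)
        output.append(denoise_3d(block, refs))
    return output

i_result = denoise_frame(Frame("I", [1, 2]), StubEncoder(),
                         denoise_2d=lambda f: "denoised_2d",
                         denoise_3d=lambda b, r: b)
p_result = denoise_frame(Frame("P", [1, 2]), StubEncoder(),
                         denoise_2d=lambda f: "denoised_2d",
                         denoise_3d=lambda b, r: b * 10)
```

The key point from the text is that `get_reference_frames` and the block matching come for free: the encoder already computed them for compression, so the denoiser multiplexes that work rather than repeating a BM3D-style search.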
  • Since the I frame in the video sequence does not need to reference other frame images during encoding and decoding, in this embodiment of the present application, for the I frame, 2D denoising is directly performed without referring to other frames.
  • That is, if the current frame is an I frame, a 2D noise reduction algorithm is used to perform noise reduction processing; if the current frame is a P frame or a B frame, the 3D noise reduction algorithm is used to perform noise reduction processing.
  • the 2D noise reduction algorithm in the embodiment of the present application may be a regional bilateral filtering algorithm, a Gaussian filtering algorithm, and the like.
  • the following takes the regional bilateral filtering algorithm as an example to introduce.
  • the n×m area generally takes 3×3 pixels, 5×5 pixels, or 7×7 pixels.
  • In one embodiment, denoised_pixl[i][j] = Σ src[i+m][j+n] · coef[m][n] / Σ coef[m][n], where the sums run over the n×m area;
  • denoised_pixl[i][j] is the pixel value of the pixel point [i][j] after noise reduction;
  • src[i+m][j+n] is the pixel value of the block at position [i+m][j+n];
  • coef[m][n] is the filter coefficient, selected according to the absolute value diffVal of the difference between a pixel in the n×m area and the center pixel, their positional relationship, and the bilateral filtering level table; it can be determined according to the following formula:
  • coefTable[t_num] is a bilateral filtering level table based on filtering levels in 2D filtering; in one embodiment, t_num is 0-9, representing 10 noise reduction levels; in one embodiment, the values of coefTable[t_num] are {2, 3, 5, 8, 11, 15, 19, 24, 29, 32}; diffVal is the difference between the current pixel value and a pixel value in the filter template, and in implementation an overly large diffVal can be limited (for example, to 0-31); [m][n] is the distance between the current pixel and the reference pixel in the n×m area; σ is used to measure the noise level.
  • the value may be t num ⁇ 4? 4: t num ⁇ 7? 6:9.
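The region bilateral filter above can be sketched as follows. This is an illustrative assumption, not the patent's exact implementation: the coefficient rule combines the level-table entry with the clamped pixel difference and the spatial distance, since the patent's Formula 2 is rendered as an image in the source and its exact form is not reproduced here.

```python
import math

# Bilateral filtering level table for 10 noise-reduction levels (t_num = 0..9),
# values taken from the embodiment in the text.
COEF_TABLE = [2, 3, 5, 8, 11, 15, 19, 24, 29, 32]

def bilateral_denoise_pixel(src, i, j, t_num=4, radius=2):
    """Denoise pixel [i][j] with a (2*radius+1) x (2*radius+1) region bilateral filter.

    The coefficient rule below is an assumed stand-in: the weight falls off
    with both the clamped pixel difference diffVal and the spatial distance,
    scaled by the level-table entry.
    """
    delta2 = 4 if t_num < 4 else (6 if t_num < 7 else 9)  # delta^2, the noise level
    center = float(src[i][j])
    num = den = 0.0
    for m in range(-radius, radius + 1):
        for n in range(-radius, radius + 1):
            pix = float(src[i + m][j + n])
            diff_val = min(abs(pix - center), 31.0)  # clamp overly large diffs
            dist2 = m * m + n * n
            coef = COEF_TABLE[t_num] * math.exp(
                -(diff_val ** 2) / (2.0 * delta2 ** 2) - dist2 / (2.0 * radius ** 2)
            )
            num += coef * pix
            den += coef
    return num / den
```

A flat region passes through unchanged, while an outlier center pixel is pulled toward its neighbors without being fully averaged away, which is the behavior bilateral filtering is chosen for.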
  • Because the 2D noise reduction algorithm reduces noise by local weighted averaging, some blurring of details or boundaries is inevitable to a certain extent.
  • a certain amount of sharpening compensation can be added after noise reduction, and the pixel value after adding sharpening compensation is calculated according to the following formula:
  • denoised_ed_pixl = average + sharp_ratio * (denoised_ed_pixl - average) …… Formula 3
  • average is the average value of the pixels in the n×m area corresponding to the current pixel; sharp_ratio is the sharpening compensation parameter, of type double, with a value from -2.0 to 2.0.
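The sharpening compensation of Formula 3 is simple enough to state directly in code; this sketch assumes the region average has already been computed:

```python
def sharpen_compensate(denoised_pixel, average, sharp_ratio):
    """Formula 3: push the denoised pixel away from the local region average.

    average     -- mean of the n x m region around the current pixel
    sharp_ratio -- sharpening compensation parameter (double, -2.0 to 2.0)
    """
    return average + sharp_ratio * (denoised_pixel - average)
```

With sharp_ratio = 0 the result collapses to the region average, and with sharp_ratio = 1 the denoised value is kept unchanged; values above 1 amplify local contrast to counteract the blur introduced by averaging.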
  • For a P frame or B frame in the target video, the embodiment of the present application performs the filtering calculation in units of a current block with a size of 8×8 pixels or 16×16 pixels.
  • the specific process is as follows:
  • Before using the 3D noise reduction algorithm to perform noise reduction, the current block can also be pre-processed by the 2D noise reduction algorithm to further improve the noise reduction effect.
  • Before using the 3D noise reduction algorithm to reduce the noise of the current block, a noise monitoring algorithm can also be used to calculate the internal variance of the current block. If the internal variance of the current block is greater than the noise reduction threshold, the subsequent noise reduction processing is performed; if the internal variance of the current block is less than or equal to the noise reduction threshold, the subsequent noise reduction processing is skipped.
  • The internal variance of the current block can be used to indicate its internal complexity. When the internal variance is less than or equal to the noise reduction threshold, the internal complexity of the current block can be considered small, and the noise reduction process can be omitted, further reducing the amount of calculation and shortening the noise reduction duration of the video.
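The variance gate described above can be sketched as follows; `noise_threshold` is a tunable parameter whose numeric value is not specified in the text:

```python
def block_internal_variance(block):
    """Population variance of a block's pixels, used as a cheap complexity/noise monitor."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)

def should_denoise(block, noise_threshold):
    """Gate for 3D noise reduction: a block whose internal variance is at or
    below the threshold is considered internally simple and is skipped."""
    return block_internal_variance(block) > noise_threshold
```

Skipping low-variance blocks is what lets the method avoid the full 3D filter on flat regions, which is where most of the per-frame savings come from.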
  • FIG. 2B shows a flowchart of a method for determining a reference block corresponding to a current block from a reference frame in an embodiment of the present application. As shown in Figure 2B, the method includes the following steps:
  • Step S2041: According to the motion vector corresponding to the current block in the encoder, a matching block of the current block is determined from a reference frame; the current frame has at least one reference frame, and each reference frame contains one matching block of the current block;
  • Step S2042: The degree of matching between each matching block and the current block is determined;
  • Step S2043: A matching block with a matching degree greater than a matching threshold is used as a reference block of the current block.
  • a block with a size of 8×8 pixels or 16×16 pixels may be used as the unit for noise reduction; that is, the size of the current block may be 8×8 pixels or 16×16 pixels.
  • Correspondingly, the size of the matching block corresponding to the current block in the reference frame is also 8×8 pixels or 16×16 pixels.
  • the relationship between the position of the current block in the current frame and the position of the matching block in the reference frame is represented by a motion vector, which is given by the encoder in the pre-analysis process.
  • FIG. 3B shows a schematic diagram of the motion vector of the target between the current frame and the reference frame. Since the image in the video is generally changing, when a target 301 moves, its position will change but the shape and color are basically unchanged.
  • The decoder only needs the motion vector to express that the target moves from position 31 in the reference frame to position 32 in the current frame. Assuming the ideal situation in FIG. 3B, in which no attribute of the target changes except its position, the difference between the two images consists only of the data amount of the motion vector.
  • Therefore, for the current block, the motion vector can be used to determine the unique corresponding matching block in a reference frame, and directly using the motion vector given by the encoder to determine the matching block can reduce the amount of calculation.
  • the number of matching blocks is the same as the number of reference frames, that is, one matching block of the current block can be determined in one reference frame.
  • Since the current frame corresponds to multiple reference frames, one current block can determine multiple corresponding matching blocks. For example, a P frame has 2 reference frames, so the current block in the P frame has 2 matching blocks; a B frame has 4 reference frames, so the current block in the B frame has 4 matching blocks.
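Locating a matching block via the encoder's motion vector amounts to a slice of the reference frame rather than a search; a minimal sketch, assuming the frame is a row-major 2D list and the motion vector is a whole-pixel offset:

```python
def matching_block(ref_frame, x, y, mv, size=16):
    """Return the current block's matching block in a reference frame.

    (x, y) is the top-left corner of the current block in the current frame;
    mv = (dx, dy) is the motion vector supplied by the encoder's pre-analysis,
    so no block search is performed here.
    """
    dx, dy = mv
    return [row[x + dx: x + dx + size] for row in ref_frame[y + dy: y + dy + size]]
```

For a P-frame block with 2 reference frames this is called twice (once per reference frame), and for a B-frame block four times, matching the counts stated above.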
  • For the multiple matching blocks, the matching degree between each matching block and the current block needs to be calculated, and the reference blocks of the current block are selected from the matching blocks according to the matching degree: matching blocks with a matching degree greater than the matching threshold are used as reference blocks, and matching blocks whose matching degree is less than or equal to the matching threshold are discarded.
  • For example, the matching degree algorithm can be an SSD (sum of squared differences) algorithm, which is used to measure the matching degree of two blocks: the smaller the SSD value, the smaller the difference between the current block and the matching block, and the higher the matching degree. Alternatively, a correlation matching method using multiplication can be adopted, where a larger value indicates a higher matching degree; or a correlation coefficient matching method, where a value of 1 indicates the highest matching degree and a value of -1 the lowest. Alternatively, the matching degree can be measured by the SSE (residual sum of squares): if the SSE between a matching block and the current block is less than a threshold, the matching block is used as a reference block; otherwise, it is discarded.
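The SSD-based selection described above can be sketched as follows; the threshold value passed in is an illustrative assumption (the text's embodiment uses the ref_th table in the 3D-filter step for this purpose):

```python
def ssd(block_a, block_b):
    """Sum of squared differences: smaller means the two blocks match better."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )

def select_reference_blocks(current, matching_blocks, threshold):
    """Keep matching blocks that are close enough to the current block;
    blocks whose SSD is at or above the threshold are discarded."""
    return [m for m in matching_blocks if ssd(current, m) < threshold]
```

If the returned list is empty, every candidate was rejected, which is the case where the method falls back to pure 2D noise reduction.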
  • Further, if the matching degrees between all matching blocks and the current block are less than or equal to the matching threshold, the 2D noise reduction algorithm is used to perform noise reduction processing on the current block, thereby improving the noise reduction effect.
  • Fig. 2C shows a flowchart of a method for performing noise reduction processing on the current block according to the reference block. As shown in Figure 2C, the method includes the following steps:
  • Step S2051 According to the 3D (three-dimensional) noise reduction algorithm, the weight matrix of each reference block is calculated according to the matching degree between the reference block and the current block.
  • the weight matrix of a reference block can be calculated according to the following formula:
  • w_n[i][j] is the weight matrix of the nth reference block; n denotes the nth reference block, and when n is 0 it denotes the weight of the current block itself; poc_0 is the sequence number of the frame containing the current block; poc_refn is the sequence number of the frame containing the nth reference block; α is an attenuation factor based on the reference frame distance, with a typical value of 0.95; diffVal is the absolute difference between the [i,j]th pixel of the nth reference block and the [i,j]th pixel of the current block;
  • coefTable[t_num] is a bilateral filtering level table based on the filtering level in 2D filtering; num_ref denotes the number of reference blocks.
  • SSE_n is the matching degree between the nth reference block and the current block; f(SSE_n) represents a weighting factor based on the matching degree, which can be calculated as follows:
  • ref_th is the matching-degree threshold table. When a current block of 16×16 pixels (chroma 8×8) is processed, refweight can take empirical values: for example, 225 for the luminance component and 256 for the chrominance component. ref_th takes the values {40,120,320,500} for luminance and {10,140,80,125} for chrominance.
  • Step S2052 Perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrix to obtain the output pixels of the current block after noise reduction.
  • the output pixels of the current block after noise reduction can be calculated according to the following formula:
  • denoised_pixl[i][j] is the pixel value of the pixel point [i][j] after noise reduction;
  • w_n[i][j] is the weight matrix of the nth reference block, and block_n[i][j] is the input pixels of the nth reference block; in particular, when n is 0, w_0[i][j] represents the weight matrix of the current block itself, and block_0[i][j] is the input pixels of the current block.
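The weighted summation of step S2052 can be sketched directly; the weight matrices are assumed to have been computed already (for example, from the matching-degree weighting of step S2051), with index 0 standing for the current block itself as in the text:

```python
def weighted_sum_denoise(current, reference_blocks, weights):
    """Per-pixel weighted sum of the current block and its reference blocks,
    normalized by the total weight at each pixel.

    weights[0] is the current block's own weight matrix; weights[n] for n >= 1
    belong to the reference blocks, matching the n = 0 convention in the text.
    """
    blocks = [current] + list(reference_blocks)
    height, width = len(current), len(current[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            num = sum(weights[n][i][j] * blocks[n][i][j] for n in range(len(blocks)))
            den = sum(weights[n][i][j] for n in range(len(blocks)))
            out[i][j] = num / den
    return out
```

Because the normalization is per pixel, a reference block can contribute strongly in one region of the block and weakly in another, which is what the per-pixel weight matrices buy over a single scalar blend factor.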
  • In some embodiments, to further improve the quality of noise reduction, the reference block of the current block is determined from the reconstructed frame as follows.
  • Fig. 2D shows a flowchart of a method for determining a reference block of a current block from a reference frame in an embodiment of the present application. As shown in Figure 2D, the method includes the following steps:
  • Step S241: Reconstruct and restore the reference frame to obtain a reconstructed frame, and use the reconstructed frame as the updated reference frame;
  • Step S242: Determine the reference block of the current block from the reconstructed frame serving as the updated reference frame.
  • all matching blocks come from the reconstructed frame in the encoder, that is to say, the reference blocks also come from the reconstructed frame in the encoder.
  • The reconstructed frame is obtained through the inverse of the encoding process after the encoder compresses the source image according to the various pieces of encoding information; that is, the image is reconstructed and restored to obtain the reconstructed frame, and the reference frame is determined from the reconstructed frame for use in encoding and decoding other frames.
  • The embodiment of the present application determines the reference frame from the reconstructed frames in encoding. Compared with the source image, the reconstructed frame has already been filtered and denoised; for a source video polluted by noise, the image quality of the reconstructed frame is higher, so a better filtering effect can be obtained.
  • FIG. 4A shows a flowchart of a video noise reduction processing method in an embodiment of the present application. As shown in FIG. 4A, the embodiment includes the following steps:
  • Step S401 Acquire the current frame in the target video.
  • Step S402 Determine the frame type of the current frame. If the current frame is an I frame, perform step S403; if the current frame is a P frame or B frame, perform step S404.
  • Step S403 using a bilateral filtering algorithm to reduce noise on the current frame.
  • Step S404: A block with a size of 16×16 in the current frame is taken as the current block.
  • Step S405 Use the noise monitoring algorithm to calculate the internal variance of the current block, and determine whether the internal variance is greater than the noise reduction threshold. If the internal variance is greater than the noise reduction threshold, use the reference block to perform 3D noise reduction on the current block, and perform step S407; if If the internal variance is less than or equal to the noise reduction threshold, it can be considered that the internal complexity of the current block is small, the noise reduction process can be omitted, and step S406 is executed.
  • Step S406 End the operation on the current block, take the next block as the current block, and recalculate the internal variance of the next block as the current block.
  • step S407 before performing 3D noise reduction, the bilateral filtering algorithm is used to perform noise reduction on the current block.
  • Step S408 Obtain a reference frame corresponding to the current frame determined in the pre-analysis stage of the encoder.
  • If the current frame is a P frame, the number of reference frames is 2; if the current frame is a B frame, the number of reference frames is 4.
  • Step S409 Obtain a motion vector for each reference frame of the current block determined by the encoder.
  • Step S410 Determine a matching block of the current block according to the reference frame and the corresponding motion vector.
  • If the current block is a block in a P frame, the current block corresponds to 2 matching blocks; if the current block is a block in a B frame, the current block corresponds to 4 matching blocks.
  • Step S411: Calculate the SSE between each matching block and the current block, and determine the reference blocks of the current block according to the calculated SSE. Specifically, a matching block whose SSE is less than ref_th[3] in Formula 5 can be used as a reference block of the current block, and a matching block whose SSE is greater than or equal to ref_th[3] in Formula 5 is discarded.
  • step S412 it is determined whether the number of reference blocks corresponding to the current block is 0, if the number is 0, step S413 is executed; if the number is not equal to 0, step S414 is executed.
  • Step S413 End the processing of the current block, and continue to process the next block in the current frame.
  • Step S414 using the matching degree between the reference block and the current block, calculate the weight matrix of each reference block.
  • Step S415 Perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrix to obtain the output pixels of the current block after noise reduction.
  • Fig. 4B shows a schematic diagram of the 3D filtering process.
  • the time-domain reference relationship within the encoder is multiplexed, which greatly reduces the complexity of the video noise reduction algorithm, thereby reducing the calculation time while ensuring the definition of the video.
  • a good balance of effect and performance can be obtained in application scenarios such as real-time encoding or transcoding.
  • The processing speed for 1280×720p video can be as high as 200 fps or more at fast presets, which meets the needs of live transcoding and is especially suitable for live broadcast scenarios.
  • Figure 5 shows the image comparison before and after video noise reduction.
  • In each comparison, the left image is the original image and the right image is the image after noise reduction. It can be seen that the video noise reduction processing method in the embodiment of the present application can effectively remove the noise generated during shooting, improve video clarity, and greatly reduce the bit rate, saving up to 100% of the bit rate for some noisy videos.
  • FIG. 6 shows a structural block diagram of a video noise reduction processing apparatus provided by an embodiment of the present application.
  • The video noise reduction processing device is implemented, through hardware or a combination of software and hardware, as all or part of the server in FIG. 1 or as all or part of the terminal in FIG. 1.
  • the device includes: an acquisition unit 601, a judgment unit 602, a determination unit 603, a selection unit 604, and a calculation unit 605.
  • the acquiring unit 601 is configured to acquire the current frame of the target video
  • the judging unit 602 is configured to determine that the current frame is a P frame or a B frame in the target video
  • the determining unit 603 is configured to determine the reference frame of the current frame from the target video according to the time-domain reference relationship between the current frame and the reference frame established in advance by the encoder;
  • the selecting unit 604 is configured to determine the reference block corresponding to the current block from the reference frame; the current block is any block in the current frame;
  • the calculation unit 605 is configured to perform noise reduction processing on the current block according to the reference block.
  • the selecting unit 604 is specifically configured to:
  • the matching block whose matching degree is greater than the matching threshold is used as the reference block of the current block.
  • the calculation unit 605 is also used for:
  • the calculation unit 605 is specifically configured to:
  • the weight matrix of each reference block is calculated according to the matching degree between the reference block and the current block;
  • the pixels of the current block, the pixels of each reference block, and the corresponding weight matrix are used to perform a weighted summation to obtain the output pixels of the current block after noise reduction.
  • the calculation unit 605 is also used for:
  • the judging unit 602 is further configured to determine that the current frame is an I frame in the target video
  • the calculation unit is also used to perform noise reduction processing on the I frame by using a 2D noise reduction algorithm.
  • the selecting unit 604 is used to:
  • the selecting unit 604 is also used for:
  • Using the noise monitoring algorithm, it is determined that the internal variance of the current block is greater than the noise reduction threshold.
  • FIG. 7 shows a structural block diagram of a computing device provided by an embodiment of the present application.
  • the computing device 700 may be implemented as the server or terminal in FIG. 1. Specifically:
  • the device 700 includes a central processing unit (CPU) 701, a system memory 704 including a random access memory (RAM) 702 and a read only memory (ROM) 703, and a system bus 705 connecting the system memory 704 and the central processing unit 701.
  • the server 700 also includes a basic input/output system (I/O system) 706 that helps transfer information between various devices in the computer, and a mass storage device 707 that stores the operating system 713, application programs 714, and other program modules 715.
  • the basic input/output system 706 includes a display 708 for displaying information and an input device 709 such as a mouse and a keyboard for the user to input information.
  • the display 708 and the input device 709 are both connected to the central processing unit 701 through the input and output controller 710 connected to the system bus 705.
  • the basic input/output system 706 may also include an input and output controller 710 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 710 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) connected to the system bus 705.
  • the mass storage device 707 and its associated computer-readable medium provide non-volatile storage for the server 700. That is, the mass storage device 707 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • Computer-readable media may include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, disk storage or other magnetic storage devices.
  • the server 700 may also be connected to a remote computer on the network to run through a network such as the Internet. That is, the server 700 can be connected to the network 712 through the network interface unit 711 connected to the system bus 705, or in other words, the network interface unit 711 can also be used to connect to other types of networks or remote computer systems (not shown).
  • the memory also includes one or more programs, one or more programs are stored in the memory, and the one or more programs include instructions for performing the video noise reduction processing method provided in the embodiments of the present application.
  • The program can be stored in a computer-readable storage medium, which may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Abstract

This application relates to the field of image processing technology, and discloses a video noise reduction processing method, apparatus, and storage medium for reducing the amount of computation of video noise reduction and relieving network bandwidth pressure. The method includes: acquiring the current frame of a target video; determining that the current frame is a P frame or a B frame in the target video; determining a reference frame of the current frame from the target video according to a time-domain reference relationship between the current frame and the reference frame established in advance by an encoder; determining a reference block corresponding to a current block from the reference frame, the current block being any block in the current frame; and performing noise reduction processing on the current block according to the reference block.

Description

Video noise reduction processing method, apparatus, and storage medium
This application claims priority to Chinese Patent Application No. 201911251050.6, titled "Video noise reduction processing method and apparatus", filed with the China National Intellectual Property Administration on December 9, 2019, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of image processing technology, and in particular to a video noise reduction processing method, apparatus, and storage medium.
Background
With the development of digital media technology and computer technology, video is used in various fields, such as mobile communication, network surveillance, and network television. Because of lens and cost limitations, the luminous flux on a single pixel is small, and captured video contains a large amount of random noise, which is especially noticeable in dark scenes. Such noise greatly degrades image clarity and quality; it also enlarges the residual during encoding, which increases the bit rate and aggravates the burden on networks and storage.
Summary
According to an embodiment of this application, a video noise reduction processing method is provided, including:
acquiring the current frame of a target video;
determining that the current frame is a P frame or a B frame in the target video;
determining a reference frame of the current frame from the target video according to a time-domain reference relationship between the current frame and the reference frame established in advance by an encoder;
determining a reference block corresponding to a current block from the reference frame, the current block being any block in the current frame; and
performing noise reduction processing on the current block according to the reference block.
According to an embodiment of this application, a video noise reduction processing apparatus is provided, the apparatus including:
an acquiring unit, configured to acquire the current frame of a target video;
a judging unit, configured to determine that the current frame is a P frame or a B frame in the target video;
a determining unit, configured to determine a reference frame of the current frame from the target video according to a time-domain reference relationship between the current frame and the reference frame established in advance by an encoder;
a selecting unit, configured to determine a reference block corresponding to a current block from the reference frame, the current block being any block in the current frame; and
a calculating unit, configured to perform noise reduction processing on the current block according to the reference block.
In an embodiment, the selecting unit is specifically configured to:
determine a matching block of the current block from the reference frame according to a motion vector corresponding to the current block in the encoder, where the current frame has at least one reference frame and one matching block of the current block exists in each reference frame;
determine a matching degree between each matching block and the current block; and
use a matching block whose matching degree is greater than a matching threshold as a reference block of the current block.
In some embodiments, the calculating unit is further configured to:
if it is determined that the matching degrees between all matching blocks and the current block are less than or equal to the matching threshold, perform noise reduction processing on the current block by using a two-dimensional (2D) noise reduction algorithm.
In some embodiments, the calculating unit is specifically configured to:
calculate, based on a three-dimensional (3D) noise reduction algorithm, a weight matrix of each reference block according to the matching degree between the reference block and the current block; and
perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrices to obtain the output pixels of the current block after noise reduction.
In some embodiments, the calculating unit is further configured to:
perform pre-noise-reduction processing on the current block by using a 2D noise reduction algorithm.
In some embodiments, the judging unit is further configured to determine that the current frame is an I frame in the target video; and
the calculating unit is further configured to perform noise reduction processing on the I frame by using a 2D noise reduction algorithm.
In some embodiments, the selecting unit is configured to:
reconstruct and restore the reference frame to obtain a reconstructed frame, and use the reconstructed frame as the updated reference frame; and
determine the reference block of the current block from the reconstructed frame serving as the updated reference frame.
In some embodiments, the selecting unit is further configured to:
determine, by using a noise monitoring algorithm, whether the internal variance of the current block is greater than a noise reduction threshold, and if the internal variance of the current block is greater than the noise reduction threshold, perform noise reduction processing on the current block.
According to an embodiment of this application, a computing device is provided, including at least one processor and at least one memory, where the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the video noise reduction processing method provided in the embodiments of this application.
According to an embodiment of this application, a storage medium is provided, the storage medium storing computer instructions that, when run on a computer, cause the computer to perform the steps of the video noise reduction processing method provided in the embodiments of this application.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the related art more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of this application.
FIG. 1 is a system architecture diagram of a video noise reduction processing system in an embodiment of this application;
FIG. 2A is a flowchart of a video noise reduction processing method in an embodiment of this application;
FIG. 2B is a flowchart of a method for determining a reference block corresponding to a current block from a reference frame in an embodiment of this application;
FIG. 2C is a flowchart of a method for performing noise reduction processing on the current block according to the reference block in an embodiment of this application;
FIG. 2D is a flowchart of a method for determining a reference block of a current block from a reference frame in an embodiment of this application;
FIG. 3A is a schematic diagram of a video sequence in an embodiment of this application;
FIG. 3B is a schematic diagram of a motion vector of a target between a current frame and a reference frame in an embodiment of this application;
FIG. 4A is a flowchart of a video noise reduction processing method in an embodiment of this application;
FIG. 4B is a schematic diagram of the 3D filtering process in an embodiment of this application;
FIG. 5 is an image comparison before and after video noise reduction in an embodiment of this application;
FIG. 6 is a structural block diagram of a video noise reduction processing apparatus in an embodiment of this application;
FIG. 7 is a structural block diagram of a device provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of this application are described below clearly and completely with reference to the accompanying drawings in the embodiments of this application. Obviously, the described embodiments are only some rather than all of the embodiments of the technical solutions of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments recorded in this application document without creative efforts shall fall within the protection scope of the technical solutions of this application.
The terms "first" and "second" in the specification, the claims, and the above drawings of this application are used to distinguish different objects, not to describe a specific order. In addition, the term "include" and any variant thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
Some concepts involved in the embodiments of this application are introduced below.
Video encoding: the raw video data generated after image information is captured has a very large data volume. For some applications in which video is played back locally right after capture, compression technology does not need to be considered. In practice, however, most applications involve video transmission and storage, and transmission networks and storage devices cannot tolerate the huge data volume of raw video data, so the raw video data must be encoded and compressed before being transmitted and stored.
Video encoding includes intra-frame coding and inter-frame coding. Intra-frame coding is spatial-domain coding, which compresses an image by exploiting its spatial redundancy; it processes a single independent image and does not span multiple images. Spatial-domain coding relies on the similarity between adjacent pixels in an image and on the dominant spatial frequencies of pattern regions. Inter-frame coding is time-domain coding, which compresses images by exploiting the temporal redundancy between a group of consecutive images. If a frame is available to the decoder, the decoder only needs the difference between the two frames to obtain the next frame. For example, several frames with gentle motion have large similarity and small differences, while several frames with violent motion have small similarity and large differences. After the complete image information of one frame is obtained, the following frame can be derived from the difference between the two frames, thereby compressing the data volume. Time-domain coding relies on the similarity between consecutive image frames, "predicting" the current image as much as possible from image information that has already been received and processed.
Frame: a single image picture in a video sequence.
I frame (Intra-coded picture, often called a key frame): contains complete image information, belongs to intra-coded images, contains no motion vector, and does not need to refer to other frames during encoding and decoding. I frames are used to prevent the accumulation and spread of errors.
P frame (Predictive-coded picture): an inter-coded frame that is predictively coded in the encoder using a previous I frame or P frame.
B frame (Bi-directionally predicted picture): an inter-coded frame that is bi-directionally predictively coded in the encoder using previous and subsequent I frames or P frames.
Pre-analysis: before frame data is formally encoded, a certain number of frames are analyzed in advance, and the pre-analysis data is applied to guide the subsequent encoding process.
Reconstructed frame: in the encoder, after the source image is compressed, the various pieces of encoding information need to be packed (entropy coded) into the output bit stream on one side; on the other side, the inverse of the encoding process needs to be performed according to the various pieces of encoding information, that is, the image is reconstructed and restored to obtain a reference frame consistent with that in the decoder, for use in encoding/decoding other frames. Such a reconstructed and restored frame is not equal to the source image; we call it a reconstructed frame.
Reference frame: in video encoding and decoding, a reconstructed frame that serves as reference data for other frames, from which other frames obtain inter-frame reference data during encoding/decoding.
Motion vector: a two-dimensional vector describing the positional offset that occurs when a coding block in the encoder moves from its location to another location. Simply put, a group of consecutive images records the motion of a target, and the motion vector measures the degree of motion of the target between two frames. A motion vector consists of a horizontal displacement and a vertical displacement.
Block: the basic unit of encoding processing, with types such as macroblock, sub-macroblock, and block. A block is 4×4 pixels; macroblocks and sub-macroblocks consist of multiple blocks. A macroblock is usually 16×16 pixels and is classified into I, B, and P macroblocks.
BM3D (Block Matching 3D): an image noise reduction technique, widely recognized in the industry for its good results, that searches for matching blocks inside an image and applies further transform or filtering techniques.
3D video noise reduction: a technique that reduces noise in video by using both the information of the current frame itself (spatial information) and the reference information of its preceding and following frames (temporal information), so as to improve image quality or clarity.
2D noise reduction: noise reduction performed inside a single frame by analyzing the relationships among spatially adjacent pixel data within the frame, also called spatial noise reduction.
Bilateral filtering algorithm: a 2D neighborhood noise reduction algorithm that, when weighting neighboring pixels, jointly considers the distance and the pixel difference between the pixel to be filtered and its surrounding pixels.
SSD (sum of squared differences): commonly used to measure the matching degree of two blocks; the smaller the value, the smaller the difference between the two blocks and the higher the matching degree.
At present, video noise reduction technology based on the BM3D (Block Matching 3D) algorithm is generally used to perform noise reduction on video. However, because of the complexity of this algorithm, the processing speed is very slow, which has a considerable impact especially on bandwidth-limited real-time streaming media services, mobile video telephony, network video chat, and the like.
The embodiments of this application propose a video noise reduction processing method, mainly applied to scenarios such as real-time video calls, instant video communication, video conferencing, real-time video capture, and live transcoding, which can reduce the amount of computation of video noise reduction and relieve network bandwidth pressure. Refer to FIG. 1, which shows the architecture of a video noise reduction processing system provided by an embodiment of this application, including a server 101, a first terminal 102, and a second terminal 103.
The first terminal 102 and/or the second terminal 103 may be electronic devices with wireless communication capability such as mobile phones, tablet computers, or dedicated handheld devices, or may be devices connected to the Internet by wired access, such as personal computers (PCs), notebook computers, and servers.
The server 101 may be a network device such as a computer. The server 101 may be an independent device or a server cluster formed by multiple servers. The server 101 may use cloud computing technology for information processing.
The network in the system may be the Internet, or a mobile communication system such as the Global System for Mobile Communications (GSM) or a Long Term Evolution (LTE) system.
The video noise reduction processing method in the embodiments of this application may be performed by the server 101. The video in the embodiments of this application may be a communication video stream between the first terminal 102 and the second terminal 103, such as video captured by the capture device of the first terminal 102 and transmitted to the second terminal 103, or video captured by the capture device of the second terminal 103 and transmitted to the first terminal 102.
In this case, the server 101 may be a server that provides video forwarding. The video capture device may be integrated in the terminal, for example, the front camera or rear camera of a mobile phone terminal; the image capture device may also be a device that exists independently of the terminal but is communicatively connected to it, such as a camera connected to a computer terminal.
In the embodiments of this application, the server 101 may perform noise reduction on the target video transmitted from the first terminal 102 to the second terminal 103, or on the target video transmitted from the second terminal 103 to the first terminal 102, or on both the target video stream transmitted from the first terminal 102 to the second terminal 103 and the target video stream transmitted from the second terminal 103 to the first terminal 102 at the same time.
In other embodiments, the video noise reduction processing method may also be performed by a terminal; compared with the foregoing embodiments, the main difference is that the executing entity changes from the server to the terminal. For example, the first terminal 102 may perform noise reduction on the target video it transmits to the second terminal 103, on the target video the second terminal 103 transmits to it, or on both at the same time. The second terminal 103 is similar to the first terminal 102. The video noise reduction processing method of the embodiments of this application performed by a terminal may also be performed by a client installed on the terminal.
In still other embodiments, the video noise reduction processing method may also be performed jointly by the server 101 and a terminal: the main noise reduction logic is placed on the user terminal, while the logic for configuring the encoder is placed on the server. For example, the terminal provides the target video to the server; the server generates corresponding encoding parameters from the provided target video, including the reference frames corresponding to P frames or B frames, and passes them to the terminal side; and the terminal controls the encoder to determine reference blocks according to the encoding parameters and uses the reference blocks to perform noise reduction on the current block.
It should be noted that the application scenarios mentioned above are shown only to facilitate understanding of the spirit and principles of this application, and the embodiments of this application are not limited in this respect. On the contrary, the embodiments of this application can be applied to any applicable scenario.
The video noise reduction processing method provided in the embodiments of this application is described below with reference to the application scenario shown in FIG. 1.
Refer to FIG. 2A. An embodiment of this application provides a video noise reduction processing method, performed by a computing device, which may be a server or a terminal. As shown in FIG. 2A, the method includes:
Step S201: Acquire the current frame of the target video.
In a specific implementation, the target video may be video captured by a capture device integrated in the terminal, or video captured by an independent capture device connected to the terminal through a wired or wireless network. After the terminal obtains the target video, the terminal may perform noise reduction on it, or the terminal may send the target video to the server, which performs the noise reduction.
Step S202: Determine that the current frame is a P frame or a B frame in the target video.
Inside an encoder, for example X264 or X265 (encoder implementations of the video codec standards), the frame type of each frame is decided in the pre-analysis stage by pre-analyzing each frame. Specifically, for an image downscaled to 1/2 of each frame's width and height, the complexity, intra-frame prediction, inter-frame prediction analysis, statistical intra/inter cost comparison, scene detection, and so on are combined with certain specified encoding parameters, such as the number of B frames or the GOP (Group Of Pictures) size, to decide the frame type of each frame.
Specifically, a GOP is a stream of data composed of a sequence of images whose content does not differ greatly. When there is little motion change, a GOP can be very long, because little motion change means the content of the picture changes little, so one I frame can be set, followed by multiple P frames and B frames. When there is much motion change, a GOP may be short, for example containing only one I frame and three or four P frames. An I frame is a frame compressed by maximally removing spatial redundancy in the image; it carries all image information, can be decoded independently without referring to other frames, and is called an intra-coded frame. Every video contains at least one I frame as its first frame, and the other I frames in the video are used to improve video quality. A P frame encodes/decodes its data by using the differences between this frame and a previous frame (an I frame or P frame); a B frame encodes/decodes its data by using the differences between this frame and previous frames as well as the differences between this frame and subsequent frames. The 3D noise reduction algorithm in the embodiments of this application is mainly aimed at the P frames and B frames in the target video.
Step S203: Determine the reference frame of the current frame from the target video according to the time-domain reference relationship between the current frame and the reference frame established in advance by the encoder.
Specifically, if the current frame is a P frame or a B frame, the encoder determines time-domain reference relationships such as the reference frames and motion vectors of the current frame during pre-analysis, for subsequent encoding and decoding of the current frame. The embodiments of this application reuse the time-domain reference relationships, such as reference frames and motion vectors, that the encoder uses to encode/decode the current frame, and use the reference frames of the current frame to reduce its noise. For example, the encoder determines two forward reference frames for a P frame; for a B frame it gives two forward reference frames and two backward reference frames, four reference frames in total.
For example, in the video sequence shown in FIG. 3A, for the P frame with sequence number 5, the corresponding reference frames are the I frame with sequence number 0 and the P frame with sequence number 3; for the B frame with sequence number 4, the corresponding reference frames are the two forward frames with sequence numbers 0 and 3 and the two backward frames with sequence numbers 7 and 9.
A reference frame may or may not be adjacent to the current frame, and different reference frames corresponding to the same current frame may or may not be adjacent to one another. In addition, the number of reference frames and their positional relationship with the current frame can be set arbitrarily, which is not limited in the embodiments of this application.
Step S204: Determine the reference block corresponding to the current block from the reference frame, the current block being any block in the current frame.
Specifically, the embodiments of this application process each frame of the target video in units of blocks, and a reference block has the same size as the current block. The size of the current block here may be 8×8 pixels, with the corresponding reference block also 8×8 pixels; or the current block may be 16×16 pixels, with the corresponding reference block also 16×16 pixels. Of course, the embodiments of this application do not limit the block size.
Step S205: Perform noise reduction processing on the current block according to the reference block.
In the embodiments of this application, the current frame of the target video is acquired inside the encoder; after the current frame is determined to be a P frame or a B frame in the target video, the encoder is used to determine the reference frame of the current frame from the target video; the reference block corresponding to the current block is determined from the reference frame, the current block being any block in the current frame; and noise reduction processing is performed on the current block according to the reference block. Since the reference relationship between the current frame and the reference frame has to be established inside the encoder for video encoding, the embodiments of this application remove noise from the current frame by reusing this reference relationship inside the encoder, which can reduce the amount of calculation, shorten the calculation time, relieve network bandwidth pressure, and meet the requirements of real-time video noise reduction.
Since the I frame in a video sequence does not need to refer to other frames during encoding and decoding, in the embodiments of this application, for an I frame, 2D denoising is performed directly without referring to other frames. After the current frame of the target video is acquired, the method further includes:
if it is determined that the current frame is an I frame in the target video, performing noise reduction processing on the I frame by using a 2D noise reduction algorithm.
Specifically, after the current frame is acquired, if the current frame is an I frame, a 2D noise reduction algorithm is used for noise reduction; if the current frame is a P frame or a B frame, a 3D noise reduction algorithm is used. The two noise reduction algorithms are introduced in detail below.
The 2D noise reduction algorithm in the embodiments of this application may be a region bilateral filtering algorithm, a Gaussian filtering algorithm, or the like. The region bilateral filtering algorithm is introduced below as an example. In the n×m region bilateral filtering algorithm, the n×m region generally takes 3×3 pixels, 5×5 pixels, or 7×7 pixels. Taking 5×5 as an example, the filtering is calculated with the following formula:
denoised_pixl[i][j] = ( Σ_{m=-2}^{2} Σ_{n=-2}^{2} coef[m][n] × src[i+m][j+n] ) / ( Σ_{m=-2}^{2} Σ_{n=-2}^{2} coef[m][n] ) …… Formula 1
where denoised_pixl[i][j] is the pixel value of pixel [i][j] after noise reduction; src is the pixel value at block[i+m][j+n]; and coef[m][n] is the filter coefficient, selected according to the absolute value diffVal of the difference between each pixel and the center pixel in the n×m region, their positional relationship, and the bilateral filtering level table, which can be determined according to the following formula:
(Formula 2: coef[m][n] is selected from the bilateral filtering level table coefTable[t_num] as a function of diffVal and the spatial distance [m][n]; the original equation is rendered as an image in the source document.)
where coefTable[t_num] is the bilateral filtering level table based on the filtering level in 2D filtering; in one embodiment, t_num ranges from 0 to 9, representing 10 noise reduction levels; in one embodiment, the values of coefTable[t_num] are {2,3,5,8,11,15,19,24,29,32}; diffVal is the difference between the current pixel value and a pixel value in the filter template, and an overly large diffVal can be clamped during implementation (for example, to 0~31); [m][n] is the distance between the current pixel and the reference pixel in the n×m region (the distance expression is rendered as an image in the source document); δ² is used to measure the noise level, and in one embodiment its value may be t_num<4 ? 4 : (t_num<7 ? 6 : 9).
Further, because the 2D noise reduction algorithm reduces noise by local weighted averaging, some blurring of details or boundaries is inevitable to a certain extent. To improve the 2D noise reduction effect, a certain amount of sharpening compensation can be added after noise reduction, and the pixel value with sharpening compensation is calculated according to the following formula:
denoised_ed_pixl = average + sharp_ratio * (denoised_ed_pixl - average) …… Formula 3
where average is the mean of the pixels in the n×m region corresponding to the current pixel, and sharp_ratio is the sharpening compensation parameter, of type double, with a value from -2.0 to 2.0.
For a P frame or a B frame in the target video, the embodiments of this application perform the filtering calculation in units of a current block with a size of 8×8 pixels or 16×16 pixels. The specific process is as follows:
Before noise reduction with the 3D noise reduction algorithm, the current block can also first be pre-denoised with the 2D noise reduction algorithm to further improve the noise reduction effect.
Before the current block is denoised with the 3D noise reduction algorithm, a noise monitoring algorithm can also first be used to calculate the internal variance of the current block. If the internal variance of the current block is greater than the noise reduction threshold, the subsequent noise reduction processing is performed; if the internal variance of the current block is less than or equal to the noise reduction threshold, the subsequent noise reduction processing is skipped. The internal variance of the current block can represent its internal complexity: when the internal variance is less than or equal to the noise reduction threshold, the internal complexity of the current block can be considered small and the noise reduction process can be omitted, further reducing the amount of calculation and shortening the noise reduction duration of the video.
If the internal variance of the current block is determined to be greater than the noise reduction threshold, the reference blocks are used to perform 3D noise reduction on the current block. Regarding the determination of reference blocks, FIG. 2B shows a flowchart of a method for determining the reference block corresponding to the current block from a reference frame in an embodiment of this application. As shown in FIG. 2B, the method includes the following steps:
Step S2041: Determine a matching block of the current block from a reference frame according to the motion vector corresponding to the current block in the encoder, where the current frame has at least one reference frame and one matching block of the current block exists in each reference frame;
Step S2042: Determine the matching degree between each matching block and the current block;
Step S2043: Use a matching block whose matching degree is greater than a matching threshold as a reference block of the current block.
In a specific implementation, a block of 8×8 pixels or 16×16 pixels may be used as the unit for noise reduction, that is, the size of the current block may be 8×8 pixels or 16×16 pixels. Correspondingly, the size of the matching block corresponding to the current block in the reference frame is also 8×8 pixels or 16×16 pixels. The relationship between the position of the current block in the current frame and the position of the matching block in the reference frame is expressed by a motion vector, which is given by the encoder during pre-analysis.
FIG. 3B shows a schematic diagram of the motion vector of a target between the current frame and a reference frame. Since the images in a video generally change, when a target 301 moves, its position changes while its shape, color, and so on remain basically unchanged; the decoder only needs the motion vector to express that the target moves from position 31 in the reference frame to position 32 in the current frame. Assuming the ideal situation in FIG. 3B, in which no attribute of the target changes except its position, the difference between the two images consists only of the data amount of the motion vector.
Therefore, for the current block, the motion vector can be used to determine the unique corresponding matching block in a reference frame; directly using the motion vector given by the encoder to determine the matching block can reduce the amount of calculation. The number of matching blocks equals the number of reference frames, that is, one matching block of the current block can be determined in each reference frame. And since the current frame corresponds to multiple reference frames, multiple corresponding matching blocks can be determined for one current block. For example, a P frame has 2 reference frames, so a current block in the P frame has 2 matching blocks; a B frame has 4 reference frames, so a current block in the B frame has 4 matching blocks.
For the multiple matching blocks, the matching degree between each matching block and the current block needs to be calculated, and the reference blocks of the current block are selected from the matching blocks according to the matching degree: matching blocks whose matching degree is greater than the matching threshold are used as reference blocks, and matching blocks whose matching degree is less than or equal to the matching threshold are discarded. For example, the matching degree algorithm may be the SSD (sum of squared differences) algorithm, which measures the matching degree of two blocks: the smaller the SSD value, the smaller the difference between the current block and the matching block and the higher the matching degree. Alternatively, a correlation matching method using multiplication may be used, where a larger value indicates a higher matching degree; or a correlation coefficient matching method, where a value of 1 indicates the highest matching degree and -1 the lowest. Alternatively, the matching degree can be measured by the SSE (residual sum of squares): if the SSE between a matching block and the current block is less than a threshold, the matching block is used as a reference block; otherwise it is discarded.
Further, if the matching degrees between all matching blocks and the current block are less than or equal to the matching threshold, the 2D noise reduction algorithm is used to perform noise reduction on the current block, thereby improving the noise reduction effect.
For reference blocks whose matching degree is greater than the matching threshold, noise reduction is performed on the current block according to the reference blocks. FIG. 2C shows a flowchart of a method for performing noise reduction on the current block according to the reference blocks. As shown in FIG. 2C, the method includes the following steps:
Step S2051: Based on the 3D (three-dimensional) noise reduction algorithm, calculate the weight matrix of each reference block according to the matching degree between the reference block and the current block.
Specifically, the weight matrix of a reference block can be calculated according to the following formula:
(Formula 4: the weight w_n[i][j] is computed from the attenuation factor α raised to the reference-frame distance |poc_0 - poc_refn|, the bilateral filtering level table coefTable[t_num] applied to diffVal, and the matching-degree weighting factor f(SSE_n); the original equation is rendered as an image in the source document.)
where w_n[i][j] is the weight matrix of the nth reference block; n denotes the nth reference block, and when n is 0 it denotes the weight of the current block itself; poc_0 is the sequence number of the frame containing the current block; poc_refn is the sequence number of the frame containing the nth reference block; α is an attenuation factor based on the reference frame distance, with a typical value of 0.95; diffVal is the absolute difference between the [i,j]th pixel of the nth reference block and the [i,j]th pixel of the current block; coefTable[t_num] is the bilateral filtering level table based on the filtering level in 2D filtering; and num_ref denotes the number of reference blocks. In addition, SSE_n is the matching degree between the reference block and the current block, and f(SSE_n) is a weighting factor based on the matching degree, which can be calculated as follows:
(Formula 5: f(SSE_n) is a piecewise weighting factor determined by comparing SSE_n against the threshold table ref_th and scaled by the empirical parameter refweight; the original equation is rendered as an image in the source document.)
where ref_th is the matching-degree threshold table. When a current block of 16×16 pixels (chroma 8×8) is processed, refweight can take empirical values: for example, 225 for the luminance component and 256 for the chrominance component. ref_th takes the values {40,120,320,500} for luminance and {10,140,80,125} for chrominance.
Step S2052: Perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrices to obtain the output pixels of the current block after noise reduction.
Specifically, the output pixels of the current block after noise reduction can be calculated according to the following formula:
denoised_pixl[i][j] = ( Σ_{n=0}^{num_ref} w_n[i][j] × block_n[i][j] ) / ( Σ_{n=0}^{num_ref} w_n[i][j] ) …… Formula 6
where denoised_pixl[i][j] is the pixel value of pixel [i][j] after noise reduction; w_n[i][j] is the weight matrix of the nth reference block; and block_n[i][j] is the input pixels of the nth reference block. In particular, when n is 0, w_0[i][j] denotes the weight matrix of the current block itself and block_0[i][j] is the input pixels of the current block.
In some embodiments, to further improve the quality of noise reduction, the reference block of the current block is determined from the reference frame as follows. FIG. 2D shows a flowchart of a method for determining the reference block of the current block from a reference frame in an embodiment of this application. As shown in FIG. 2D, the method includes the following steps:
Step S241: Reconstruct and restore the reference frame to obtain a reconstructed frame, and use the reconstructed frame as the updated reference frame;
Step S242: Determine the reference block of the current block from the reconstructed frame serving as the updated reference frame.
In a specific implementation, all matching blocks come from reconstructed frames in the encoder; that is to say, the reference blocks also all come from reconstructed frames in the encoder. A reconstructed frame is obtained through the inverse of the encoding process after the encoder compresses the source image according to the various pieces of encoding information; that is, the image is reconstructed and restored to obtain the reconstructed frame, and the reference frame is determined from the reconstructed frame for use in encoding/decoding other frames. The embodiments of this application determine the reference frame from the reconstructed frames in encoding. Compared with the source image, a reconstructed frame has already been filtered and denoised; for a source video polluted by noise, the image quality of the reconstructed frame is higher, so a better filtering effect can be obtained.
The above flow is described in detail below with a specific embodiment. FIG. 4A is a flowchart of a video denoising method according to an embodiment of this application. As shown in FIG. 4A, the embodiment includes the following steps:
Step S401: obtain the current frame of the target video.
Step S402: determine the frame type of the current frame; if the current frame is an I frame, perform step S403; if the current frame is a P frame or a B frame, perform step S404.
Step S403: denoise the current frame using a bilateral filtering algorithm.
Step S404: take a block of size 16×16 in the current frame as the current block.
Step S405: compute the internal variance of the current block using a noise detection algorithm, and determine whether the internal variance is greater than the denoising threshold; if it is, perform 3D denoising on the current block using reference blocks and proceed to step S407; if the internal variance is less than or equal to the denoising threshold, the internal complexity of the current block can be considered low, the denoising process can be skipped, and step S406 is performed.
Step S406: end the processing of the current block, take the next block as the current block, and compute the internal variance of that block.
Step S407: before the 3D denoising, pre-denoise the current block using a bilateral filtering algorithm.
Step S408: obtain the reference frames of the current frame determined in the pre-analysis stage of the encoder.
If the current frame is a P frame, there are 2 reference frames; if the current frame is a B frame, there are 4 reference frames.
Step S409: obtain the motion vector of the current block for each reference frame as determined by the encoder.
Step S410: determine the matched blocks of the current block according to the reference frames and the corresponding motion vectors.
If the current block is a block in a P frame, it has 2 matched blocks; if it is a block in a B frame, it has 4 matched blocks.
Step S411: compute the SSE between each matched block and the current block, and select the reference blocks of the current block according to the computed SSE values. Specifically, a matched block whose SSE is smaller than ref_th[3] in Formula 5 can be taken as a reference block of the current block, and a matched block whose SSE is greater than or equal to ref_th[3] in Formula 5 is discarded.
Step S412: determine whether the number of reference blocks of the current block is 0; if it is 0, perform step S413; if it is not 0, perform step S414.
Step S413: end the processing of the current block and continue with the next block in the current frame.
Step S414: compute the weight matrix of each reference block using the matching degree between the reference block and the current block.
Step S415: perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrices to obtain the denoised output pixels of the current block.
The denoising computation is performed on every block of the current frame in turn.
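Under heavy simplifications, the per-block flow of steps S404 to S415 can be sketched as follows. All helper details here (flat 1-D blocks, uniform fusion weights in place of the per-pixel weight matrices, and the threshold values) are assumptions for illustration only, not the patented implementation:

```python
def denoise_block(current, matched_blocks, noise_threshold, sse_threshold):
    # Step S405: noise detection via the block's internal variance;
    # low-complexity blocks skip denoising (step S406).
    mean = sum(current) / len(current)
    variance = sum((p - mean) ** 2 for p in current) / len(current)
    if variance <= noise_threshold:
        return current

    # Step S411: keep only matched blocks whose SSE is below the threshold.
    def sse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    refs = [m for m in matched_blocks if sse(current, m) < sse_threshold]
    if not refs:
        # Step S413: no reference block survives; leave the block unchanged.
        return current

    # Steps S414-S415: weighted summation. Uniform weights stand in for the
    # per-pixel weight matrices of the real scheme (an assumption).
    blocks = [current] + refs
    return [sum(b[k] for b in blocks) // len(blocks) for k in range(len(current))]
```

For example, a block [0, 100, 0, 100] fused with a close match [2, 102, 2, 98] yields [1, 101, 1, 99], while a poorly matching candidate is rejected and the block passes through unchanged.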
There are currently many kinds of video denoising algorithms, but each class has its limitations: methods with a small computational cost generally yield poorer video clarity and bitrate, while algorithms with good denoising quality are computationally expensive and cannot meet the real-time requirements of video.
FIG. 4B is a schematic diagram of the 3D filtering process. As shown in FIG. 4B, the embodiments of this application reuse the temporal reference relationships inside the encoder, which greatly reduces the complexity of the video denoising algorithm, shortening the computation time while preserving video clarity, and achieving a good balance between quality and performance in application scenarios such as real-time encoding or transcoding. After the algorithm was integrated into X264, on an Intel E5-2670 V4 CPU it processes 1280×720p video at more than 200 fps at the fast preset, which satisfies live-streaming transcoding requirements and makes it particularly suitable for live-streaming scenarios. FIG. 5 shows a comparison of images before and after video denoising; in each pair, the left image is the original and the right image is the denoised one. It can be seen that the video denoising method in the embodiments of this application effectively removes noise generated during shooting, improves video clarity, and greatly reduces the bitrate, saving up to 100% of the bitrate for some very noisy videos.
The following are apparatus embodiments of this application; for details not exhaustively described in the apparatus embodiments, refer to the corresponding method embodiments above.
Refer to FIG. 6, which is a structural block diagram of a video denoising apparatus according to an embodiment of this application. The video denoising apparatus is implemented, in hardware or in a combination of software and hardware, as all or part of the server in FIG. 1, or as all or part of the terminal in FIG. 1. The apparatus includes: an obtaining unit 601, a judging unit 602, a determining unit 603, a selecting unit 604, and a computing unit 605.
The obtaining unit 601 is configured to obtain the current frame of a target video;
the judging unit 602 is configured to determine that the current frame is a P frame or a B frame in the target video;
the determining unit 603 is configured to determine the reference frame of the current frame from the target video according to the temporal reference relationship, established in advance by the encoder, between the current frame and reference frames;
the selecting unit 604 is configured to determine, from the reference frame, the reference block corresponding to the current block, the current block being any block in the current frame; and
the computing unit 605 is configured to denoise the current block based on the reference block.
In some embodiments, the selecting unit 604 is specifically configured to:
determine the matched block of the current block from the reference frame according to the motion vector of the current block in the encoder, the current frame having at least one reference frame, each reference frame containing one matched block of the current block;
determine the matching degree between each matched block and the current block; and
take a matched block whose matching degree is greater than the matching threshold as a reference block of the current block.
In some embodiments, the computing unit 605 is further configured to:
determine that the matching degrees between all matched blocks and the current block are less than or equal to the matching threshold; and
denoise the current block using a 2D denoising algorithm.
In some embodiments, the computing unit 605 is specifically configured to:
compute, according to a 3D denoising algorithm, the weight matrix of each reference block based on the matching degree between the reference block and the current block; and
perform a weighted summation using the pixels of the current block, the pixels of each reference block, and the corresponding weight matrices to obtain the denoised output pixels of the current block.
In some embodiments, the computing unit 605 is further configured to:
pre-denoise the current block using a 2D denoising algorithm.
In some embodiments, the judging unit 602 is further configured to determine that the current frame is an I frame in the target video;
and the computing unit is further configured to denoise the I frame using a 2D denoising algorithm.
In some embodiments, the selecting unit 604 is configured to:
reconstruct and restore the reference frame to obtain a reconstructed frame; and
determine the reference block of the current block from the reconstructed frame.
In some embodiments, the selecting unit 604 is further configured to:
determine, using a noise detection algorithm, that the internal variance of the current block is greater than the denoising threshold.
Refer to FIG. 7, which is a structural block diagram of a computing device according to an embodiment of this application. The computing device 700 may be implemented as the server or the terminal in FIG. 1. Specifically:
The device 700 includes a central processing unit (CPU) 701, a system memory 704 including a random access memory (RAM) 702 and a read-only memory (ROM) 703, and a system bus 705 connecting the system memory 704 and the central processing unit 701. The server 700 further includes a basic input/output system (I/O system) 706 that helps transfer information between components inside the computer, and a mass storage device 707 for storing an operating system 713, application programs 714, and other program modules 715.
The basic input/output system 706 includes a display 708 for displaying information and an input device 709, such as a mouse or keyboard, for user input. The display 708 and the input device 709 are both connected to the central processing unit 701 through an input/output controller 710 connected to the system bus 705. The basic input/output system 706 may further include the input/output controller 710 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 710 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) connected to the system bus 705. The mass storage device 707 and its associated computer-readable medium provide non-volatile storage for the server 700. That is, the mass storage device 707 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Certainly, a person skilled in the art will know that computer storage media are not limited to the above. The system memory 704 and the mass storage device 707 described above may be collectively referred to as the memory.
According to various embodiments of this application, the server 700 may further run by connecting, through a network such as the Internet, to a remote computer on the network. That is, the server 700 may be connected to a network 712 through a network interface unit 711 connected to the system bus 705; alternatively, the network interface unit 711 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, which are stored in the memory and contain instructions for performing the video denoising method provided in the embodiments of this application.
A person of ordinary skill in the art can understand that all or some of the steps of the video denoising method in the above embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The sequence numbers of the above embodiments of this application are merely for description and do not imply any preference among the embodiments.
The above are merely preferred embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (15)

  1. A video denoising method, performed by a computing device, the method comprising:
    obtaining a current frame of a target video;
    determining that the current frame is a P frame or a B frame in the target video;
    determining a reference frame of the current frame from the target video according to a temporal reference relationship, established in advance by an encoder, between the current frame and reference frames;
    determining, from the reference frame, a reference block corresponding to a current block, the current block being any block in the current frame; and
    denoising the current block based on the reference block.
  2. The method according to claim 1, wherein the determining, from the reference frame, a reference block corresponding to a current block comprises:
    determining a matched block of the current block from the reference frame according to a motion vector of the current block in the encoder, the current frame having at least one reference frame, each reference frame containing one matched block of the current block;
    determining a matching degree between each matched block and the current block; and
    taking a matched block whose matching degree is greater than a matching threshold as a reference block of the current block.
  3. The method according to claim 2, wherein after the determining a matching degree between each matched block and the current block, the method further comprises:
    denoising the current block using a two-dimensional denoising algorithm if it is determined that the matching degrees between all matched blocks and the current block are less than or equal to the matching threshold.
  4. The method according to claim 2, wherein the denoising the current block based on the reference block comprises:
    computing, according to a three-dimensional denoising algorithm, a weight matrix of each reference block based on the matching degree between the reference block and the current block; and
    performing a weighted summation using pixels of the current block, pixels of each reference block, and the corresponding weight matrices to obtain denoised output pixels of the current block.
  5. The method according to claim 4, wherein before the computing, according to a three-dimensional denoising algorithm, a weight matrix of each reference block based on the matching degree between the reference block and the current block, the method further comprises:
    pre-denoising the current block using a two-dimensional denoising algorithm.
  6. The method according to claim 4, wherein after the obtaining a current frame of a target video, the method further comprises:
    determining that the current frame is an I frame in the target video; and
    denoising the I frame using a 2D denoising algorithm.
  7. The method according to any one of claims 1 to 6, wherein the determining, from the reference frame, a reference block of the current block comprises:
    reconstructing and restoring the reference frame to obtain a reconstructed frame, and using the reconstructed frame as the updated reference frame; and
    determining the reference block of the current block from the reconstructed frame serving as the updated reference frame.
  8. The method according to any one of claims 1 to 6, wherein before the determining, from the reference frame, a reference block of the current block, the method further comprises:
    determining, using a noise detection algorithm, whether an internal variance of the current block is greater than a denoising threshold; and
    performing denoising on the current block if the internal variance of the current block is greater than the denoising threshold.
  9. A video denoising apparatus, the apparatus comprising:
    an obtaining unit, configured to obtain a current frame of a target video;
    a judging unit, configured to determine that the current frame is a P frame or a B frame in the target video;
    a determining unit, configured to determine a reference frame of the current frame from the target video according to a temporal reference relationship, established in advance by an encoder, between the current frame and reference frames;
    a selecting unit, configured to determine, from the reference frame, a reference block corresponding to a current block, the current block being any block in the current frame; and
    a computing unit, configured to denoise the current block based on the reference block.
  10. The video denoising apparatus according to claim 9, wherein the selecting unit is configured to:
    determine a matched block of the current block from the reference frame according to a motion vector of the current block in the encoder, the current frame having at least one reference frame, each reference frame containing one matched block of the current block;
    determine a matching degree between each matched block and the current block; and
    take a matched block whose matching degree is greater than a matching threshold as a reference block of the current block.
  11. The video denoising apparatus according to claim 10, wherein after the matching degree between each matched block and the current block is determined, if the selecting unit determines that the matching degrees between all matched blocks and the current block are less than or equal to the matching threshold, the computing unit is configured to denoise the current block using a two-dimensional denoising algorithm.
  12. The video denoising apparatus according to claim 10, wherein the computing unit is configured to:
    compute, according to a three-dimensional denoising algorithm, a weight matrix of each reference block based on the matching degree between the reference block and the current block; and
    perform a weighted summation using pixels of the current block, pixels of each reference block, and the corresponding weight matrices to obtain denoised output pixels of the current block.
  13. The video denoising apparatus according to claim 12, wherein the computing unit is further configured to pre-denoise the current block using a two-dimensional denoising algorithm.
  14. A computing device, comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, causes the processor to perform the video denoising method according to any one of claims 1 to 8.
  15. A computer-readable storage medium, storing computer instructions which, when run on a computer, cause the computer to perform the video denoising method according to any one of claims 1 to 8.
PCT/CN2020/119910 2019-12-09 2020-10-09 Video denoising method and apparatus, and storage medium WO2021114846A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/520,169 US20220058775A1 (en) 2019-12-09 2021-11-05 Video denoising method and apparatus, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911251050.6 2019-12-09
CN201911251050.6A CN111010495B (zh) Video denoising method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/520,169 Continuation US20220058775A1 (en) 2019-12-09 2021-11-05 Video denoising method and apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2021114846A1 true WO2021114846A1 (zh) 2021-06-17

Family

ID=70114157

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/119910 WO2021114846A1 (zh) 2019-12-09 2020-10-09 一种视频降噪处理方法、装置及存储介质

Country Status (3)

Country Link
US (1) US20220058775A1 (zh)
CN (1) CN111010495B (zh)
WO (1) WO2021114846A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339031A (zh) * 2021-12-06 2022-04-12 深圳市金九天视实业有限公司 Picture adjustment method, apparatus, device, and storage medium

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109873953A (zh) * 2019-03-06 2019-06-11 深圳市道通智能航空技术有限公司 Image processing method, night photography method, image processing chip, and aerial camera
CN111010495B (zh) * 2019-12-09 2023-03-14 腾讯科技(深圳)有限公司 Video denoising method and apparatus
CN111629262B (zh) * 2020-05-08 2022-04-12 Oppo广东移动通信有限公司 Video image processing method and apparatus, electronic device, and storage medium
CN111787319B (zh) * 2020-07-22 2021-09-14 腾讯科技(深圳)有限公司 Video information processing method, multimedia information processing method, and apparatus
CN113115075B (zh) * 2021-03-23 2023-05-26 广州虎牙科技有限公司 Method, apparatus, device, and storage medium for video quality enhancement
CN113852860A (zh) * 2021-09-26 2021-12-28 北京金山云网络技术有限公司 Video processing method, apparatus, system, and storage medium
CN114401405A (zh) * 2022-01-14 2022-04-26 安谋科技(中国)有限公司 Video encoding method, medium, and electronic device
CN116506732B (zh) * 2023-06-26 2023-12-05 浙江华诺康科技有限公司 Method, apparatus, system, and computer device for anti-shake image capture

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020176502A1 (en) * 1998-06-26 2002-11-28 Compaq Information Technologies Group, L.P. Method and system for distributed video compression in personal computer architecture
CN101262559A (zh) * 2008-03-28 2008-09-10 北京中星微电子有限公司 Method and apparatus for noise removal in image sequences
CN102355556A (zh) * 2011-11-02 2012-02-15 无锡博视芯半导体科技有限公司 Motion-estimation-based 3D denoising method for video and images
CN102769722A (zh) * 2012-07-20 2012-11-07 上海富瀚微电子有限公司 Video denoising apparatus and method combining the temporal and spatial domains
CN104023166A (zh) * 2014-06-20 2014-09-03 武汉烽火众智数字技术有限责任公司 Environment-adaptive video image denoising method and apparatus
US20180343448A1 (en) * 2017-05-23 2018-11-29 Intel Corporation Content adaptive motion compensated temporal filtering for denoising of noisy video for efficient coding
CN109963048A (zh) * 2017-12-14 2019-07-02 多方科技(广州)有限公司 Noise reduction method, noise reduction apparatus, and noise reduction circuit system
CN111010495A (zh) * 2019-12-09 2020-04-14 腾讯科技(深圳)有限公司 Video denoising method and apparatus

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070014365A1 (en) * 2005-07-18 2007-01-18 Macinnis Alexander Method and system for motion estimation
US8009732B2 (en) * 2006-09-01 2011-08-30 Seiko Epson Corporation In-loop noise reduction within an encoder framework
US8149336B2 (en) * 2008-05-07 2012-04-03 Honeywell International Inc. Method for digital noise reduction in low light video
CN101742288B (zh) * 2008-11-11 2013-03-27 北京中星微电子有限公司 Video denoising encoding method and apparatus
CN101742290B (zh) * 2008-11-12 2013-03-27 北京中星微电子有限公司 Denoising methods and apparatuses for video encoding and decoding
CN101540834B (zh) * 2009-04-16 2011-03-30 杭州华三通信技术有限公司 Method for removing video image noise and video encoding apparatus
JP2011139208A (ja) * 2009-12-28 2011-07-14 Sony Corp Image processing apparatus and method
US9041834B2 (en) * 2012-09-19 2015-05-26 Ziilabs Inc., Ltd. Systems and methods for reducing noise in video streams
CN103269412B (zh) * 2013-04-19 2017-03-08 华为技术有限公司 Video image denoising method and apparatus
CN105472205B (zh) * 2015-11-18 2020-01-24 腾讯科技(深圳)有限公司 Real-time video denoising method and apparatus in an encoding process
EP3364342A1 (en) * 2017-02-17 2018-08-22 Cogisen SRL Method for image processing and video compression
EP3379820B1 (en) * 2017-03-24 2020-01-15 Axis AB Controller, video camera, and method for controlling a video camera
WO2019050427A1 (en) * 2017-09-05 2019-03-14 Huawei Technologies Co., Ltd. EARLY TERMINATION OF IMAGE BLOCK MATCHING FOR COLLABORATIVE FILTERING
US10595019B2 (en) * 2017-09-20 2020-03-17 Futurewei Technologies, Inc. Noise suppression filter parameter estimation for video coding
CN110087077A (zh) * 2019-06-05 2019-08-02 广州酷狗计算机科技有限公司 Video encoding method and apparatus, and storage medium

Also Published As

Publication number Publication date
CN111010495B (zh) 2023-03-14
CN111010495A (zh) 2020-04-14
US20220058775A1 (en) 2022-02-24

Similar Documents

Publication Publication Date Title
WO2021114846A1 (zh) Video denoising method and apparatus, and storage medium
US10321138B2 Adaptive video processing of an interactive environment
US9071841B2 Video transcoding with dynamically modifiable spatial resolution
US11206405B2 Video encoding method and apparatus, video decoding method and apparatus, computer device, and storage medium
US10097821B2 Hybrid-resolution encoding and decoding method and a video apparatus using the same
US9414086B2 Partial frame utilization in video codecs
TWI613910B Method and encoder for image coding of a frame sequence
CN105472205B Real-time video denoising method and apparatus in an encoding process
CA2883133C A video encoding method and a video encoding apparatus using the same
US8243117B2 Processing aspects of a video scene
JP6761033B2 Motion vector prediction using the previous-frame residual
MX2007000810A Method and apparatus for encoder-assisted frame rate up conversion (EA-FRUC) for video compression
CN110324623B Bidirectional inter-frame prediction method and apparatus
KR20140110008A Encoding according to object detection information
CN109688407B Reference block selection method and apparatus for a coding unit, electronic device, and storage medium
CN109451310B Saliency-weighted rate-distortion optimization method and apparatus
CN104125466A GPU-based HEVC parallel decoding method
WO2017101350A1 Coding mode prediction method and apparatus with variable resolution
TW202218428A Image encoding method, image decoding method, and related apparatus
WO2018169571A1 Segmentation-based parameterized motion models
CN113259671B In-loop filtering method, apparatus, device, and storage medium in video coding and decoding
US20120195364A1 Dynamic mode search order control for a video encoder
CN108713318A Video frame processing method and device
CN101404773A DSP-based image encoding method
TWI411305B Dynamic reference frame selection method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20900471; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20900471; Country of ref document: EP; Kind code of ref document: A1)