WO2021098030A1 - Video encoding method and apparatus - Google Patents

Video encoding method and apparatus

Info

Publication number
WO2021098030A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
encoded
pixel
frame image
encoding
Prior art date
Application number
PCT/CN2020/070701
Other languages
English (en)
French (fr)
Inventor
郑振贵
黄学辉
林鹏程
Original Assignee
网宿科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网宿科技股份有限公司
Publication of WO2021098030A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124: Quantisation
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: the unit being an image region, e.g. an object
    • H04N19/176: the region being a block, e.g. a macroblock
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625: using discrete cosine transform [DCT]

Definitions

  • The present invention relates to the technical field of video processing, and in particular to a video encoding method and apparatus.
  • Existing video coding mainly includes intra-frame coding and inter-frame coding. During intra-frame coding, discrete cosine transform, quantization, and entropy coding can be performed on the frame image in sequence to obtain compressed image data.
  • During inter-frame coding, the motion vector between the frame image to be encoded and a reference frame image can be calculated, a predicted image is generated from the reference frame image and the motion vector, the predicted image is compared with the frame image to be encoded to generate a difference image, and the difference image is then processed by discrete cosine transform, quantization, and entropy coding in sequence to obtain compressed image data.
  • At present, when a frame image is quantized, it is often divided into multiple image blocks, and either all image blocks are quantized with the same quantization parameter, or the quantization parameter is selected directly according to the content complexity of each image block.
  • Encoding according to either of the above two quantization methods may cause secondary picture content to be coded too finely while key picture content cannot be coded more finely; that is, device encoding resources are not reasonably allocated, so the viewing experience of the encoded video picture is poor and the picture quality is low.
  • In a first aspect, a video encoding method includes: extracting a frame image to be encoded of a target video, and determining, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded; setting the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region; and encoding the frame image to be encoded based on the set quantization parameter corresponding to each image region.
  • In a second aspect, a video encoding apparatus includes:
  • an image analysis module, configured to extract a frame image to be encoded of a target video, and determine, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded;
  • a parameter setting module, configured to set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region;
  • a video encoding module, configured to encode the frame image to be encoded based on the set quantization parameter corresponding to each image region.
  • In a third aspect, a network device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the video encoding method described in the first aspect.
  • In a fourth aspect, a computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the video encoding method described in the first aspect.
  • In the embodiments of the present invention, the frame image to be encoded of the target video is extracted, and the encoding weight corresponding to each image region in the frame image to be encoded is determined through image analysis technology; the quantization parameter corresponding to each image region is set according to its encoding weight; and the frame image to be encoded is encoded based on the set quantization parameters. In this way, different quantization parameters can be set for different image regions of the same frame, so that different regions are encoded with different degrees of fineness and encoding resources are allocated more reasonably.
  • Figure 1 is a flowchart of a video encoding method provided by an embodiment of the present invention
  • FIG. 2 is a simple schematic diagram of a two-dimensional mask diagram provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of the principle of a cropping two-dimensional mask image provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present invention.
  • Fig. 6 is a schematic structural diagram of a network device provided by an embodiment of the present invention.
  • Step 101: Extract a frame image to be encoded of a target video, and determine, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded.
  • In implementation, after the network device obtains the original video data of the target video, it can first use a video processing tool such as ffmpeg to obtain the video frame sequence of the target video, and then sequentially extract the frame images of the target video as frame images to be encoded. After that, for each frame image to be encoded, the network device can analyze its image content through image analysis technology, so as to determine the encoding weight corresponding to each image region in the frame image to be encoded.
  • The encoding weight may be a numerical value used to indicate the encoding fineness of each image region; it can be set so that the larger the encoding weight, the higher the encoding fineness of the corresponding image region and the better the picture quality of that region after encoding.
  • The image regions in the frame image to be encoded may be divided after analysis by image analysis technology, or manually divided in advance by technicians, e.g. dividing the frame image to be encoded into 400 image regions according to a 20×20 grid; they may also be divided according to a specified region size, e.g. setting each image region equal to the size of 9 macroblocks.
  • the network device can determine the statistical feature value of each pixel in the frame image to be coded through image analysis technology.
  • the statistical feature value can be the initial analysis data directly obtained after the frame image to be coded is analyzed through image analysis technology.
  • Different image analysis techniques can be used to obtain statistical feature values of different dimensions.
  • The network device may then adopt the feature value selection rule for the dimension to which the statistical feature values belong, and determine the encoding weight of each pixel based on its statistical feature value. For example, a first image analysis technique yields statistical feature values of dimension A, whose selection rule is that the closer the value is to 1, the higher the encoding weight, with encoding weights ranging from 0 to 100; a second image analysis technique yields statistical feature values of dimension B, whose selection rule is that the lower the value, the higher the encoding weight, again ranging from 0 to 100.
  • the network device may calculate the encoding weight corresponding to each image area in the frame image to be encoded based on the encoding weights of all pixels in each image area according to a preset calculation formula.
  • For example, for any image region, the average of the encoding weights of all pixels in the region (arithmetic, geometric, quadratic, harmonic, or weighted mean, as appropriate) can be used as the encoding weight corresponding to that region.
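As a sketch of the averaging step just described (pure Python; the region size, the weight values, and the choice of arithmetic and geometric means here are illustrative, not mandated by the text):

```python
def region_weight(pixel_weights, method="arithmetic"):
    """Collapse the per-pixel encoding weights of one image region
    into a single region-level encoding weight."""
    flat = [w for row in pixel_weights for w in row]
    if method == "arithmetic":
        return sum(flat) / len(flat)
    if method == "geometric":
        prod = 1.0
        for w in flat:
            prod *= w
        return prod ** (1.0 / len(flat))
    raise ValueError("unsupported averaging method")

# A 2x3 region of per-pixel encoding weights (illustrative values).
region = [[10, 20, 30],
          [40, 50, 60]]
print(region_weight(region))  # arithmetic mean -> 35.0
```

Which mean is appropriate depends on the situation, as the text notes; a weighted mean, for instance, could emphasize pixels near the region center.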
  • Case 1: When the image analysis technology is salient target detection, the frame image to be encoded is input into a salient target detection model based on CAM (Class Activation Mapping) technology; the feature map generated by the second-to-last layer of the model is obtained, and the feature data corresponding to each pixel in the feature map is used as the statistical feature value of the corresponding pixel of the frame image to be encoded.
  • In implementation, when the network device uses salient target detection to determine the statistical feature values, it can input the frame image to be encoded into the salient target detection model based on CAM technology.
  • The salient target detection model based on CAM technology can be built on convolutional neural networks such as GoogLeNet, ResNet, or DenseNet, with a global average pooling layer replacing the fully connected layer of the network, which gives the salient target detection model the ability to localize targets.
  • Further, technicians can use an image set with labeled salient targets to perform personalized training of the salient target detection model to improve its detection accuracy.
  • Case 2: When the image analysis technology is the optical flow method, the optical flow value corresponding to each pixel of the frame image to be encoded is calculated based on the optical flow method, and the optical flow value is used as the statistical feature value of each pixel.
  • In implementation, the optical flow method uses the changes of pixels in the time domain across an image sequence and the correlation between adjacent frames to determine the correspondence between the previous frame image and the current frame image, thereby calculating the motion information of objects between adjacent frames. Therefore, the network device can compare the frame image to be encoded with its previous frame image, use the pixel changes between the two frames to calculate the optical flow value corresponding to each pixel of the frame image to be encoded, and use the optical flow value as the statistical feature value of each pixel.
  • Case 3: When the image analysis technology is texture analysis, the texture feature value of each pixel of the frame image to be encoded is calculated and used as the statistical feature value of each pixel. In implementation, texture is an inherent characteristic of an object's surface; it can be regarded as an external feature composed of grayscale or color varying in space with a certain regularity, and different regions in an image often have different textures. The network device can therefore calculate the texture feature value of each pixel of the frame image to be encoded by means of texture analysis and use it as the statistical feature value of each pixel.
  • Case 4: When the image analysis technology is target detection, the frame image to be encoded is input into a target detection model; according to the image content detection result output by the target detection model, a corresponding statistical feature value is set for the pixels of each image content in the frame image to be encoded.
  • a target detection model can be preset on the network device, and multiple things in an image can be located and classified separately through the target detection model.
  • When the network device analyzes the frame image to be encoded, it can input the frame image to be encoded into the above target detection model, so that the target detection model outputs an image content detection result marking the position and classification of multiple things (i.e., image contents) in the frame image to be encoded.
  • the network device can set corresponding statistical feature values for the pixels of each image content according to the position and classification of each image content in the frame image to be encoded.
  • the network device may assign the same preset statistical feature values to pixels in the same type of image content, and assign different preset statistical feature values to pixels in different types of image content.
  • It is worth mentioning that, to facilitate the subsequent calculation of encoding weights, the statistical feature value corresponding to each pixel can be normalized to restrict the values to the range [0, 1].
  • Specifically, assuming cam is an original statistical feature value, min is the minimum of the statistical feature values, and max is the maximum, the normalization can be: (cam-min)/max.
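A minimal sketch of this normalization, implementing the formula exactly as stated above (the sample values are illustrative; the formula as given assumes non-negative feature values with a non-zero maximum):

```python
def normalize(values):
    """Normalize statistical feature values using the formula given in
    the text: (cam - min) / max. For non-negative inputs this maps the
    values into the range [0, 1]."""
    lo = min(values)
    hi = max(values)
    return [(cam - lo) / hi for cam in values]

print(normalize([2.0, 4.0, 8.0]))  # -> [0.0, 0.25, 0.75]
```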
  • Optionally, the encoding weights corresponding to each image region can be calculated in units of macroblocks, and the corresponding processing can be as follows: for a target image region, determine all macroblocks contained in the target image region; use the average of the encoding weights of all pixels contained in each macroblock as the encoding weight of that macroblock; and record the encoding weights of all macroblocks in the target image region as the encoding weights corresponding to the target image region.
  • In implementation, in video coding a frame image can usually be divided into several macroblocks, and encoding proceeds macroblock by macroblock so that the result can be organized into a continuous video stream. Therefore, when calculating the encoding weights corresponding to the target image region, the network device may divide the target image region into multiple macroblocks and calculate the average of the encoding weights of all pixels contained in each macroblock. The network device may then use that average as each macroblock's encoding weight, and record the encoding weights of all macroblocks in the target image region together as the encoding weights corresponding to the target image region.
  • When a region is subdivided to macroblock granularity, one image region can correspond to multiple encoding weights at the same time, so when quantization parameters are subsequently set, different quantization parameters can be set within one image region in units of macroblocks. In this way, the fineness of video frame image encoding can be further improved, and the picture quality of the encoded video can be improved.
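The macroblock-level averaging described above can be sketched as follows (pure Python; real macroblocks are typically 16×16 pixels, but the toy image below uses 2×2 blocks, and the image size is assumed divisible by the block size for brevity):

```python
def macroblock_weights(pixel_weights, mb_size=16):
    """Average the per-pixel encoding weights inside each
    mb_size x mb_size macroblock; returns a 2D list holding one
    encoding weight per macroblock."""
    h = len(pixel_weights)
    w = len(pixel_weights[0])
    mbs = []
    for by in range(0, h, mb_size):
        row = []
        for bx in range(0, w, mb_size):
            block = [pixel_weights[y][x]
                     for y in range(by, by + mb_size)
                     for x in range(bx, bx + mb_size)]
            row.append(sum(block) / len(block))
        mbs.append(row)
    return mbs

# A 2x4 "image" split into 2x2 macroblocks (toy sizes for illustration).
weights = [[1, 1, 3, 3],
           [1, 1, 3, 3]]
print(macroblock_weights(weights, mb_size=2))  # -> [[1.0, 3.0]]
```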
  • Optionally, after the network device determines the encoding weight of each pixel in the frame image to be encoded, it can first construct a two-dimensional mask map with the same resolution as the frame image to be encoded; the number and arrangement of pixels in the two-dimensional mask map are consistent with the frame image to be encoded, and its pixels correspond one-to-one to the pixels of the frame image to be encoded.
  • The network device can then use the two-dimensional mask map to record the encoding weight of each pixel in the frame image to be encoded; specifically, at the position corresponding to each pixel in the two-dimensional mask map, it records the encoding weight of the corresponding pixel in the frame image to be encoded.
  • In the schematic diagram of the two-dimensional mask map, the left side is the frame image to be encoded, which contains 64 pixels, and the right side is the corresponding two-dimensional mask map, which records 64 encoding weights (the specific values are for illustration only, not actual values).
  • Using a two-dimensional mask map to record the encoding weights allows the data recording to be completed accurately and methodically, and facilitates subsequent manual review and inspection of the encoding weights.
  • Optionally, after the network device uses the two-dimensional mask map to record the encoding weight of each pixel in the frame image to be encoded, it can crop the two-dimensional mask map according to the encoding weight recorded at each pixel and the aforementioned preset value standard, so as to retain all target pixels in the two-dimensional mask map whose encoding weights meet the preset value standard, while recording the position information of the target pixels.
  • For the pixels that are cropped out, a default value can be set as their corresponding encoding weight.
  • In implementation, the two-dimensional mask map can first be traversed to find contiguous areas in which the number of target pixels is below a certain value and whose area exceeds a predetermined threshold, and these contiguous areas are determined as the areas to be cropped. Further, when recording the position information of the target pixels, for a target area composed of a large number of contiguous target pixels, it is possible to record only the position information of the first and last target pixels of that area, instead of the position information of all target pixels.
  • The two-dimensional mask map may contain multiple areas to be cropped, and the cropped two-dimensional mask map contains a large number of target pixels and a small number of pixels with an encoding weight of zero. In this way, cropping the two-dimensional mask map can compress the intermediate data of the encoding process to a certain extent, save data storage space, and reduce the amount of data transmitted within the network device during video encoding.
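One way to realize the run-style position recording described above is a simple run-length scan; the row-wise scan order, the threshold test, and the function name are assumptions for illustration, since the text does not fix them:

```python
def crop_mask_row(weights, threshold):
    """Scan one row of the 2D mask map and record contiguous runs of
    target pixels (weight >= threshold) as (start, end, values) tuples,
    so only run endpoints need to be stored rather than every target
    pixel position individually."""
    runs = []
    start = None
    for i, w in enumerate(weights):
        if w >= threshold:
            if start is None:
                start = i  # a new run of target pixels begins
        elif start is not None:
            runs.append((start, i - 1, weights[start:i]))
            start = None
    if start is not None:  # run extends to the end of the row
        runs.append((start, len(weights) - 1, weights[start:]))
    return runs

row = [0, 0, 70, 80, 90, 0, 0, 60, 0]
print(crop_mask_row(row, threshold=50))
# -> [(2, 4, [70, 80, 90]), (7, 7, [60])]
```

Pixels outside the recorded runs would then receive the default encoding weight mentioned above.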
  • Step 102: Set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region.
  • In implementation, after the network device determines the encoding weight corresponding to each image region in the frame image to be encoded, it can set the quantization parameter corresponding to each image region according to that encoding weight.
  • The quantization parameter (QP for short) can be used to reflect the degree of compression of image details.
  • When the QP value is small, most image details are preserved; when the QP value is large, some image details are lost, so the encoded image is more distorted and the picture quality degrades.
  • For an image region with a larger encoding weight, the required encoding fineness is higher, and the quantization parameter corresponding to that region can be reduced appropriately; for an image region with a smaller encoding weight, the required encoding fineness is lower, and the quantization parameter corresponding to that region can be increased appropriately.
  • Optionally, the processing of step 102 may be specifically as follows: for a target macroblock contained in the target image region, calculate the quantization parameter corresponding to the target macroblock according to the preset quantization parameter fluctuation range and the encoding weight of the target macroblock.
  • the target macroblock may be any macroblock contained in the target image area.
  • In implementation, after the network device records the encoding weights of all macroblocks contained in each image region of the frame image to be encoded, it can calculate the quantization parameter corresponding to each macroblock according to the preset quantization parameter fluctuation range and the encoding weight of each macroblock.
  • the fluctuation range of the quantization parameter may be preset by the technician according to the image quality requirements of the video frame image, and configured in the network device. Specifically, by setting the maximum and minimum values of different quantization parameter fluctuation ranges, the average image quality of the encoded video frame image and the adjustment of the difference in fineness of each image region can be realized.
  • Specifically, the larger the encoding weight of a macroblock, the smaller its corresponding quantization parameter QP; when the encoding weight is largest, the macroblock corresponds to the smallest quantization parameter, QP_min. Conversely, the smaller the encoding weight, the larger the corresponding QP; when the encoding weight is smallest, the macroblock corresponds to the largest quantization parameter, QP_max.
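The text fixes only the endpoints of this mapping (largest weight to QP_min, smallest weight to QP_max); a linear interpolation between them is one plausible realization, sketched below (the linear form and the example QP range are assumptions, not the patent's formula):

```python
def weight_to_qp(weight, w_min, w_max, qp_min, qp_max):
    """Map a macroblock's encoding weight to a quantization parameter:
    the largest weight gets qp_min (finest coding), the smallest gets
    qp_max (coarsest coding), with linear interpolation in between."""
    if w_max == w_min:
        return (qp_min + qp_max) // 2  # degenerate case: midpoint
    t = (weight - w_min) / (w_max - w_min)        # 0 .. 1
    return round(qp_max - t * (qp_max - qp_min))  # high weight -> low QP

# Weights in 0..100 mapped into a preset QP fluctuation range [22, 40].
for w in (0, 50, 100):
    print(w, weight_to_qp(w, 0, 100, 22, 40))
# -> 0 40, 50 31, 100 22
```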
  • Step 103: Based on the set quantization parameter corresponding to each image region, encode the frame image to be encoded.
  • In implementation, quantization can be expressed as q(x,y) = round(F(x,y)/Q), where F(x,y) is the DCT coefficient obtained after the discrete cosine transform; Q is the quantization step size, which has a fixed correspondence with the quantization parameter and increases as the quantization parameter increases; round() is the rounding function; and q(x,y) is the value obtained after quantization.
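The formula q(x,y) = round(F(x,y)/Q) can be sketched on a small block of DCT coefficients; the QP-to-step mapping below borrows the H.264-style rule of thumb that the step size roughly doubles every 6 QP, which is an illustrative choice rather than anything the text specifies:

```python
def qp_to_step(qp):
    """Illustrative QP -> quantization step mapping, H.264-style:
    the step size roughly doubles every 6 QP (about 0.625 at QP 0)."""
    return 0.625 * 2 ** (qp / 6)

def quantize_block(dct_coeffs, qp):
    """Apply q(x, y) = round(F(x, y) / Q) element-wise to a block of
    DCT coefficients F; a larger QP discards more detail."""
    q = qp_to_step(qp)
    return [[round(f / q) for f in row] for row in dct_coeffs]

block = [[80.0, -16.0],
         [12.0, 4.0]]
print(quantize_block(block, qp=16))
```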
  • Optionally, the processing of step 103 may be specifically as follows: based on the calculated quantization parameter corresponding to each macroblock in each image region, the frame image to be encoded is encoded.
  • In implementation, after the network device has calculated the quantization parameters corresponding to all macroblocks in each image region of the frame image to be encoded, it can determine the quantization step size corresponding to each macroblock by referring to the correspondence between quantization parameter and quantization step size. Then, according to the quantization step size corresponding to each macroblock, it completes the quantization of each macroblock contained in each image region, and then performs subsequent scanning, entropy coding, and encapsulation to realize the encoding of the frame image to be encoded. In this way, quantizing multiple macroblocks to different degrees with different quantization step sizes can further improve the fineness of video frame image encoding and the quality of the encoded video picture.
  • In the embodiments of the present invention, the frame image to be encoded of the target video is extracted, and the encoding weight corresponding to each image region is determined through image analysis technology; the quantization parameter corresponding to each image region is set according to its encoding weight; and the frame image to be encoded is encoded based on the set quantization parameters. In this way, different quantization parameters are set for different image regions of the same video frame image, so that the regions can be encoded with different degrees of fineness: the encoding quality of key picture content is improved while that of secondary content is reduced to some extent, encoding resources are allocated more reasonably, and the overall picture quality and viewing experience of the encoded video are improved.
  • An embodiment of the present invention also provides a video encoding apparatus. As shown in FIG. 4, the apparatus includes:
  • an image analysis module 401, configured to extract a frame image to be encoded of a target video, and determine, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded;
  • a parameter setting module 402, configured to set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region;
  • a video encoding module 403, configured to encode the frame image to be encoded based on the set quantization parameter corresponding to each image region.
  • Optionally, the image analysis module 401 is specifically configured to: determine the statistical feature value of each pixel of the frame image to be encoded through image analysis technology; determine the encoding weight of each pixel based on the statistical feature value of each pixel and a preset feature value selection rule; and calculate the encoding weight corresponding to each image region in the frame image to be encoded according to the encoding weights of the pixels.
  • the device further includes:
  • The weight recording module 404 is configured to construct a two-dimensional mask map with the same resolution as the frame image to be encoded, and to record, at the position corresponding to each pixel in the two-dimensional mask map, the encoding weight of the corresponding pixel of the frame image to be encoded.
  • Fig. 6 is a schematic structural diagram of a network device provided by an embodiment of the present invention.
  • The network device 600 may vary considerably in configuration or performance, and may include one or more central processing units 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (for example, one or more mass storage devices) storing application programs 642 or data 644.
  • the memory 632 and the storage medium 630 may be short-term storage or persistent storage.
  • the program stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the network device 600.
  • the central processing unit 622 may be configured to communicate with the storage medium 630, and execute a series of instruction operations in the storage medium 630 on the network device 600.
  • the network device 600 may also include one or more power supplies 629, one or more wired or wireless network interfaces 650, one or more input and output interfaces 658, one or more keyboards 656, and/or, one or more operating systems 641, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • Specifically, the network device 600 may include a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs contain instructions for performing the above video encoding method.


Abstract

The present invention discloses a video encoding method and apparatus, belonging to the technical field of video processing. The method includes: extracting a frame image to be encoded of a target video, and determining, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded; setting the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region; and encoding the frame image to be encoded based on the set quantization parameter corresponding to each image region. With the present invention, encoding resources can be allocated more reasonably, and the picture quality of the encoded video and the viewing experience can be improved overall.

Description

Video encoding method and apparatus
Technical Field
The present invention relates to the technical field of video processing, and in particular to a video encoding method and apparatus.
Background
After an image acquisition device such as a camera captures image information, it can generate raw video data, which consists of a large number of sequentially arranged frame images and is very large in volume. To facilitate video transmission and storage, the raw video data is often first compressed using video encoding technology to remove redundant information, and the encoded video data is then transmitted or stored.
Existing video encoding mainly includes intra-frame encoding and inter-frame encoding. In intra-frame encoding, discrete cosine transform, quantization, entropy coding, and other processing can be performed on the frame image in sequence to obtain compressed image data. In inter-frame encoding, the motion vector between the frame image to be encoded and a reference frame image can be calculated, a predicted image is generated from the reference frame image and the motion vector, the predicted image is compared with the frame image to be encoded to generate a difference image, and the difference image is then processed by discrete cosine transform, quantization, and entropy coding in sequence to obtain compressed image data.
In the process of implementing the present invention, the inventors found that the prior art has at least the following problems:
At present, when a frame image is quantized, it is often divided into multiple image blocks and either all blocks are quantized with the same quantization parameter, or the quantization parameter is selected directly according to the content complexity of each block. However, encoding according to either of these two quantization methods may cause secondary picture content to be encoded too finely while key picture content cannot be encoded more finely; that is, device encoding resources are not reasonably allocated, so the viewing experience of the encoded video picture is poor and the picture quality is low.
Summary of the Invention
To solve the problems of the prior art, embodiments of the present invention provide a video encoding method and apparatus. The technical solutions are as follows:
In a first aspect, a video encoding method is provided, the method including:
extracting a frame image to be encoded of a target video, and determining, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded;
setting the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region;
encoding the frame image to be encoded based on the set quantization parameter corresponding to each image region.
In a second aspect, a video encoding apparatus is provided, the apparatus including:
an image analysis module, configured to extract a frame image to be encoded of a target video, and determine, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded;
a parameter setting module, configured to set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region;
a video encoding module, configured to encode the frame image to be encoded based on the set quantization parameter corresponding to each image region.
In a third aspect, a network device is provided, including a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the video encoding method described in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the video encoding method described in the first aspect.
The beneficial effects of the technical solutions provided by the embodiments of the present invention are as follows:
In the embodiments of the present invention, a frame image to be encoded of a target video is extracted, and the encoding weight corresponding to each image region in the frame image to be encoded is determined through image analysis technology; the quantization parameter corresponding to each image region is set according to its encoding weight; and the frame image to be encoded is encoded based on the set quantization parameters. In this way, different quantization parameters are set for different image regions of the same video frame image, so that multiple image regions can be encoded with different degrees of fineness. The encoding quality of key picture content can therefore be improved while that of secondary picture content is reduced to some extent, so that encoding resources are allocated more reasonably and the overall picture quality of the encoded video and the viewing experience are improved.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flowchart of a video encoding method provided by an embodiment of the present invention;
Figure 2 is a simple schematic diagram of a two-dimensional mask map provided by an embodiment of the present invention;
Figure 3 is a schematic diagram of the principle of cropping a two-dimensional mask map provided by an embodiment of the present invention;
Figure 4 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present invention;
Figure 5 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present invention;
Figure 6 is a schematic structural diagram of a network device provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
An embodiment of the present invention provides a video encoding method, which may be executed by any network device with video frame image processing capability. The network device can perform image analysis on acquired video frame images and encode them based on the analysis results, converting the original analog data into digital data to facilitate transmission and storage of video data. During encoding, the network device can encode different image regions within the same video frame image with different degrees of fineness, so that different image regions within the same frame can have different levels of picture quality. The network device may include a processor, a memory, and a transceiver: the processor may be used to perform the video encoding processing in the following flow, the memory may be used to store the data needed and generated in the following processing, and the transceiver may be used to receive and send relevant data in the following processing.
The processing flow shown in Figure 1 is described in detail below with reference to specific embodiments, and the content may be as follows:
Step 101: Extract a frame image to be encoded of a target video, and determine, through image analysis technology, the encoding weight corresponding to each image region in the frame image to be encoded.
In implementation, after the network device obtains the original video data of the target video, it can first use a video processing tool such as ffmpeg to obtain the video frame sequence of the target video, and then sequentially extract the frame images of the target video as frame images to be encoded according to the video frame sequence. After that, for each frame image to be encoded, the network device can analyze its image content through image analysis technology to determine the encoding weight corresponding to each image region in the frame image to be encoded. Here, the encoding weight may be a numerical value used to indicate the encoding fineness of each image region; it can be set so that the larger the encoding weight, the higher the encoding fineness of the corresponding image region and the better the picture quality of that region after encoding. It is worth mentioning that the image regions in the frame image to be encoded may be divided after analysis by image analysis technology, or manually divided in advance by technicians, e.g. dividing the frame image to be encoded into 400 image regions according to a 20×20 grid; they may also be divided according to a specified region size, e.g. setting each image region equal to the size of 9 macroblocks.
Optionally, the encoding weight of an image region can be calculated from the encoding weights of all pixels contained in the region. Accordingly, the processing of step 101 may be as follows: determine the statistical feature value of each pixel of the frame image to be encoded through image analysis technology; determine the encoding weight of each pixel based on its statistical feature value and a preset feature value selection rule; and calculate the encoding weight corresponding to each image region in the frame image to be encoded according to the encoding weights of the pixels.
In implementation, the network device can determine the statistical feature value of each pixel in the frame image to be encoded through image analysis technology. The statistical feature value may be the initial analysis data obtained directly after the frame image to be encoded is analyzed; different image analysis technologies yield statistical feature values of different dimensions. The network device can then adopt the feature value selection rule for the dimension to which the statistical feature values belong, and determine the encoding weight of each pixel based on its statistical feature value. For example, a first image analysis technique yields statistical feature values of dimension A, whose selection rule is that the closer the value is to 1, the higher the encoding weight, with encoding weights ranging from 0 to 100; a second image analysis technique yields statistical feature values of dimension B, whose selection rule is that the lower the value, the higher the encoding weight, again ranging from 0 to 100. The network device can then calculate the encoding weight corresponding to each image region in the frame image to be encoded from the encoding weights of all pixels in the region according to a preset calculation formula. For example, for any image region, the average of the encoding weights of all pixels in the region (arithmetic, geometric, quadratic, harmonic, or weighted mean, as appropriate) can be used as the encoding weight corresponding to that region.
Optionally, based on different image analysis technologies, different data can be used as the statistical feature values of pixels. Several cases are given below as examples:
Case 1: When the image analysis technology is salient target detection, the frame image to be encoded is input into a salient target detection model based on CAM (Class Activation Mapping) technology; the feature map generated by the second-to-last layer of the model is obtained, and the feature data corresponding to each pixel in the feature map is used as the statistical feature value of the corresponding pixel of the frame image to be encoded.
In implementation, when the network device uses salient target detection to determine the statistical feature values of the pixels of the frame image to be encoded, it can input the frame image into the salient target detection model based on CAM technology. The salient target detection model based on CAM technology can be built on convolutional neural networks such as GoogLeNet, ResNet, or DenseNet, with a global average pooling layer replacing the fully connected layer of the network, which gives the model the ability to localize targets. Further, technicians can use an image set with labeled salient targets to perform personalized training of the model to improve its detection accuracy. The network device can then obtain the feature map generated by the second-to-last layer (i.e., the global average pooling layer) of the salient target detection model, and use the feature data corresponding to each pixel in the feature map as the statistical feature value of the corresponding pixel of the frame image to be encoded.
Case 2: when the image analysis technique is the optical flow method, the optical flow value corresponding to each pixel of the to-be-encoded frame image is calculated based on the optical flow method, and the optical flow value is used as the statistical feature value of each pixel.
In implementation, the optical flow method is a method that uses the change of pixels in an image sequence in the time domain and the correlation between adjacent frame images to determine the correspondence between the previous frame image and the current frame image, thereby calculating the motion information of objects between adjacent frame images. Therefore, the network device may compare the to-be-encoded frame image with its previous frame image and use the change of pixels between the two frame images to calculate the optical flow value corresponding to each pixel of the to-be-encoded frame image, using the optical flow value as the statistical feature value of each pixel.
Case 3: when the image analysis technique is texture analysis, the texture feature value of each pixel of the to-be-encoded frame image is calculated, and the texture feature value is used as the statistical feature value of each pixel.
In implementation, texture is an inherent property of an object's surface, and may be regarded as an external feature formed by gray levels or colors varying in space according to certain rules; different regions of an image often have different textures. Therefore, the network device may calculate the texture feature value of each pixel of the to-be-encoded frame image by means of texture analysis, and use the texture feature value as the statistical feature value of each pixel.
Case 4: when the image analysis technique is object detection, the to-be-encoded frame image is input into an object detection model; and according to the image content detection result output by the object detection model, corresponding statistical feature values are set for the pixels of each piece of image content in the to-be-encoded frame image.
In implementation, an object detection model may be preset on the network device, and multiple objects in an image can be respectively located and classified through the object detection model. In this way, when analyzing the to-be-encoded frame image, the network device may input the to-be-encoded frame image into the above object detection model, and the object detection model may output an image content detection result, in which the positions and classes of the multiple objects (i.e., image contents) in the to-be-encoded frame image may be marked. Then, the network device may set corresponding statistical feature values for the pixels of each piece of image content according to the position and class of each piece of image content in the to-be-encoded frame image. Specifically, the network device may assign the same preset statistical feature value to the pixels of image contents of the same class, and assign different preset statistical feature values to the pixels of image contents of different classes.
It is worth mentioning that, to facilitate the subsequent calculation of encoding weights, the statistical feature value of each pixel may be normalized so that the data values are limited to the range [0, 1]. Specifically, assuming that cam is the original statistical feature value, min is the minimum of the statistical feature values, and max is the maximum of the statistical feature values, the normalization may be: (cam-min)/max. Further, to reduce the amount of data processing during image analysis, the to-be-encoded frame image may be preprocessed first: for example, the to-be-encoded frame image may first be scaled to a fixed resolution, and the image data may then be normalized as follows, where "image[channel]" is the image data of each channel (an image generally contains the three channels R, G, and B), and "mean[channel]" and "std[channel]" are respectively the mean and standard deviation of the image data of each channel set according to empirical values; for the three RGB channels, the values may respectively be mean=[0.485,0.456,0.406] and std=[0.229,0.224,0.225], and the normalization may be: (image[channel]-mean[channel])/std[channel].
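The two normalizations above can be sketched as follows. This is an illustrative sketch only: the (cam-min)/max formula is implemented exactly as the text states it (which assumes non-negative feature values), and the mean/std constants are the empirical RGB values quoted above.

```python
import numpy as np

def normalize_cam(cam: np.ndarray) -> np.ndarray:
    """Normalize raw statistical feature values as (cam - min) / max,
    following the formula stated in the text."""
    return (cam - cam.min()) / cam.max()

# Per-channel preprocessing with the empirical RGB mean/std quoted above.
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def normalize_image(image: np.ndarray) -> np.ndarray:
    """image: H x W x 3 array scaled to [0, 1];
    returns (image[channel] - mean[channel]) / std[channel]."""
    return (image - MEAN) / STD

cam = np.array([0.0, 1.0, 4.0])
norm = normalize_cam(cam)  # (cam - 0) / 4
```
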
Optionally, the encoding weight corresponding to each image region may be calculated in units of macroblocks, and the corresponding processing may be as follows: for a target image region, determining all the macroblocks contained in the target image region; using the average of the encoding weights of all the pixels contained in each macroblock as the encoding weight of the macroblock; and recording the encoding weights of all the macroblocks in the target image region as the encoding weights corresponding to the target image region.
The target image region may be any image region in the to-be-encoded frame image.
In implementation, in video encoding a frame image can usually be divided into a number of macroblocks, and encoding proceeds macroblock by macroblock, so that a continuous video bitstream can be organized. Therefore, when calculating the encoding weights corresponding to the target image region, the network device may divide the target image region into multiple macroblocks and then calculate the average of the encoding weights of all the pixels contained in each macroblock. The network device may then use that average as the encoding weight of each macroblock, and record the encoding weights of all the macroblocks in the target image region together as the encoding weights corresponding to the target image region. It can be understood that, when an image region is subdivided to macroblock granularity, one image region can correspond to multiple encoding weights at the same time, so that when the quantization parameters are subsequently set, different quantization parameters can be set within one image region in units of macroblocks, which further improves the fineness of video frame image encoding and the encoded video picture quality.
Optionally, the encoding weight of each pixel may be recorded in the form of a two-dimensional mask map, and the following processing may correspondingly exist: constructing a two-dimensional mask map with the same resolution as the to-be-encoded frame image, and recording the encoding weight of each pixel of the to-be-encoded frame image at the position corresponding to each pixel in the two-dimensional mask map.
In implementation, after determining the encoding weight of each pixel in the to-be-encoded frame image, the network device may first construct a two-dimensional mask map with the same resolution as the to-be-encoded frame image; the number and arrangement of pixels in the two-dimensional mask map are the same as those of the to-be-encoded frame image, and its pixels correspond one-to-one to the pixels in the to-be-encoded frame image. Then, the network device may use the two-dimensional mask map to record the encoding weight of each pixel in the to-be-encoded frame image; specifically, the encoding weight of the corresponding pixel of the to-be-encoded frame image may be recorded at the position corresponding to each pixel in the two-dimensional mask map. As shown in Fig. 2, the left side is the to-be-encoded frame image, containing 64 pixels, and the right side is the corresponding two-dimensional mask map, recording 64 encoding weights (the specific values are for illustration only and are not actual values). Using a two-dimensional mask map to record the encoding weights in this way allows the data to be recorded accurately and in an organized manner, and facilitates subsequent manual review and verification of the encoding weights.
Optionally, the two-dimensional mask map may be cropped to compress the data, and the corresponding processing may be as follows: cropping the two-dimensional mask map according to the encoding weight recorded at each pixel in the two-dimensional mask map; retaining all target pixels in the two-dimensional mask map whose encoding weights satisfy a preset value criterion, and recording the position information of the target pixels.
In implementation, a technician may set a preset value criterion for the encoding weights in advance. When the encoding weight of a pixel does not satisfy the preset value criterion, a uniform default value may be used as the encoding weight of the pixel; therefore, even if the encoding weights of these pixels are not recorded, the subsequent processing of determining the corresponding quantization parameters is not affected. Based on the above setting, after recording the encoding weight of each pixel in the to-be-encoded frame image with the two-dimensional mask map, the network device may crop the two-dimensional mask map according to the encoding weights recorded at the pixels in the two-dimensional mask map and the above preset value criterion, so as to retain all the target pixels in the two-dimensional mask map whose encoding weights satisfy the preset value criterion, and may record the position information of the target pixels. In this way, in the subsequent process of determining the quantization parameters, the default value may be set as the corresponding encoding weight for the cropped-out pixels. Specifically, in the process of cropping the encoding weights, the two-dimensional mask map may first be traversed to determine contiguous regions that contain fewer target pixels than a certain number and whose area is larger than a preset threshold, and these contiguous regions may then be determined as regions to be cropped. Further, when recording the position information of the target pixels, for a target region composed of a large number of target pixels, only the position information of the first and the last target pixel of the target region may be recorded, without recording the position information of all the target pixels. For example, as shown in Fig. 3, if the preset value criterion is that the value is non-zero and its difference from zero is greater than a preset value, the two-dimensional mask map may contain multiple regions to be cropped, and the cropped two-dimensional mask map contains a large number of target pixels and a small number of pixels with encoding weights of zero. Cropping the two-dimensional mask map in this way can compress the intermediate data of the encoding process to a certain extent, save data storage space, and reduce the amount of internal data transfer in the network device during video encoding.
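Recording only the first and the last position of each contiguous run of retained target pixels amounts to run-length encoding of the mask positions. A minimal sketch over a flattened (row-major) mask follows; the simple "weight > threshold" test is an assumed stand-in for the preset value criterion.

```python
def target_pixel_runs(weights, threshold=0.0):
    """Return (start, end) index pairs (inclusive) of contiguous runs of
    pixels whose encoding weight satisfies the criterion (here: > threshold).
    weights is a flattened, row-major list of mask values."""
    runs, start = [], None
    for i, w in enumerate(weights):
        if w > threshold and start is None:
            start = i  # a run of target pixels begins
        elif w <= threshold and start is not None:
            runs.append((start, i - 1))  # run ended at the previous index
            start = None
    if start is not None:  # run extends to the end of the mask
        runs.append((start, len(weights) - 1))
    return runs

runs = target_pixel_runs([0, 0.8, 0.9, 0, 0, 0.5, 0.6, 0.7])
```

Only the run endpoints need to be stored; every pixel outside a run takes the uniform default weight during decoding.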
Step 102: set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region.
In implementation, after determining the encoding weight corresponding to each image region in the to-be-encoded frame image, the network device may set the quantization parameter corresponding to each image region according to that encoding weight. The quantization parameter (QP for short) may reflect the compression of image details: when the QP value is small, most image details are retained; when the QP value is large, some image details are lost, the distortion of the encoded image is higher, and the picture quality degrades. Therefore, for an image region with a larger encoding weight, which requires higher encoding fineness, the quantization parameter corresponding to the image region may be appropriately reduced; for an image region with a smaller encoding weight, which requires lower encoding fineness, the quantization parameter corresponding to the image region may be appropriately increased.
Optionally, based on the above case where one image region corresponds to the encoding weights of multiple macroblocks at the same time, the processing of step 102 may specifically be as follows: for a target macroblock contained in the target image region, calculating the quantization parameter corresponding to the target macroblock according to a preset quantization parameter fluctuation range and the encoding weight of the target macroblock.
The target macroblock may be any macroblock contained in the target image region.
In implementation, after recording the encoding weights of all the macroblocks contained in each image region of the to-be-encoded frame image, the network device may calculate the quantization parameter corresponding to each macroblock according to the preset quantization parameter fluctuation range and the encoding weight of each macroblock. Here, the quantization parameter fluctuation range may be preset by a technician according to the picture quality requirements of the video frame images and configured in the network device. Specifically, by setting different maximum and minimum values of the quantization parameter fluctuation range, the average picture quality of the encoded video frame images and the difference in fineness between image regions can be adjusted: if a higher average picture quality is required, the maximum of the quantization parameter fluctuation range may be appropriately reduced; if a smaller difference in fineness between image regions is required, the difference between the maximum and the minimum of the quantization parameter fluctuation range may be appropriately reduced. Taking a target macroblock contained in the target image region as an example, let QP_i denote the quantization parameter corresponding to the target macroblock, QP_max and QP_min denote the maximum and minimum of the quantization parameter fluctuation range, and B_i denote the encoding weight of the target macroblock (ranging from 0 to 1); then the following formula holds:
QP_i = QP_min + (QP_max - QP_min) × (1 - B_i)
It can be seen from the above formula that the larger the encoding weight of a macroblock, the smaller the corresponding quantization parameter QP; when the encoding weight is 1, the macroblock corresponds to the minimum quantization parameter, i.e., QP_min. Conversely, the smaller the encoding weight of a macroblock, the larger the corresponding quantization parameter QP; when the encoding weight is 0, the macroblock corresponds to the maximum quantization parameter, i.e., QP_max.
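The per-macroblock QP formula can be sketched directly. The bounds 18 and 42 below are hypothetical values for the quantization parameter fluctuation range, chosen only for illustration; the embodiment leaves them to the technician.

```python
def macroblock_qp(b: float, qp_min: int = 18, qp_max: int = 42) -> float:
    """QP_i = QP_min + (QP_max - QP_min) * (1 - B_i), with the encoding
    weight b in [0, 1]; larger weights yield smaller (finer) QPs."""
    if not 0.0 <= b <= 1.0:
        raise ValueError("encoding weight must lie in [0, 1]")
    return qp_min + (qp_max - qp_min) * (1.0 - b)

qp_key = macroblock_qp(1.0)         # most important macroblock -> QP_min
qp_background = macroblock_qp(0.0)  # least important macroblock -> QP_max
```

Narrowing the gap between `qp_min` and `qp_max` reduces the fineness difference between regions, as described above.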
Step 103: encode the to-be-encoded frame image based on the set quantization parameter corresponding to each image region.
In implementation, after setting the quantization parameter corresponding to each image region in the to-be-encoded image, the network device may encode the to-be-encoded frame image based on those quantization parameters. Specifically, the network device may first perform a discrete cosine transform on the to-be-encoded frame image to transform the image data in the spatial domain into DCT coefficients in the frequency domain. Then, the network device may perform quantization based on the transformed DCT coefficients of the to-be-encoded frame image; the formula may be: q(x,y)=round(F(x,y)/Q+0.5), where F(x,y) is the DCT coefficient obtained after the discrete cosine transform; Q is the quantization step size, which has a certain correspondence with the quantization parameter and increases as the quantization parameter increases; round() is the round-half-up rounding function; and q(x,y) is the value obtained after quantization. For example, if the DCT coefficient of a certain pixel after the discrete cosine transform is 203 and the quantization step size Q is 28, then q(x,y)=round(203/28+0.5)=round(7.75)=8. The network device may then perform data scanning on the quantized to-be-encoded frame image to convert the image data in the form of a two-dimensional matrix into image data in the form of a one-dimensional sequence, and then perform entropy coding, encapsulation, and other processing on the scanned one-dimensional sequence, thereby completing the encoding of the to-be-encoded frame image. Further, the network device may repeat the processing of steps 101 to 103 to encode all the frame images in the video frame sequence of the target video.
Optionally, based on the above case where one image region corresponds to the encoding weights of multiple macroblocks at the same time, the processing of step 103 may specifically be as follows: encoding the to-be-encoded frame image based on the calculated quantization parameter corresponding to each macroblock in each image region.
In implementation, after calculating the quantization parameters corresponding to all the macroblocks of each image region in the to-be-encoded frame image, the network device may determine the quantization step size corresponding to each macroblock by reference to the correspondence between quantization parameters and quantization step sizes, then complete the quantization of each macroblock contained in each image region according to the quantization step size corresponding to each macroblock, and then perform the subsequent scanning, entropy coding, encapsulation, and other processing, so as to encode the to-be-encoded frame image. In this way, performing quantization of different degrees on the multiple macroblocks with different quantization step sizes can further improve the fineness of video frame image encoding and the encoded video picture quality.
In the embodiments of the present invention, a to-be-encoded frame image of a target video is extracted, and the encoding weight corresponding to each image region in the to-be-encoded frame image is determined through an image analysis technique; a quantization parameter corresponding to each image region is set according to the encoding weight corresponding to each image region; and the to-be-encoded frame image is encoded based on the set quantization parameter corresponding to each image region. In this way, different quantization parameters are set for different image regions within the same video frame image, so that encoding of different degrees of fineness can be applied to the respective image regions. The encoding quality of key picture content can therefore be improved while the encoding quality of secondary picture content is reduced to a certain extent, so that encoding resources are allocated more reasonably, and both the encoded video picture quality and the video viewing experience are improved overall.
Based on the same technical concept, an embodiment of the present invention further provides a video encoding apparatus. As shown in Fig. 4, the apparatus includes:
an image analysis module 401, configured to extract a to-be-encoded frame image of a target video, and determine, through an image analysis technique, the encoding weight corresponding to each image region in the to-be-encoded frame image;
a parameter setting module 402, configured to set the quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region; and
a video encoding module 403, configured to encode the to-be-encoded frame image based on the set quantization parameter corresponding to each image region.
Optionally, the image analysis module 401 is specifically configured to:
determine the statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique;
determine the encoding weight of each pixel based on the statistical feature value of each pixel and a preset feature value selection rule; and
calculate the encoding weight corresponding to each image region in the to-be-encoded frame image according to the encoding weight of each pixel.
Optionally, as shown in Fig. 5, the apparatus further includes:
a weight recording module 404, configured to construct a two-dimensional mask map with the same resolution as the to-be-encoded frame image, and record the encoding weight of each pixel of the to-be-encoded frame image at the position corresponding to each pixel in the two-dimensional mask map.
Fig. 6 is a schematic structural diagram of a network device according to an embodiment of the present invention. The network device 600 may vary greatly depending on configuration or performance, and may include one or more central processing units 622 (e.g., one or more processors), a memory 632, and one or more storage media 630 (e.g., one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may be transient storage or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), and each module may include a series of instruction operations on the network device 600. Furthermore, the central processing unit 622 may be configured to communicate with the storage medium 630 and execute, on the network device 600, the series of instruction operations in the storage medium 630.
The network device 600 may also include one or more power supplies 629, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, one or more keyboards 656, and/or one or more operating systems 641, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD.
The network device 600 may include a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs contain instructions for performing the above video encoding.
A person of ordinary skill in the art can understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (15)

  1. A video encoding method, wherein the method comprises:
    extracting a to-be-encoded frame image of a target video, and determining, through an image analysis technique, an encoding weight corresponding to each image region in the to-be-encoded frame image;
    setting a quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region; and
    encoding the to-be-encoded frame image based on the set quantization parameter corresponding to each image region.
  2. The method according to claim 1, wherein the determining, through an image analysis technique, an encoding weight corresponding to each image region in the to-be-encoded frame image comprises:
    determining a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique;
    determining an encoding weight of each pixel based on the statistical feature value of each pixel and a preset feature value selection rule; and
    calculating the encoding weight corresponding to each image region in the to-be-encoded frame image according to the encoding weight of each pixel.
  3. The method according to claim 2, wherein the determining a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique comprises:
    inputting the to-be-encoded frame image into a salient object detection model based on CAM technology; and
    obtaining a feature map generated by a penultimate layer of the salient object detection model, and using feature data corresponding to each pixel in the feature map as the statistical feature value of each pixel of the to-be-encoded frame image.
  4. The method according to claim 2, wherein the determining a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique comprises:
    calculating, based on an optical flow method, an optical flow value corresponding to each pixel of the to-be-encoded frame image, and using the optical flow value as the statistical feature value of each pixel.
  5. The method according to claim 2, wherein the determining a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique comprises:
    calculating a texture feature value of each pixel of the to-be-encoded frame image, and using the texture feature value as the statistical feature value of each pixel.
  6. The method according to claim 2, wherein the determining a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique comprises:
    inputting the to-be-encoded frame image into an object detection model; and
    setting, according to an image content detection result output by the object detection model, a corresponding statistical feature value for pixels of each piece of image content in the to-be-encoded frame image.
  7. The method according to claim 2, wherein the calculating the encoding weight corresponding to each image region in the to-be-encoded frame image according to the encoding weight of each pixel comprises:
    for a target image region, determining all macroblocks contained in the target image region;
    using an average of the encoding weights of all pixels contained in each macroblock as an encoding weight of the macroblock; and
    recording the encoding weights of all macroblocks in the target image region as the encoding weights corresponding to the target image region.
  8. The method according to claim 7, wherein the setting a quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region comprises:
    for a target macroblock contained in the target image region, calculating a quantization parameter corresponding to the target macroblock according to a preset quantization parameter fluctuation range and the encoding weight of the target macroblock; and
    the encoding the to-be-encoded frame image based on the set quantization parameter corresponding to each image region comprises:
    encoding the to-be-encoded frame image based on the calculated quantization parameter corresponding to each macroblock in each image region.
  9. The method according to claim 2, wherein after the determining an encoding weight of each pixel based on the statistical feature value of each pixel and a preset feature value selection rule, the method further comprises:
    constructing a two-dimensional mask map with the same resolution as the to-be-encoded frame image, and recording the encoding weight of each pixel of the to-be-encoded frame image at a position corresponding to each pixel in the two-dimensional mask map.
  10. The method according to claim 9, wherein the method further comprises:
    cropping the two-dimensional mask map according to the encoding weights recorded at the pixels in the two-dimensional mask map; and
    retaining all target pixels in the two-dimensional mask map whose encoding weights satisfy a preset value criterion, and recording position information of the target pixels.
  11. A video encoding apparatus, wherein the apparatus comprises:
    an image analysis module, configured to extract a to-be-encoded frame image of a target video, and determine, through an image analysis technique, an encoding weight corresponding to each image region in the to-be-encoded frame image;
    a parameter setting module, configured to set a quantization parameter corresponding to each image region according to the encoding weight corresponding to each image region; and
    a video encoding module, configured to encode the to-be-encoded frame image based on the set quantization parameter corresponding to each image region.
  12. The apparatus according to claim 11, wherein the image analysis module is specifically configured to:
    determine a statistical feature value of each pixel of the to-be-encoded frame image through an image analysis technique;
    determine an encoding weight of each pixel based on the statistical feature value of each pixel and a preset feature value selection rule; and
    calculate the encoding weight corresponding to each image region in the to-be-encoded frame image according to the encoding weight of each pixel.
  13. The apparatus according to claim 12, wherein the apparatus further comprises:
    a weight recording module, configured to construct a two-dimensional mask map with the same resolution as the to-be-encoded frame image, and record the encoding weight of each pixel of the to-be-encoded frame image at a position corresponding to each pixel in the two-dimensional mask map.
  14. A network device, wherein the network device comprises a processor and a memory, the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the video encoding method according to any one of claims 1 to 10.
  15. A computer-readable storage medium, wherein the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the video encoding method according to any one of claims 1 to 10.
PCT/CN2020/070701 WO2021098030A1 (zh) 2019-11-22 2020-01-07 Method and apparatus for video encoding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911157969.9A CN110996101B (zh) 2019-11-22 2019-11-22 Method and apparatus for video encoding
CN201911157969.9 2019-11-22

Publications (1)

Publication Number Publication Date
WO2021098030A1 true WO2021098030A1 (zh) 2021-05-27

Family

ID=70086001

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070701 WO2021098030A1 (zh) 2019-11-22 2020-01-07 Method and apparatus for video encoding

Country Status (2)

Country Link
CN (1) CN110996101B (zh)
WO (1) WO2021098030A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114025200A (zh) * 2021-09-15 2022-02-08 湖南广播影视集团有限公司 一种基于云技术的超高清后期制作解决方法
CN116260976A (zh) * 2023-05-15 2023-06-13 深圳比特耐特信息技术股份有限公司 一种视频数据处理应用系统
CN116385706A (zh) * 2023-06-06 2023-07-04 山东外事职业大学 基于图像识别技术的信号检测方法和系统

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN116156196B (zh) * 2023-04-19 2023-06-30 探长信息技术(苏州)有限公司 一种用于视频数据的高效传输方法

Citations (6)

Publication number Priority date Publication date Assignee Title
CN102263943A (zh) * 2010-05-25 2011-11-30 财团法人工业技术研究院 视频位率控制装置与方法
JP2016082395A (ja) * 2014-10-16 2016-05-16 キヤノン株式会社 符号化装置、符号化方法及びプログラム
CN106412594A (zh) * 2016-10-21 2017-02-15 乐视控股(北京)有限公司 全景图像编码方法和装置
CN107147912A (zh) * 2017-05-04 2017-09-08 浙江大华技术股份有限公司 一种视频编码方法及装置
CN107770525A (zh) * 2016-08-15 2018-03-06 华为技术有限公司 一种图像编码的方法及装置
CN110177277A (zh) * 2019-06-28 2019-08-27 广东中星微电子有限公司 图像编码方法、装置、计算机可读存储介质及电子设备

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN101309422B (zh) * 2008-06-23 2010-09-29 北京工业大学 宏块级量化参数处理方法及装置
CN103079063B (zh) * 2012-12-19 2015-08-26 华南理工大学 一种低码率下视觉关注区域的视频编码方法
CN106101602B (zh) * 2016-08-30 2019-03-29 北京北信源软件股份有限公司 一种带宽自适应改善网络视频质量的方法
US10904531B2 (en) * 2017-03-23 2021-01-26 Qualcomm Incorporated Adaptive parameters for coding of 360-degree video
JP6946979B2 (ja) * 2017-11-29 2021-10-13 富士通株式会社 動画像符号化装置、動画像符号化方法、及び動画像符号化プログラム
CN110163076B (zh) * 2019-03-05 2024-05-24 腾讯科技(深圳)有限公司 一种图像数据处理方法和相关装置
CN110225342B (zh) * 2019-04-10 2021-03-09 中国科学技术大学 基于语义失真度量的视频编码的比特分配系统及方法
CN110087075B (zh) * 2019-04-22 2021-07-30 浙江大华技术股份有限公司 一种图像的编码方法、编码装置以及计算机存储介质


Cited By (5)

Publication number Priority date Publication date Assignee Title
CN114025200A (zh) * 2021-09-15 2022-02-08 湖南广播影视集团有限公司 一种基于云技术的超高清后期制作解决方法
CN116260976A (zh) * 2023-05-15 2023-06-13 深圳比特耐特信息技术股份有限公司 一种视频数据处理应用系统
CN116260976B (zh) * 2023-05-15 2023-07-18 深圳比特耐特信息技术股份有限公司 一种视频数据处理应用系统
CN116385706A (zh) * 2023-06-06 2023-07-04 山东外事职业大学 基于图像识别技术的信号检测方法和系统
CN116385706B (zh) * 2023-06-06 2023-08-25 山东外事职业大学 基于图像识别技术的信号检测方法和系统

Also Published As

Publication number Publication date
CN110996101A (zh) 2020-04-10
CN110996101B (zh) 2022-05-27

Similar Documents

Publication Publication Date Title
WO2021098030A1 (zh) 一种视频编码的方法和装置
US11310501B2 (en) Efficient use of quantization parameters in machine-learning models for video coding
US20210051322A1 (en) Receptive-field-conforming convolutional models for video coding
US9762917B2 (en) Quantization method and apparatus in encoding/decoding
KR101528895B1 (ko) 관심 특성 색 모델 변수의 적응성 추정을 위한 방법 및 장치
TWI743919B (zh) 視訊處理裝置及視訊串流的處理方法
JP2012085344A (ja) 動画像復号装置
CN111988611A (zh) 量化偏移信息的确定方法、图像编码方法、装置及电子设备
US20180063526A1 (en) Method and device for compressing image on basis of photography information
CN112738533B (zh) 一种机巡图像分区域压缩方法
EP3922032A1 (en) Quantization step parameter for point cloud compression
WO2020061008A1 (en) Receptive-field-conforming convolution models for video coding
CN116916036A (zh) 视频压缩方法、装置及系统
KR20200139775A (ko) 코딩된 이미지 또는 화상을 디코딩하기 위한 방법, 장치 및 매체
WO2022067775A1 (zh) 点云的编码、解码方法、编码器、解码器以及编解码系统
Chen et al. CNN-optimized image compression with uncertainty based resource allocation
US10645418B2 (en) Morphological anti-ringing filter for lossy image compression
CN115665413A (zh) 图像压缩最优量化参数的估计方法
CN112672162A (zh) 编码方法、装置及存储介质
WO2020019279A1 (zh) 视频压缩的方法、装置、计算机系统和可移动设备
AU2016201449B2 (en) Encoding and decoding using perceptual representations
JP6486120B2 (ja) 符号化装置、符号化装置の制御方法、及びプログラム
CN112422964B (zh) 一种渐进式编码方法及装置
US20240283952A1 (en) Adaptive coding tool selection with content classification
US20230113529A1 (en) Encoding method for point cloud compression and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20891086

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20891086

Country of ref document: EP

Kind code of ref document: A1