CN104661035A - Compression method and system for local feature descriptor of video and video compression method - Google Patents

Info

Publication number
CN104661035A
Authority
CN
China
Prior art keywords
video
coefficient
local feature
descriptor
quantization
Prior art date
Legal status
Granted
Application number
CN201510073614.7A
Other languages
Chinese (zh)
Other versions
CN104661035B (en)
Inventor
马思伟
张翔
王苫社
王诗淇
Current Assignee
Peking University
Original Assignee
Peking University
Priority date
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN201510073614.7A priority Critical patent/CN104661035B/en
Publication of CN104661035A publication Critical patent/CN104661035A/en
Application granted granted Critical
Publication of CN104661035B publication Critical patent/CN104661035B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a compression method and system for a local feature descriptor of a video, and a video compression method. The compression method for the local feature descriptor comprises the following steps: performing inter-frame prediction on the descriptor in combination with the video content, calculating the residual coefficients of the prediction signal, quantizing the residual coefficients to obtain quantized coefficients, entropy-coding the quantized coefficients, and outputting the resulting code stream, thereby completing compression of the local feature descriptor of the video. By compressing the local feature descriptor with this method and system, the compressed data is compactly represented on the basis of the combined video content and a high compression ratio is achieved, which increases the transmission rate of the video data and improves its storage efficiency, compression efficiency, and retrieval efficiency.

Description

Compression method and system of video local feature descriptor and video compression method
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a compression method, a compression system and a video compression method for a video local feature descriptor.
Background
With the wide application of the internet, the volume of video data transmitted and stored over networks is growing rapidly, making the compression of video data particularly important. Local feature descriptors are widely applied in image/video data processing and computer vision, and their compression is a key technology for improving video data applications such as video retrieval. For example, for a 128-dimensional SIFT local feature descriptor with each dimension quantized to 8 bits, if a frame of video contains 1000 SIFT feature points, then a 300-frame video contains 300 × 1000 × 128 × 8 bits ≈ 292.97 Mb of SIFT feature data, which is very expensive to store and transmit and is not acceptable in practical applications.
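The storage figure above can be checked with a short calculation (a minimal sketch; the variable names are illustrative):

```python
# Worked example: raw SIFT descriptor storage for the 300-frame video above.
frames = 300              # frames in the video
points_per_frame = 1000   # SIFT feature points per frame
dims = 128                # dimensions per SIFT descriptor
bits_per_dim = 8          # each dimension quantized to 8 bits

total_bits = frames * points_per_frame * dims * bits_per_dim
total_mb = total_bits / (1024 * 1024)  # megabits with binary prefix
print(f"{total_bits} bits = {total_mb:.2f} Mb")  # → 307200000 bits = 292.97 Mb
```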
In the prior art of video data processing, descriptors are mostly compressed separately, which limits the compression ratio of descriptors, is not favorable for improving data retrieval efficiency, and also hinders the development of video retrieval technology.
Disclosure of Invention
Embodiments of the present invention provide a method and a system for compressing a video local feature descriptor, and a video compression method, so as to improve compression efficiency of the video local feature descriptor, thereby improving transmission rate and storage efficiency of video compressed data.
According to an aspect of the present invention, there is provided a method for compressing a video local feature descriptor, the method comprising the steps of:
performing inter-frame prediction on the current local feature descriptor by combining video content to obtain a prediction signal;
calculating residual coefficients of the prediction signal;
quantizing the residual coefficients to obtain quantized coefficients;
and entropy coding the quantization coefficient and outputting a code stream.
In the above scheme, after the residual coefficients are calculated and before the quantized coefficients are obtained, the method further includes:
transforming the residual coefficients to obtain transform coefficients;
correspondingly, quantizing the residual coefficients to obtain quantized coefficients further comprises quantizing the transform coefficients to obtain the quantized coefficients.
In the above scheme, the method further comprises: and carrying out inverse quantization and inverse transformation on the quantization coefficient to obtain a reconstruction descriptor, and storing the reconstruction descriptor.
In the foregoing solution, performing inter-frame prediction on the current local feature descriptor in combination with video content to obtain a prediction signal further includes: finding, in the encoded frame preceding the frame containing the current local feature descriptor, the reconstructed descriptor closest to the current local feature descriptor, and using it as the prediction signal.
In the above scheme, the quantization is scalar quantization or vector quantization.
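The four claimed steps, namely joint inter-frame prediction, residual calculation, quantization, and entropy coding, can be sketched end to end as follows. This is an illustrative Python sketch, not the patent's implementation: the SAD-based prediction and the quantization step size `qs` are assumptions drawn from the preferred embodiments, and entropy coding is left as a comment.

```python
import numpy as np

def compress_descriptor(current, reference_pool, qs=16):
    """Predict -> residual -> quantize -> (entropy-code). `reference_pool`
    holds reconstructed descriptors from the previously encoded frame."""
    # Inter-frame prediction: closest reconstructed descriptor by SAD.
    pred = min(reference_pool, key=lambda r: np.abs(current - r).sum())
    residual = current - pred                         # residual coefficients
    quantized = np.round(residual / qs).astype(int)   # scalar quantization
    return quantized, pred       # `quantized` would then be entropy-coded

def reconstruct_descriptor(quantized, pred, qs=16):
    # Decoder-side mirror: inverse quantize and add back the prediction.
    return pred + quantized * qs
```

With `qs=1` the reconstruction is lossless; larger step sizes trade fidelity for rate.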
According to another aspect of the present invention, there is also provided a system for compressing a video local feature descriptor, the system including: a prediction module, a residual coefficient calculation module, a quantization module, and a coding module; wherein:
The prediction module is used for carrying out inter-frame prediction on the current local feature descriptor by combining video content to obtain a prediction signal;
the residual coefficient calculation module is connected with the prediction module and used for calculating the residual coefficient of the prediction signal;
the quantization module is connected with the residual error coefficient calculation module and is used for quantizing the residual error coefficient to obtain a quantization coefficient;
and the coding module is connected with the quantization module and used for entropy coding the quantization coefficient and outputting a code stream.
In the above scheme, the system further includes a transform module, connected to the residual coefficient calculation module and the quantization module, and configured to transform the residual coefficient to obtain a transform coefficient;
the quantization module is further configured to quantize the transform coefficient and obtain a quantized coefficient.
In the above solution, the system further includes: and the reconstruction descriptor storage module is used for carrying out inverse quantization and inverse transformation on the quantization coefficient to obtain a reconstruction descriptor and storing the reconstruction descriptor.
In the above solution, the prediction module is further configured to find a reconstructed descriptor that is closest to the current local feature descriptor in an encoded previous frame of a frame where the current local feature descriptor is located, as the prediction signal.
According to still another aspect of the present invention, there is also provided a video compression method, including:
compressing original video data to obtain a video code stream and obtain video reconstruction data consisting of reconstruction frames;
combining the content of the reconstructed frames in the video reconstruction data, compressing the local feature descriptors in the original video data by the compression method for local feature descriptors of any one of claims 1 to 5 to obtain an entropy-coded descriptor code stream;
and multiplexing the descriptor code stream into the video code stream and outputting the video code stream.
The embodiments of the present invention provide a compression method and system for a video local feature descriptor, and a video compression method. The compression method for the video local feature descriptor comprises prediction, quantization, and entropy coding: the prediction is inter-frame prediction combining the video content; the residual coefficients of the prediction signal are then calculated and quantized to obtain quantized coefficients; and the quantized coefficients are entropy-coded and a code stream is output, completing the compression of the video local feature descriptor. The method of the embodiments compresses the local feature descriptors of the video so that the compressed data is compactly represented on the basis of the combined video content and a high compression ratio is achieved, which improves the transmission rate and storage efficiency of the video data while also improving its retrieval efficiency.
Drawings
FIG. 1 is a flow chart of a method for compressing a video local feature descriptor according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a method for compressing a video local feature descriptor according to a second embodiment of the present invention;
FIG. 3 is a flow chart of a compression method according to a second embodiment of the present invention;
FIG. 4 is a schematic diagram of a compression system of a video local feature descriptor according to a third embodiment of the present invention;
FIG. 5 is a schematic diagram of a compression system of a video local feature descriptor according to a fourth embodiment of the present invention;
fig. 6 is a flowchart illustrating a video compression method according to a fifth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
Based on the wide application of local feature descriptors in the field of computer vision, the embodiments of the present invention compress the local feature descriptor of a video by applying reconstructed video content, together with reconstructed descriptor content, to the compression process; that is, the video content is combined to guide the compression of the local feature descriptor. On this basis, a compression method and system for a video local feature descriptor and a video compression method are provided. The present invention is described in detail below with reference to the embodiments and the accompanying drawings.
Fig. 1 is a schematic flow chart of a method for compressing a video local feature descriptor according to a first embodiment of the present invention. The framework of the compression method in this embodiment comprises prediction combining the video content, quantization, and entropy coding.
As shown in fig. 1, the method for compressing a video local feature descriptor of this embodiment includes the following steps:
and step S101, inter-frame prediction is carried out on the current local feature descriptor by combining the video content to obtain a prediction signal.
In this step, inter-frame prediction is performed on the current local feature descriptor; specifically, the reconstructed descriptor closest to the current local feature descriptor is found in an already-encoded frame of the video and used as the prediction signal. Video data consists of frames, and each frame has corresponding local feature descriptors; the frame to which the current local feature descriptor belongs is the current frame. Preferably, the encoded frame used for prediction is the frame preceding the current frame. "Closest" means the encoded reconstructed descriptor whose feature data agrees most with that of the current descriptor, and it may be selected by the Sum of Absolute Differences (SAD). This minimum-SAD selection can, for example, be carried out by a programmable logic controller (PLC). When searching for the closest reconstructed descriptor in the previous frame, the search may cover all data of the previous frame, or a custom search range may be used: for example, guided by the motion vectors of the video, the closest reconstructed descriptor is sought among the five nodes nearest the position pointed to by the current descriptor's motion vector.
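The SAD-based, motion-vector-guided search described above might be sketched as follows. All function and variable names are illustrative assumptions; the patent does not prescribe this exact interface.

```python
import numpy as np

def sad(a, b):
    # Sum of absolute differences between two descriptor vectors.
    return int(np.abs(np.asarray(a, dtype=np.int32) - np.asarray(b, dtype=np.int32)).sum())

def predict_descriptor(current_vec, current_pos, prev_frame, motion_vector, k=5):
    """`prev_frame` is a list of (position, reconstructed_vector) pairs from
    the encoded previous frame. Restrict the search to the k feature points
    nearest the motion-vector-displaced position, then pick the candidate
    whose reconstructed descriptor minimizes SAD."""
    target = np.asarray(current_pos, dtype=float) + np.asarray(motion_vector, dtype=float)
    candidates = sorted(
        prev_frame,
        key=lambda pv: float(np.hypot(*(np.asarray(pv[0], dtype=float) - target))),
    )[:k]
    _, best_vec = min(candidates, key=lambda pv: sad(current_vec, pv[1]))
    return best_vec  # prediction signal for the current descriptor
```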
In particular, during inter-frame prediction, the reconstructed descriptor used as the prediction of the current local feature descriptor may be defined as needed, for example the reconstructed descriptor at the corresponding position or the reconstructed descriptor of a corresponding feature, in addition to the closest reconstructed descriptor within the search range.
Step S102, calculating a residual coefficient of the prediction signal.
Preferably, the prediction residual is calculated in this step by subtraction, i.e., taking the difference between the original signal and the prediction signal.
Step S104, the residual coefficients are quantized to obtain quantized coefficients.
The quantization in this step may be scalar quantization or vector quantization.
Step S105, the quantized coefficients are entropy-coded and a code stream is output.
In this step, preferably, the coding model adopts context-adaptive binary arithmetic coding (CABAC), commonly used in video coding standards, to form the final binary code stream.
By the method for compressing the video local feature descriptors, the video local feature descriptors are compressed, and compressed video data are compactly represented on the basis of combining video content, so that a high compression ratio is achieved, the transmission rate and the storage efficiency of the video data are improved, and the retrieval efficiency of the video data is improved.
Fig. 2 is a flow chart of a method for compressing a video local feature descriptor according to a second embodiment of the present invention.
As shown in fig. 2, steps S201, S202, and S205 in this embodiment are basically the same as steps S101, S102, and S105 in the first embodiment. The difference is that the compression method of this embodiment further includes step S203 after step S202 and before step S204, and the corresponding step S204 differs accordingly, specifically:
step S203, transforming the residual error coefficient to obtain a transformation coefficient.
Preferably, the transform in this step is a DCT, KLT, or DST, yielding the transform coefficients. The transform may operate on a one-dimensional or a two-dimensional coefficient matrix.
Step S204, quantizing the transformation coefficient and obtaining a quantized coefficient. The quantization may be scalar quantization or vector quantization.
Fig. 3 is a flow chart of a compression method according to a second embodiment of the present invention.
The present embodiment takes a 128-dimensional SIFT local feature descriptor of video data as an example to describe the compression method of the present invention in detail.
As shown in fig. 3, the method for compressing a local feature descriptor based on a video according to this embodiment includes the following steps:
step S301, firstly, the video content carried by the original video frame is encoded by a video encoder, so as to obtain the motion vector information of the video.
Step S302, extracting a local feature descriptor from the original video frame, specifically:
suppose that the jth local feature descriptor of the ith frame currently being encoded isWhereinIs the spatial position information of the descriptor,is a local feature descriptor sub-vector.
Step S303, inter-frame prediction is performed on the extracted local feature descriptor according to formula (1):

(ṽ_inter, k̃) = argmin over ṽ_t^(i−d_i) ∈ Ψ, t ∈ [0, K_Ψ) of ‖ v_j^i − ṽ_t^(i−d_i) ‖_1        (1)
First, the motion vector MV(d_i, d_x, d_y) at the current position is obtained from the video encoder, where d_i is the offset in the time dimension and (d_x, d_y) is the offset in the two-dimensional spatial coordinates. Within the search range Ψ (containing K_Ψ feature points) in the (i − d_i)-th frame, taking the coordinates indicated by the motion vector as the starting point, the inter-frame prediction signal ṽ_inter is obtained.
Step S304, the prediction signal is subtracted from the original signal to obtain the residual coefficients v_j^i − ṽ_inter.
Step S305, the residual coefficients are transformed, for example as in steps S305A1 to S305C1:
Step S305A1, divide the 128-dimensional residual coefficient vector into two 8x8 two-dimensional matrices;
Step S305B1, apply an 8x8 DCT to each of the two matrices to obtain two transformed coefficient matrices;
Step S305C1, merge the two 8x8 coefficient matrices into a 128-dimensional transform coefficient vector.
Alternatively, the residual coefficients are transformed as in steps S305A2 to S305C2:
Step S305A2, divide the 128-dimensional residual coefficient vector into eight 4x4 two-dimensional matrices according to the 8 different directions, i.e., arrange the 16 elements of each direction into a 4x4 matrix by position;
Step S305B2, apply a 4x4 DCT to each of the eight matrices to obtain eight transformed coefficient matrices;
Step S305C2, merge the eight 4x4 coefficient matrices into a 128-dimensional transform coefficient vector.
Step S305 may further include:
In step S305D, whether to perform the transform is decided by rate-distortion selection; if the transform is skipped, the residual coefficients take the place of the transform coefficients in the subsequent quantization. The case in which the transform is skipped and the residual coefficients are quantized directly corresponds to the first embodiment.
Step S306, scalar quantization is applied to the transform coefficients with the set quantization step size (Qs) to obtain the quantized coefficients. Vector quantization may be used instead; scalar quantization is preferred in this embodiment.
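Scalar quantization with a step size Qs, as in step S306, reduces to a rounding division; a minimal sketch (names are illustrative):

```python
import numpy as np

def quantize(coeffs, qs):
    # Scalar quantization: divide by the step size Qs and round to integers.
    return np.round(np.asarray(coeffs, dtype=float) / qs).astype(int)

def dequantize(levels, qs):
    # Inverse quantization: multiply the integer levels back by Qs.
    return np.asarray(levels) * qs
```

The reconstruction error per coefficient is bounded by Qs/2, so the step size directly trades rate for fidelity.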
Step S307, the quantized coefficients are entropy-coded by an entropy encoder. The coding model adopts context-adaptive binary arithmetic coding (CABAC), commonly used in video coding standards, to form the final binarized code stream. The binarization in the entropy coding comprises the following specific steps:
Step S307A, output one bin indicating whether the quantized coefficients are all zero; if so, the algorithm ends;
Step S307B, output a binary vector in which each bin indicates whether the corresponding quantized coefficient is zero;
Step S307C, for each non-zero coefficient, first output a sign bit, then output a bin indicating whether the absolute value of the coefficient is 1; if not, output a bin indicating whether the absolute value is 2; if not, output the exponential-Golomb code of the absolute value minus 3. After encoding finishes, the reconstructed descriptor is obtained and the code stream is output.
Step S308, the quantized coefficients are inverse-quantized and, where applicable, inverse-transformed to obtain the reconstructed descriptor, which is stored in a buffer.
When entropy encoding is performed in step S307, a transform control flag is also entropy-coded so that the transform decision is recorded in the bitstream. The transform control flag indicates whether the transform of step S305, and hence the inverse transform of step S308, is performed. Transform and inverse transform occur in pairs: if the transform was applied, the inverse transform must be applied after inverse quantization when reconstructing the descriptor; if no transform was applied during compression, no inverse transform is needed after inverse quantization.
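The reconstruction logic of step S308, including the transform control flag, can be sketched as follows (an illustrative sketch; the flag would be read from the entropy-coded stream):

```python
import numpy as np

def reconstruct(levels, qs, transform_flag, inverse_transform=None):
    """Inverse-quantize the decoded levels, then apply the inverse transform
    only if the transform control flag was set during compression."""
    coeffs = np.asarray(levels) * qs            # inverse quantization
    if transform_flag:
        coeffs = inverse_transform(coeffs)      # mirrors the encoder's transform
    return coeffs  # reconstructed residual; add the prediction to get the descriptor
```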
The method for compressing the local feature descriptors based on the video according to the embodiment compresses the local feature descriptors of the video, and enables the compressed video data to be compactly represented on the basis of combining video content, so that a high compression ratio is achieved, the transmission rate and the storage efficiency of the video data are improved, and the retrieval efficiency of the video data is improved.
Fig. 4 is a schematic diagram of a compression system based on a local feature descriptor of a video according to a third embodiment of the present invention.
As shown in fig. 4, the compression system based on the local feature descriptor of the video according to this embodiment includes: a prediction module 11, a residual coefficient calculation module 12, a quantization module 14 and a coding module 15; wherein,
the prediction module 11 is configured to perform inter-frame prediction on the current local feature descriptor in combination with the video content to obtain a prediction signal.
The prediction module 11 performs inter-frame prediction on the current local feature descriptor; specifically, the prediction module 11 finds the reconstructed descriptor closest to the current local feature descriptor in an already-encoded frame of the video and uses it as the prediction signal. Video data consists of frames, and each frame has corresponding local feature descriptors; the frame to which the current local feature descriptor belongs is the current frame. Preferably, the encoded frame used for prediction is the frame preceding the current frame. "Closest" means the encoded reconstructed descriptor whose feature data agrees most with that of the current descriptor, and it may be selected by the Sum of Absolute Differences (SAD). This minimum-SAD selection can, for example, be carried out by a programmable logic controller (PLC). When the prediction module 11 searches for the closest reconstructed descriptor in the previous frame, it may search all data of the previous frame, or a custom search range may be used: for example, guided by the motion vectors of the video, the closest reconstructed descriptor is sought among the five nodes nearest the position pointed to by the current descriptor's motion vector.
Specifically, when the prediction module 11 performs inter-frame prediction, the reconstructed descriptor used as the prediction of the current local feature descriptor may be defined as needed, for example the reconstructed descriptor at the corresponding position or the reconstructed descriptor of a corresponding feature, in addition to the closest reconstructed descriptor within the search range.
The residual coefficient calculating module 12 is connected to the predicting module 11, and is configured to calculate a residual coefficient of the predicted signal. Preferably, the residual coefficient of the prediction signal is calculated in this step by a difference method.
The quantization module 14 is connected to the residual coefficient calculation module 12, and is configured to receive the residual coefficients output by the residual coefficient calculation module 12, quantize them to obtain quantized coefficients, and output the quantized coefficients. The quantization module 14 performs scalar quantization, or alternatively vector quantization, on the residual coefficients.
The encoding module 15 is connected to the quantization module 14, and is configured to receive the quantization coefficient output by the quantization module 14, and perform entropy encoding on the quantization coefficient, and output a code stream.
By the video local feature descriptor compression system, the video local feature descriptors are compressed, compressed video data are compactly represented on the basis of combining video content, and a high compression ratio is achieved, so that the transmission rate and the storage efficiency of the video data are improved, and the retrieval efficiency of the video data is improved.
Fig. 5 is a schematic diagram of a compression system based on a local feature descriptor of a video according to a fourth embodiment of the present invention.
As shown in fig. 5, the compression system of this embodiment includes the same prediction module 11, residual coefficient calculation module 12, and encoding module 15 as the compression system of the third embodiment. The differences are that the system of this embodiment further includes a transform module 13 and a reconstruction descriptor storage module 16, and that the quantization module 14 is connected differently, specifically:
the transformation module 13 is connected to the residual coefficient calculation module 12 and the quantization module 14, and is configured to receive the residual coefficient output by the residual coefficient calculation module 12, and transform the residual coefficient to obtain and output a transformation coefficient.
The quantization module 14 is configured to receive the transform coefficient output by the transform module 13, and quantize the transform coefficient to obtain a quantized coefficient.
The reconstruction descriptor storage module 16 is connected to the encoding module 15 and the prediction module 11, and is configured to perform inverse quantization and inverse transformation on the quantization coefficient to obtain a reconstruction descriptor, and store the reconstruction descriptor.
Fig. 6 is a flowchart illustrating a video compression method according to a fifth embodiment of the present invention.
As shown in fig. 6, the video compression method according to the present embodiment includes:
and S400, compressing the video original data to obtain a video code stream and obtain the motion vector information of the video.
Step S401, inter-frame prediction is performed on the current local feature descriptor by combining the video content to obtain a prediction signal.
In this step, inter-frame prediction is performed on the current local feature descriptor, specifically, a reconstructed descriptor that is closest to the current local feature descriptor is found as a prediction signal in an encoded frame of the video according to the motion vector information.
Step S402, calculating a residual coefficient of the prediction signal. Preferably, the residual coefficient of the prediction signal is calculated in this step by a difference method.
Step S403, transforming the residual coefficient to obtain a transform coefficient.
In this step, preferably, the transform applies a DCT to the two-dimensional matrix to obtain a coefficient matrix, thereby yielding the transform coefficients.
Step S404, the transform coefficients obtained from the residual coefficients are quantized to obtain quantized coefficients. Preferably, the quantization is scalar quantization.
Step S405, the quantized coefficients are entropy-coded, completing the compression of the local feature descriptors in the original video data and yielding the entropy-coded descriptor code stream. In this step, preferably, the coding model adopts context-adaptive binary arithmetic coding (CABAC), commonly used in video coding standards, to form the final binary descriptor code stream.
Step S406, the descriptor code stream is multiplexed into the video code stream, and the video code stream is output.
By the video compression method, the video is compressed, and the local feature descriptors of the video are compressed on the basis of the combined video content, so that the compressed video data can be compactly represented, and a high compression ratio is achieved, thereby improving the transmission rate and the storage efficiency of the video data, and simultaneously improving the retrieval efficiency of the video data.
It will be understood by those skilled in the art that all or part of the steps and modules for implementing the method and system of the present invention may be implemented by hardware, or by hardware programmed with program instructions, which may be stored in a computer readable storage medium such as a disk, a memory, etc.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (10)

1. A method for compressing a video local feature descriptor, the method comprising the steps of:
performing inter-frame prediction on the current local feature descriptor by combining video content to obtain a prediction signal;
calculating residual coefficients of the prediction signal;
quantizing the residual error coefficient to obtain a quantized coefficient;
and entropy coding the quantization coefficient and outputting a code stream.
2. The method of claim 1, wherein, after the residual coefficients are calculated and before the quantized coefficients are obtained, the method further comprises:
transforming the residual coefficients to obtain transform coefficients;
wherein quantizing the residual coefficients to obtain the quantized coefficients comprises quantizing the transform coefficients to obtain the quantized coefficients.
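Claim 2 inserts a transform between residual calculation and quantization. The patent does not name the transform; a one-dimensional orthonormal DCT-II is a common choice for energy compaction and is assumed in this sketch.

```python
import math

def dct_ii(residual):
    """Orthonormal type-II DCT of a residual coefficient vector."""
    n = len(residual)
    out = []
    for k in range(n):
        s = sum(r * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, r in enumerate(residual))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def transform_and_quantize(residual, qstep=1):
    """Claim-2 order: transform the residual first, then quantize the
    transform coefficients to obtain the quantized coefficients."""
    return [round(c / qstep) for c in dct_ii(residual)]
```

For a smooth residual vector the DCT concentrates the energy into a few low-frequency coefficients, which then quantize to mostly zeros.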
3. The method for compressing a video local feature descriptor according to claim 1, further comprising: inverse-quantizing and inverse-transforming the quantized coefficients to obtain a reconstruction descriptor, and storing the reconstruction descriptor.
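The reconstruction loop of claim 3 can be sketched as follows for the untransformed case (the inverse transform is omitted here for brevity); `qstep` and the function name are assumptions, not taken from the patent.

```python
def reconstruct(quantized, prediction, qstep=4):
    """Inverse quantization plus the prediction signal yields the
    reconstruction descriptor that claim 3 stores for predicting
    descriptors in later frames."""
    return [q * qstep + p for q, p in zip(quantized, prediction)]
```

The encoder stores this reconstruction rather than the original descriptor so that encoder and decoder predict from identical data.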
4. The method of claim 3, wherein performing inter-frame prediction on the current local feature descriptor in combination with the video content to obtain the prediction signal further comprises: finding, in the already-encoded previous frame of the frame containing the current local feature descriptor, the reconstruction descriptor closest to the current local feature descriptor, and using it as the prediction signal.
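Claim 4's search for the closest reconstruction descriptor amounts to a nearest-neighbour lookup. The distance measure is not specified in the patent; squared Euclidean distance is assumed here.

```python
def nearest_prediction(current, prev_frame_descriptors):
    """Return the reconstructed descriptor from the previously encoded frame
    that is closest (squared L2 distance, an assumption) to the current
    local feature descriptor -- this serves as the prediction signal."""
    return min(prev_frame_descriptors,
               key=lambda d: sum((a - b) ** 2 for a, b in zip(current, d)))
```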
5. The method of claim 4, wherein the quantization is scalar quantization or vector quantization.
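Claim 5 allows either scalar or vector quantization. A minimal sketch of the two alternatives (step size and codebook are illustrative assumptions):

```python
def scalar_quantize(residual, qstep):
    """Scalar quantization: each residual coefficient is quantized independently."""
    return [round(r / qstep) for r in residual]

def vector_quantize(residual, codebook):
    """Vector quantization: the whole residual vector is mapped to the index
    of its nearest codebook entry; only that index needs entropy coding."""
    return min(range(len(codebook)),
               key=lambda i: sum((r - c) ** 2
                                 for r, c in zip(residual, codebook[i])))
```

Scalar quantization is simpler; vector quantization can exploit correlation between coefficients at the cost of codebook storage.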
6. A system for compressing a video local feature descriptor, the system comprising a prediction module, a residual coefficient calculation module, a quantization module, and a coding module, wherein:
the prediction module is configured to perform inter-frame prediction on a current local feature descriptor in combination with video content to obtain a prediction signal;
the residual coefficient calculation module is connected with the prediction module and configured to calculate residual coefficients of the prediction signal;
the quantization module is connected with the residual coefficient calculation module and configured to quantize the residual coefficients to obtain quantized coefficients; and
the coding module is connected with the quantization module and configured to entropy code the quantized coefficients and output a code stream.
7. The system for compressing a video local feature descriptor according to claim 6, further comprising a transform module connected with the residual coefficient calculation module and the quantization module and configured to transform the residual coefficients to obtain transform coefficients;
wherein the quantization module is further configured to quantize the transform coefficients to obtain the quantized coefficients.
8. The system for compressing a video local feature descriptor according to claim 6, further comprising a reconstruction descriptor storage module configured to inverse-quantize and inverse-transform the quantized coefficients to obtain a reconstruction descriptor and to store the reconstruction descriptor.
9. The system of claim 8, wherein the prediction module is further configured to find, in the already-encoded previous frame of the frame containing the current local feature descriptor, the reconstruction descriptor closest to the current local feature descriptor, and to use it as the prediction signal.
10. A method of video compression, the method comprising:
compressing original video data to obtain a video code stream;
compressing, in combination with the video content, the local feature descriptors in the original video data using the method for compressing a video local feature descriptor according to any one of claims 1 to 5, to obtain an entropy-coded descriptor code stream; and
outputting the descriptor code stream together with the video code stream.
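Claim 10 outputs the descriptor code stream alongside the video code stream. The patent specifies no container format, so this sketch assumes a simple length-prefixed layout (big-endian uint32 lengths) that lets a decoder split the combined output back into the two sub-streams.

```python
import struct

def multiplex(video_stream, descriptor_stream):
    """Concatenate the video and descriptor code streams, prefixing each
    with its byte length so they can be separated again at the decoder.
    The length-prefixed layout is an illustrative assumption."""
    return (struct.pack(">I", len(video_stream)) + video_stream +
            struct.pack(">I", len(descriptor_stream)) + descriptor_stream)
```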
CN201510073614.7A 2015-02-11 2015-02-11 Compression method and system for video local feature descriptors, and video compression method Active CN104661035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510073614.7A CN104661035B (en) 2015-02-11 2015-02-11 Compression method, system and the video-frequency compression method of video local feature description

Publications (2)

Publication Number Publication Date
CN104661035A true CN104661035A (en) 2015-05-27
CN104661035B CN104661035B (en) 2018-05-29

Family

ID=53251658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510073614.7A Active CN104661035B (en) 2015-02-11 2015-02-11 Compression method, system and the video-frequency compression method of video local feature description

Country Status (1)

Country Link
CN (1) CN104661035B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113473154A (en) * 2021-06-30 2021-10-01 杭州海康威视数字技术股份有限公司 Video encoding method, video decoding method, video encoding device, video decoding device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120229311A1 (en) * 2011-03-07 2012-09-13 Industrial Technology Research Institute Device and method for compressing feature descriptor
CN102929969A (en) * 2012-10-15 2013-02-13 北京师范大学 Real-time searching and combining technology of mobile end three-dimensional city model based on Internet

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MINA MAKAR et al.: "Interframe Coding of Feature Descriptors for Mobile Augmented Reality", IEEE Transactions on Image Processing *

Also Published As

Publication number Publication date
CN104661035B (en) 2018-05-29

Similar Documents

Publication Publication Date Title
CN102835111B (en) The motion vector of previous block is used as the motion vector of current block, image to be carried out to the method and apparatus of coding/decoding
EP1379000B1 (en) Signal encoding method and apparatus and decoding method and apparatus
US20180115787A1 (en) Method for encoding and decoding video signal, and apparatus therefor
CN107743239B (en) Method and device for encoding and decoding video data
CN103098469A (en) Method and apparatus for entropy encoding/decoding a transform coefficient
US10368086B2 (en) Image coding/decoding method, device, and system
KR20110112171A (en) Method and apparatus for video encoding, and method and apparatus for video decoding using adaptive coefficient scan order
CN101009839A (en) Method for video encoding or decoding based on orthogonal transform and vector quantization, and apparatus thereof
CN103188494A (en) Apparatus and method for encoding/decoding depth image by skipping discrete cosine transform
CN105474642A (en) Re-encoding image sets using frequency-domain differences
JP2020205609A (en) Image coding and decoding method, image coding and decoding device, and corresponding computer program
US20110090954A1 (en) Video Codes with Directional Transforms
CN105025298A (en) A method and device of encoding/decoding an image
CN109076248A (en) The vector quantization of Video coding is carried out by the code book generated by the training signal selected
CN106576166A (en) Image encoding and decoding method, image encoding and decoding device and corresponding computer program
CN104661035B Compression method and system for video local feature descriptors, and video compression method
CN1848960B (en) Residual coding in compliance with a video standard using non-standardized vector quantization coder
CN104735459B Compression method and system for video local feature descriptors, and video compression method
WO2015068051A2 (en) Method for encoding and decoding a media signal and apparatus using the same
CN103533351A (en) Image compression method for multi-quantization table
CN104661034B Compression method and system for video-based local feature descriptors
Lu et al. Image Compression Based on Mean Value Predictive Vector Quantization.
JP2015035788A (en) Image encoder, image encoding method, and program
CN113473154B (en) Video encoding method, video decoding method, video encoding device, video decoding device and storage medium
US10051268B2 (en) Method for encoding, decoding video signal and device therefor

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant