WO2020019279A1

WO2020019279A1 - Video compression method and apparatus, computer system, and mobile device

Info

Publication number: WO2020019279A1
Application number: PCT/CN2018/097343
Authority: WO
Inventors: 朱磊; 高修峰; 林茂疆
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2018-07-27
Filing date: 2018-07-27
Publication date: 2020-01-30
Also published as: CN110771167A

Abstract

Disclosed are a video compression method and apparatus, a computer system, and a mobile device. The method comprises: obtaining a first quantization matrix, wherein a quantization parameter of a high frequency component in the first quantization matrix is smaller than a quantization parameter of a low frequency component therein, and the frequency corresponding to the quantization parameter of the high frequency component is higher than the frequency corresponding to the quantization parameter of the low frequency component; and compressing a first video according to the first quantization matrix. The technical solution of embodiments of the present application can improve the video compression efficiency.

Description

Video compression method, device, computer system and mobile device

Copyright statement

The content disclosed in this patent document contains material which is subject to copyright protection. The copyright is owned by the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and archives of the Patent and Trademark Office.

Technical field

The present application relates to the field of video processing, and more particularly, to a method, an apparatus, a computer system, and a mobile device for video compression.

Background technique

Video content analysis is the use of visual algorithms to analyze the saved video in order to use the video content analysis results for corresponding applications. For example, the video content analysis of the video stored in the black box of the flight system can be used to analyze the cause of the accident.

To ensure the accuracy of the video content analysis results, higher quality video is required. However, higher quality video is usually lossless or only lightly compressed, which in turn increases resource requirements.

Therefore, there is a need for a video compression method suitable for video content analysis to improve the efficiency of video compression.

Summary of the Invention

The embodiments of the present application provide a method, a device, a computer system, and a mobile device for video compression, which can improve the efficiency of video compression.

In a first aspect, a video compression method is provided, including: obtaining a first quantization matrix, wherein the quantization parameter of a high-frequency component in the first quantization matrix is smaller than the quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component; and compressing a first video according to the first quantization matrix.

In a second aspect, an apparatus for video compression is provided, including: an acquisition module for acquiring a first quantization matrix, wherein the quantization parameter of a high-frequency component in the first quantization matrix is smaller than the quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component; and a compression module configured to compress a first video according to the first quantization matrix.

According to a third aspect, a computer system is provided, including: a memory for storing computer-executable instructions; and a processor for accessing the memory and executing the computer-executable instructions to perform the operations in the method of the first aspect.

According to a fourth aspect, a mobile device is provided, including: the video compression device of the second aspect; or the computer system of the third aspect.

According to a fifth aspect, a computer storage medium is provided, and the computer storage medium stores program code, where the program code may be used to instruct execution of the method of the first aspect.

The technical solution of the embodiment of the present application compresses a video according to a quantization matrix with a quantization parameter of a high-frequency component smaller than a quantization parameter of a low-frequency component, and can compress a video while satisfying a video content analysis requirement, thereby improving the efficiency of video compression.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an architecture diagram of a technical solution to which an embodiment of the present application is applied.

FIG. 2 is a schematic diagram of data to be processed according to an embodiment of the present application.

FIG. 3 is a schematic diagram of a coding framework according to an embodiment of the present application.

FIG. 4 is a schematic architecture diagram of a mobile device according to an embodiment of the present application.

FIG. 5 is a schematic flowchart of a video compression method according to an embodiment of the present application.

FIG. 6 is a schematic block diagram of a video compression apparatus according to an embodiment of the present application.

FIG. 7 is a schematic block diagram of a video compression apparatus according to another embodiment of the present application.

FIG. 8 is a schematic block diagram of a computer system according to an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings.

It should be understood that the specific examples in this document are only to help those skilled in the art to better understand the embodiments of the present application, but not to limit the scope of the embodiments of the present application.

It should also be understood that the formulas in the embodiments of the present application are merely examples, and do not limit the scope of the embodiments of the present application. Each formula can be modified, and these modifications should also belong to the scope of protection of the present application.

It should also be understood that, in the various embodiments of the present application, the sequence number of each process does not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

It should also be understood that the various embodiments described in this specification may be implemented individually or in combination, which is not limited in the examples of the present application.

Unless otherwise stated, all technical and scientific terms used in the examples of this application have the same meanings as commonly understood by those skilled in the technical field of this application. The terminology used in this application is for the purpose of describing specific embodiments only and is not intended to limit the scope of the application. The term "and / or" as used herein includes any and all combinations of one or more of the associated listed items.

As shown in FIG. 1, the system 100 may receive data to be processed 102, process the data to be processed 102, and generate processed data 108. For example, the system 100 may receive a video to be compressed and perform processing such as encoding on it to compress the video. In some embodiments, the components in the system 100 may be implemented by one or more processors, which may be processors in a computing device or processors in a movable device (such as a drone). The processor may be any kind of processor, which is not limited in the embodiments of the present application. In some possible designs, the processor may include an encoder, a decoder, or a codec. The system 100 may also include one or more memories. The memory may be used to store instructions and data, for example, computer-executable instructions that implement the technical solutions of the embodiments of the present application, the data to be processed 102, the processed data 108, and the like. The memory may be any kind of memory, which is not limited in the embodiments of the present application.

As shown in FIG. 2, the data to be processed 202 may include a plurality of frames 204. For example, the plurality of frames 204 may represent consecutive image frames in a video stream. Each frame 204 may include one or more slices or tiles 206. Each slice or tile 206 may include one or more macroblocks or coding units 208. Each macroblock or coding unit 208 may include one or more blocks 210. Each block 210 may include one or more pixels 212. Each pixel 212 may include one or more data sets corresponding to one or more data portions, such as a luminance data portion and a chrominance data portion. A data unit may be a frame, a slice, a tile, a coding unit, a macroblock, a block, a pixel, or a group of any of the above. In different embodiments, the size of the data unit may vary. As an example, a frame 204 may include 100 slices 206, each slice 206 may include 10 macroblocks 208, each macroblock 208 may include 4 (for example, 2x2) blocks 210, and each block 210 may include 64 (for example, 8x8) pixels 212.
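The example sizes above imply a fixed number of pixels per macroblock and per frame. The following minimal Python sketch works out that arithmetic; the variable names are illustrative only and do not appear in the application.

```python
# Example hierarchy sizes taken from the paragraph above.
slices_per_frame = 100
macroblocks_per_slice = 10
blocks_per_macroblock = 4      # e.g. 2x2 blocks
pixels_per_block = 64          # e.g. 8x8 pixels

pixels_per_macroblock = blocks_per_macroblock * pixels_per_block
pixels_per_frame = slices_per_frame * macroblocks_per_slice * pixels_per_macroblock

print(pixels_per_macroblock)   # 256, i.e. a 16x16 macroblock
print(pixels_per_frame)        # 256000
```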

In order to reduce the bandwidth occupied by video storage and transmission, the video data needs to be encoded and compressed. Any suitable encoding technique can be used to encode the data to be encoded. The type of encoding depends on the data being encoded and the specific encoding requirements.

In some embodiments, the encoder may implement one or more different codecs. Each codec can include code, instructions, or computer programs that implement different encoding algorithms. A suitable encoding algorithm can be selected to encode given data based on various factors, including the type and/or source of the data to be encoded, the receiving entity of the encoded data, available computing resources, the network environment, the business environment, rules and standards, and so on.

For example, the encoder may be configured to encode a series of video frames. A series of steps can be used to encode the data in each frame. In some embodiments, the encoding step may include processing steps such as prediction, transform, quantization, and entropy encoding.

Prediction includes intra prediction and inter prediction, the purpose of which is to remove redundant information from the current image block to be encoded by using prediction block information. Intra prediction uses information from the image in the current frame to obtain prediction block data. Inter prediction uses information from a reference frame to obtain prediction block data. The process includes dividing the image block to be encoded into several sub-image blocks; then, for each sub-image block, searching the reference image for the image block that best matches the current sub-image block and using it as the prediction block; thereafter, subtracting the corresponding pixel values of the prediction block from those of the sub-image block to obtain a residual, and combining the residuals of the sub-image blocks to obtain the residual of the image block.
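The search for a best-matching block described above can be sketched as an exhaustive block-matching search. This is a minimal illustration, not the method of the application: the sum of absolute differences (SAD) as the matching cost, the square search window, and all function names are assumptions made for the example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def crop(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_match(current_block, reference, top, left, search_range):
    """Search the reference frame around (top, left) for the lowest-SAD block."""
    size = len(current_block)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(current_block, crop(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    motion_vector = best[1]
    prediction = crop(reference, top + motion_vector[0], left + motion_vector[1], size)
    # Residual = current block minus prediction block, pixel by pixel.
    residual = [[c - p for c, p in zip(c_row, p_row)]
                for c_row, p_row in zip(current_block, prediction)]
    return motion_vector, residual

# Toy data: the current frame is the reference shifted right by one pixel.
reference = [[(3 * r + 5 * c) % 17 for c in range(6)] for r in range(6)]
current = [[row[0]] + row[:-1] for row in reference]
block = crop(current, 2, 2, 2)
mv, residual = best_match(block, reference, 2, 2, 1)
print(mv)        # (0, -1): the best match lies one pixel to the left
print(residual)  # [[0, 0], [0, 0]]
```

In this toy case the match is exact, so the residual is all zeros; real content generally leaves a nonzero residual that the later steps transform and quantize.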

Transforming the residual block of the image with a transformation matrix removes the correlation of the residual of the image block, that is, removes redundant information of the image block in order to improve coding efficiency. The transformation of a data block in the image block usually uses a two-dimensional transform; that is, at the encoding end, the residual information of the data block is multiplied by an NxM transform matrix and by its transpose, and the transform coefficients are obtained after the multiplication. The transform coefficients are quantized to obtain quantized coefficients, and the quantized coefficients are then entropy encoded. Finally, the bit stream obtained by entropy encoding and the encoding mode information, such as the intra prediction mode and motion vector information, are stored or sent to the decoder. At the decoding end, the entropy-coded bit stream is first obtained and entropy decoded to obtain the corresponding residual; the predicted image block corresponding to the image block is obtained according to information such as the motion vector or the intra prediction mode obtained by decoding; and the value of each pixel in the current sub-image block is obtained from the predicted image block and the residual of the image block.
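The two-dimensional transform described above, in which the residual is multiplied by a transform matrix and by its transpose, can be illustrated with a small sketch. The application does not name a specific transform; the orthonormal DCT-II matrix and the restriction to square blocks used here are assumptions for illustration, as are the function names.

```python
import math

def dct_matrix(n):
    """Orthonormal n x n DCT-II transform matrix (an assumed choice of transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward_transform(residual):
    """coefficients = T * residual * T^T"""
    t = dct_matrix(len(residual))
    return matmul(matmul(t, residual), transpose(t))

def inverse_transform(coefficients):
    """residual = T^T * coefficients * T, valid because T is orthonormal."""
    t = dct_matrix(len(coefficients))
    return matmul(matmul(transpose(t), coefficients), t)

# A flat 4x4 residual has all of its energy in the DC (lowest-frequency) coefficient.
flat = [[10.0] * 4 for _ in range(4)]
coeffs = forward_transform(flat)
restored = inverse_transform(coeffs)
print(round(coeffs[0][0], 6))  # 40.0
```

The flat block shows why the transform helps compression: sixteen correlated samples collapse into a single significant coefficient.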

As shown in Figure 3, when inter prediction is used, the encoding process can be as follows:

In 301, a current frame image is acquired. In 302, a reference frame image is acquired. In 303a, a reference frame image is used to perform motion estimation to obtain a motion vector (MV) of each image block of the current frame image. In 304a, the motion vector obtained by the motion estimation is used to perform motion compensation to obtain an estimated value / predicted value of the current image block. In 305, the estimated value / predicted value of the current image block is subtracted from the current image block to obtain a residual error. In 306, the residuals are transformed to obtain transformation coefficients. In 307, the transform coefficient is quantized to obtain a quantized coefficient. In 308, the quantized coefficients are subjected to entropy encoding, and finally the bit stream obtained by entropy encoding and the encoding mode information after encoding are stored or sent to the decoding end. In 309, the quantized result is inversely quantized. In 310, the inverse quantization result is inversely transformed. In 311, the inverse transform result and the motion compensation result are used to obtain a reconstructed pixel. In 312, the reconstructed pixels are filtered (ie, video compressed). In 313, the filtered reconstructed pixels are output. Subsequently, the reconstructed image can be used as a reference frame image for other frame images for inter-frame prediction.

When using intra prediction, the encoding process can be as follows:

In 302, a current frame image is acquired. In 303b, intra prediction selection is performed on the current frame image. In 304b, the current image block in the current frame is intra-predicted. In 305, the estimated value of the current image block is subtracted from the current image block to obtain a residual. In 306, the residuals of the image blocks are transformed to obtain transformation coefficients. In 307, the transform coefficient is quantized to obtain a quantized coefficient. In 308, the quantized coefficients are subjected to entropy encoding, and finally the bit stream obtained by the entropy encoding and the encoding mode information after encoding are stored or sent to the decoding end. In 309, the quantization result is inversely quantized. In 310, the inverse quantization result is inversely transformed. In 311, the inverse transform result and the intra prediction result are used to obtain a reconstructed pixel. The reconstructed image block can be used for intra prediction of the next image block.
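Steps 305 to 311 are shared by the inter and intra flows above: the encoder quantizes the residual and then reconstructs the block from the quantized values, so that its reference matches what the decoder will see. The following minimal sketch illustrates this loop on a one-dimensional block. To stay short it leaves the transform of steps 306 and 310 as an identity placeholder and uses a single step size; both simplifications, and the function name, are assumptions of the example.

```python
def encode_block(current, prediction, step):
    """Steps 305-311 for one block, with the transform (306/310) left as identity."""
    residual = [c - p for c, p in zip(current, prediction)]       # 305: residual
    coefficients = residual                                       # 306: transform (placeholder)
    quantized = [round(v / step) for v in coefficients]           # 307: quantization
    dequantized = [q * step for q in quantized]                   # 309: inverse quantization
    reconstructed_residual = dequantized                          # 310: inverse transform (placeholder)
    reconstructed = [p + r for p, r in
                     zip(prediction, reconstructed_residual)]     # 311: reconstruction
    return quantized, reconstructed

current = [53, 55, 61, 67]
prediction = [50, 50, 60, 60]
quantized, reconstructed = encode_block(current, prediction, step=4)
print(quantized)      # [1, 1, 0, 2]
print(reconstructed)  # [54, 54, 60, 68]
```

The reconstructed values differ slightly from the input; that difference is the quantization loss, and it is the reconstructed block, not the original, that serves as the reference for later prediction.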

At the decoding end, operations corresponding to those at the encoding end are performed. First, the residual information is obtained by entropy decoding, inverse quantization, and inverse transformation, and whether the current image block uses intra prediction or inter prediction is determined from the decoded code stream. If intra prediction is used, the prediction information is constructed from the reconstructed image blocks in the current frame according to the intra prediction method; if inter prediction is used, the motion information is parsed out and used to determine a reference block in the reconstructed image, from which the prediction information is obtained. Next, the prediction information and the residual information are superimposed, and the reconstruction information is obtained after the filtering operation.

The technical solution of the embodiment of the present application mainly relates to a quantization step in the encoding process, that is, to improve the compression efficiency of a video through improvement in the quantization step, and other steps may refer to related steps in the encoding process.

The technical solution of video compression in the embodiments of the present application can be applied to video content analysis (visual analysis), but the embodiments of the present application are not limited thereto.

In some designs, the mobile device may use the technical solution of the embodiment of the present application to compress the captured video. The movable device may be an unmanned aerial vehicle, an unmanned boat, an autonomous vehicle, a robot, or an aerial photography vehicle, but the embodiment of the present application is not limited thereto.

FIG. 4 is a schematic architecture diagram of a mobile device 400 according to an embodiment of the present application.

As shown in FIG. 4, the mobile device 400 may include a power system 410, a control system 420, a sensing system 430, and a processing system 440.

The power system 410 is used to power the mobile device 400.

Taking a drone as an example, the power system of the drone may include an electronic speed controller (ESC), a propeller, and a motor corresponding to the propeller. The motor is connected between the electronic speed controller and the propeller, and the motor and the propeller are arranged on the corresponding arm. The electronic speed controller receives the driving signal generated by the control system and provides a driving current to the motor according to the driving signal to control the rotational speed of the motor. The motor drives the propeller to rotate, thereby powering the drone's flight.

The sensing system 430 may be used to measure the posture information of the mobile device 400, that is, the position information and status information of the mobile device 400 in space, such as three-dimensional position, three-dimensional angle, three-dimensional velocity, three-dimensional acceleration, and three-dimensional angular velocity. The sensing system 430 may include, for example, at least one of a gyroscope, an electronic compass, an inertial measurement unit (IMU), a vision sensor, a Global Positioning System (GPS), a barometer, and an airspeed meter.

In the embodiment of the present application, the sensing system 430 may also be used to acquire an image, that is, the sensing system 430 includes a sensor for acquiring an image, such as a camera.

The control system 420 is used to control the movement of the mobile device 400. The control system 420 may control the mobile device 400 according to a preset program instruction. For example, the control system 420 may control the movement of the mobile device 400 according to the posture information of the mobile device 400 measured by the sensing system 430. The control system 420 may also control the mobile device 400 according to a control signal from a remote controller. For example, for a drone, the control system 420 may be a flight control system (flight control), or a control circuit in the flight control.

The processing system 440 may process images acquired by the sensing system 430. For example, the processing system 440 may be an image signal processing (ISP) chip.

The processing system 440 may be the system 100 in FIG. 1, or the processing system 440 may include the system 100 in FIG. 1.

It should be understood that the foregoing division and naming of each component of the mobile device 400 is merely exemplary, and should not be construed as limiting the embodiments of the present application.

It should also be understood that the mobile device 400 may further include other components not shown in FIG. 4, which is not limited in the embodiment of the present application.

FIG. 5 shows a schematic flowchart of a video compression method 500 according to an embodiment of the present application. The method 500 may be performed by the system 100 shown in FIG. 1; or may be performed by the mobile device 400 shown in FIG. 4. Specifically, when executed by the mobile device 400, it may be executed by the processing system 440 in FIG. 4.

510: Obtain a first quantization matrix, wherein the quantization parameter of the high-frequency component in the first quantization matrix is smaller than the quantization parameter of the low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component.

Quantization implements video compression by dividing the transformed transform coefficients by the corresponding quantization step size. The quantization step size is indicated by a quantization parameter in the quantization matrix. The quantization matrix includes quantization parameters corresponding to each frequency, and the quantization parameters of different frequencies may be different to achieve selective energy region loss. The larger the quantization parameter, the larger the quantization step size, and the larger the compression ratio.
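The relationship between step size and loss can be seen in a small sketch. As a simplifying assumption, the matrix entries below are used directly as quantization step sizes, whereas the text only says that the quantization parameter indicates the step size; the function names and values are illustrative.

```python
def quantize(coefficients, steps):
    """Divide each transform coefficient by its step size and round to an integer level."""
    return [[round(c / s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(coefficients, steps)]

def dequantize(levels, steps):
    """Multiply each level by its step size to approximate the original coefficient."""
    return [[l * s for l, s in zip(l_row, s_row)]
            for l_row, s_row in zip(levels, steps)]

coefficients = [[51, 51], [51, 51]]
steps = [[4, 4], [20, 20]]        # small steps in the first row, large in the second

levels = quantize(coefficients, steps)
restored = dequantize(levels, steps)
print(levels)    # [[13, 13], [3, 3]]
print(restored)  # [[52, 52], [60, 60]]
```

With a step of 4 the coefficient 51 is restored as 52 (error 1); with a step of 20 it is restored as 60 (error 9), but the level to be entropy coded is much smaller.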

In the embodiment of the present application, the quantization parameter of the high-frequency component in the first quantization matrix is smaller than the quantization parameter of the low-frequency component. That is, in the technical solution of the embodiment of the present application, more low-frequency information is lost, while high-frequency information is retained.

In video content analysis, visual algorithms need high-frequency information, such as the edges, textures, and structures of visual objects, to be clearly retained. Low-frequency areas, such as blue sky, white clouds, and large walls, have little visual significance because they provide little information. Therefore, in the embodiment of the present application, an asymmetric quantization strategy is adopted in view of these characteristics of the visual algorithm: more low-frequency information is lost, while high-frequency information is retained. This can achieve a lower bit rate and a longer compression time while still guaranteeing the fidelity and effectiveness of the visual algorithm.

520: Compress the first video according to the first quantization matrix.

The first video is quantized by using the first quantization matrix of the embodiment of the present application. Specifically, in the quantization process, each frequency component (transform coefficient) is divided by the quantization step size indicated by the corresponding quantization parameter. Because the quantization parameter of the high-frequency component in the first quantization matrix is smaller than the quantization parameter of the low-frequency component, the quantization step size used for the high-frequency component is small and the quantization step size used for the low-frequency component is large. Therefore, after quantization, the high-frequency component has a small compression ratio and a small compression loss, while the low-frequency component has a large compression ratio and a large compression loss. This asymmetric quantization strategy can not only achieve a higher overall compression rate and increase the compression time, but also meet the quality requirements of video content analysis. Therefore, the video compression method in the embodiments of the present application can improve the efficiency of video compression.
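A minimal sketch of such an asymmetric matrix follows. The linear ramp from a large step at the DC position to a small step at the highest frequency is only one possible shape, chosen here for illustration; the application does not prescribe it, and the function names and values are assumptions.

```python
def asymmetric_steps(size, low_freq_step, high_freq_step):
    """Step sizes that shrink linearly from the DC position (0, 0) to the highest frequency."""
    max_index = 2 * (size - 1)
    return [[low_freq_step + (high_freq_step - low_freq_step) * (u + v) / max_index
             for v in range(size)] for u in range(size)]

def reconstruction_error(coefficient, step):
    """Absolute error after quantizing and dequantizing one coefficient."""
    return abs(coefficient - round(coefficient / step) * step)

steps = asymmetric_steps(4, low_freq_step=32, high_freq_step=8)

low_freq_error = reconstruction_error(45, steps[0][0])    # step 32
high_freq_error = reconstruction_error(45, steps[3][3])   # step 8
print(low_freq_error, high_freq_error)  # 13.0 3.0
```

For the same coefficient value, the low-frequency position loses far more than the high-frequency position, which is the behavior the paragraph above describes.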

Optionally, different quantization matrices can be configured for different scenarios. That is, based on the basic setting that the quantization parameter of the high frequency component is smaller than the quantization parameter of the low frequency component, the quantization matrix of different scenes can be set with different quantization parameters.

Specifically, multiple quantization matrices can be pre-configured, and each scene corresponds to a quantization matrix. In this case, the first quantization matrix corresponding to the scene of the first video may be selected from multiple quantization matrices according to the scene of the first video.

For example, a quantization matrix can be configured for each scene of video content analysis. When compressing a video, a corresponding quantization matrix is found according to the scene of the video, and the quantization matrix is used for video compression.

It should be understood that the correspondence between the scene and the quantization matrix may be one-to-one or one-to-many, which is not limited in this application.
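Selecting a pre-configured matrix by scene can be sketched as a simple lookup. The scene labels, the toy 2x2 matrices, and the function name below are hypothetical, and the sketch assumes a one-to-one correspondence between scenes and matrices.

```python
# Hypothetical scene labels mapped to toy 2x2 quantization matrices.
# In each matrix the low-frequency (top-left) entry is the largest.
QUANT_MATRICES = {
    "aerial_survey": [[32, 16], [16, 8]],
    "indoor_inspection": [[24, 12], [12, 6]],
}

def select_quant_matrix(scene, matrices=QUANT_MATRICES):
    """Return the quantization matrix configured for the given scene."""
    try:
        return matrices[scene]
    except KeyError:
        raise ValueError(f"no quantization matrix configured for scene {scene!r}") from None

first_quant_matrix = select_quant_matrix("aerial_survey")
print(first_quant_matrix)  # [[32, 16], [16, 8]]
```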

Optionally, in an embodiment of the present application, the multiple quantization matrices may be determined according to video samples of multiple scenes.

Specifically, for each scene, a quantization matrix corresponding to the scene may be trained in advance based on video samples of the scene. The training process may be to gradually adjust the quantization parameters in the quantization matrix, and finally obtain a quantization matrix that meets the requirements.

Optionally, in an embodiment of the present application, for a specific scene among the multiple scenes, the quantization parameters in an initial quantization matrix are adjusted according to a specific video sample of the specific scene and a difference threshold of the video content analysis result, to obtain a specific quantization matrix corresponding to the specific scene, wherein the difference between the video content analysis result corresponding to the specific video sample compressed using the specific quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the difference threshold of the video content analysis result.

The difference threshold of the video content analysis result is the largest difference in the video content analysis result that can be tolerated when video content analysis is performed. That is, if the difference between the video content analysis result obtained from the compressed video and the video content analysis result obtained from the uncompressed video is within the difference threshold, the video content analysis result obtained from the compressed video meets the requirements. Accordingly, when determining the specific quantization matrix corresponding to a specific scene, the quantization parameters in the initial quantization matrix can be adjusted, the specific video sample can be compressed with the adjusted quantization matrix, and video content analysis can then be performed. The video content analysis result thus obtained is compared with the video content analysis result corresponding to the uncompressed specific video sample. If the difference between the two is greater than the difference threshold of the video content analysis result, the adjustment continues; if the difference is not greater than the difference threshold, the adjustment is complete, and the adjusted quantization matrix is the specific quantization matrix corresponding to the specific scene.

The above initial quantization matrix may be a standard quantization matrix, or may be another quantization matrix, for example, a predefined quantization matrix, a quantization matrix used before, and the like, which are not limited in this embodiment of the present application.

Optionally, the quantization parameters in the initial quantization matrix may be adjusted in the following manner:

For each quantization parameter in the first set of the initial quantization matrix, adjust in the direction of increasing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges, adjust the next quantization parameter; wherein the first set includes M quantization parameters in the initial quantization matrix, the M quantization parameters correspond to the lowest M frequencies, and M is a predetermined value;

For each quantization parameter in the second set of the initial quantization matrix, adjust in the direction of decreasing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges and is greater than the difference threshold of the video content analysis result, adjust the next quantization parameter; if that difference is not greater than the difference threshold of the video content analysis result, stop the adjustment to obtain the specific quantization matrix; wherein the second set includes the quantization parameters in the initial quantization matrix other than the M quantization parameters.

Specifically, different adjustment directions may be adopted for different quantization parameters in the initial quantization matrix. In this embodiment, the quantization parameters of the low-frequency part (the quantization parameters in the first set) are adjusted in the direction of increasing the quantization parameter, and the quantization parameters of the high-frequency part (the quantization parameters in the second set) are adjusted in the direction of decreasing the quantization parameter. For example, M may be 4; that is, the quantization parameters for the DC component and the first three main AC components are adjusted in the increasing direction, and those for the other AC components are adjusted in the decreasing direction. In the adjustment process, the quantization parameters in the first set are adjusted first. Because the adjustment is in the direction of increasing the quantization parameters, the video content analysis result after adjustment will be worse, that is, the above difference will become larger. When the difference converges (that is, when the difference basically no longer changes), the next quantization parameter is adjusted.

Optionally, the quantization parameters in the first set may be adjusted in order from low to high frequency.

After the quantization parameters in the first set have been adjusted, the quantization parameters in the second set are adjusted. Because the adjustment is in the direction of decreasing the quantization parameters, the video content analysis result after adjustment will be better, that is, the above difference will become smaller. As long as the difference is still greater than the difference threshold of the video content analysis result, each quantization parameter is adjusted until the difference converges, and then the next quantization parameter is adjusted, until the difference is not greater than the difference threshold of the video content analysis result. The adjusted quantization matrix is then the specific quantization matrix corresponding to the specific scene.

Optionally, the quantization parameters in the second set may be adjusted in order of frequency from low to high, in order of frequency from high to low, or in a predetermined order. Optionally, the predetermined order may be related to the degree to which the video content analysis depends on each frequency component; for example, the parameters may be adjusted in order of the degree of dependence from high to low, but this embodiment of the present application is not limited thereto.
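The split into a first set and a second set can be sketched as follows. This is only an illustrative sketch: it assumes an 8x8 quantization matrix whose positions are ranked from low to high frequency by the JPEG-style zigzag scan, which the text above does not prescribe, and the helper names `zigzag_order` and `split_sets` are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order.

    Assumption: zigzag order is used as the low-to-high frequency ranking.
    """
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal; odd diagonals run top to bottom (by row),
    # even diagonals run bottom to top (by column).
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def split_sets(order, m=4):
    """Split ordered positions into the first set (lowest m) and the second set."""
    return order[:m], order[m:]


order = zigzag_order(8)
first_set, second_set = split_sets(order, m=4)
# first_set holds the DC position and the first three AC positions;
# second_set holds the remaining 60 positions.
print(first_set)
print(len(second_set))
```

With M = 4, the first set is increased during tuning and every position in the second set is decreased; reversing `second_set` gives the high-to-low ordering mentioned above.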

Optionally, the initial quantization matrix may be adjusted by means of a scaling matrix, for example based on the following formula:

Q' = Scl * Qr

where Qr is the initial quantization matrix, Scl is the scaling matrix, and Q' is the quantization matrix finally used for quantization.

Scl includes a scaling factor corresponding to each quantization parameter. Each quantization parameter is scaled by Scl to adjust its quantization intensity. In this case, adjusting the quantization parameters amounts to adjusting the scaling factor corresponding to each quantization parameter. Once the scaling factor corresponding to each quantization parameter is determined, the scaling matrix Scl is determined, and thus the quantization matrix Q' is determined.
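As a minimal sketch of this scaling, the snippet below multiplies each quantization parameter by its own scaling factor. Treating the product as element-wise follows from the one-factor-per-parameter description; rounding to an integer and clamping to a minimum of 1 are assumptions added here for illustration, and the function name `apply_scaling` and the 2x2 values are hypothetical.

```python
def apply_scaling(scl, qr):
    """Compute Q' by scaling each entry of Qr with the matching entry of Scl.

    Assumptions: element-wise product, rounded to an integer, never below 1.
    """
    return [
        [max(1, round(s * q)) for s, q in zip(scl_row, qr_row)]
        for scl_row, qr_row in zip(scl, qr)
    ]


# Hypothetical 2x2 excerpt: enlarge the lowest-frequency entry, shrink the highest.
qr = [[16, 11], [12, 14]]
scl = [[1.5, 1.0], [1.0, 0.5]]
q_prime = apply_scaling(scl, qr)
print(q_prime)
```

A scaling factor greater than 1 coarsens quantization of that component, and a factor less than 1 refines it.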

For example, assuming that the difference threshold of the tolerable video content analysis result is T, the scaling matrix Scl can be obtained in the following manner.

Obtain a video (sample) of a scene that is to undergo video content analysis;

Perform video content analysis on the original uncompressed video with a visual algorithm to obtain the original analysis result;

Compress the original video using Q' = Qr, then perform video content analysis on the decompressed video with the visual algorithm to obtain an analysis result, and calculate the fidelity (difference) between the current analysis result and the original analysis result, denoted g; if g < T, no adjustment is needed; otherwise, continue with the following adjustments;

Adjust the scaling factors in Scl one by one (the initial value of each scaling factor may be 1). Taking adjustment starting from the lowest frequency as an example, adjust the current scaling factor until g converges or g < T. If g < T, stop adjusting. If g converges, that is, the change in the video content analysis difference between two successive adjustments is small (for example, less than 0.01T), stop adjusting the current frequency and start adjusting the next frequency.

For example, in the above adjustment, the following adjustment directions can be adopted:

The first component: the DC component, which is adjusted in the direction of enlarging the quantization parameter, that is, the scaling factor is greater than 1;

The second, third, and fourth components: the main AC components, which are adjusted in the direction of enlarging the quantization parameter, that is, the scaling factor is greater than 1; if g deteriorates significantly at the first adjustment, the adjustment of the quantization parameter of the corresponding component may be abandoned;

The remaining components are adjusted in the direction of reducing the quantization parameter, that is, the scaling factor is less than 1.

Finally, through the above adjustment, a scaling matrix Scl for which g < T is obtained, and thereby the quantization matrix Q' is obtained.
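The procedure above can be sketched as a simple greedy loop. This is a sketch under stated assumptions, not the definitive procedure: `evaluate` is a hypothetical callback that stands for "compress with the scaled matrix, decompress, run the visual algorithm, and return the difference g", the additive step of 0.1 and the cap of 20 steps per factor are illustrative choices, and the optional rule of abandoning a component whose first adjustment degrades g sharply is omitted.

```python
def tune_scaling(order, evaluate, T, m=4, step=0.1, max_steps=20):
    """Greedy tuning of the scaling factors, lowest frequency first.

    order    -- positions ranked from low to high frequency
    evaluate -- hypothetical callback mapping {position: factor} to g
    T        -- tolerable difference threshold
    """
    scl = {pos: 1.0 for pos in order}  # every scaling factor starts at 1
    g = evaluate(scl)                  # corresponds to compressing with Q' = Qr
    if g < T:
        return scl, g
    for idx, pos in enumerate(order):
        # First m positions: enlarge (factor > 1); the rest: reduce (factor < 1).
        delta = step if idx < m else -step
        for _ in range(max_steps):
            candidate = round(scl[pos] + delta, 6)
            if candidate <= 0:
                break  # a scaling factor must stay positive
            scl[pos] = candidate
            new_g = evaluate(scl)
            converged = abs(new_g - g) < 0.01 * T
            g = new_g
            if g < T:
                return scl, g  # requirement met, stop adjusting
            if converged:
                break  # move on to the next frequency
    return scl, g


# Toy stand-in for the analysis pipeline: g depends only on two higher frequencies.
toy_order = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_evaluate = lambda s: 0.5 * s[(1, 0)] + 0.5 * s[(1, 1)]
factors, g_final = tune_scaling(toy_order, toy_evaluate, T=0.82, m=2)
print(factors, g_final)
```

In the toy run, the two low-frequency factors are raised once each because g does not react to them, and the loop stops as soon as lowering the third factor brings g below T.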

After the first video is compressed according to the first quantization matrix, subsequent video content analysis may be performed based on the compressed first video.

Using the above quantization matrix for video compression retains high-frequency information, such as edges, structures, and textures, that is meaningful for video content analysis, while reducing low-frequency energy, thereby achieving both the compression ratio and the compression quality needed for effective offline analysis.

Therefore, the technical solution of the embodiment of the present application compresses a video according to a quantization matrix in which the quantization parameter of a high-frequency component is smaller than the quantization parameter of a low-frequency component, and can compress the video while still satisfying the requirements of video content analysis, thereby improving video compression efficiency.
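The effect on high-frequency information can be illustrated with a small sketch. It assumes plain uniform scalar quantization of transform coefficients (divide by the step, round, multiply back), as in JPEG-style codecs; the 2x2 coefficient block and both matrices are made-up values, not taken from this application.

```python
def quantize(block, q):
    """Divide each coefficient by its quantization step and round to a level."""
    return [[round(c / s) for c, s in zip(c_row, q_row)]
            for c_row, q_row in zip(block, q)]


def dequantize(levels, q):
    """Reconstruct coefficients by multiplying each level by its step."""
    return [[lv * s for lv, s in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, q)]


coeffs = [[96, 7], [7, 7]]            # top-left is the lowest frequency
conventional = [[8, 16], [16, 32]]    # larger steps at high frequency
analysis_like = [[16, 4], [4, 4]]     # smaller steps at high frequency

print(dequantize(quantize(coeffs, conventional), conventional))
print(dequantize(quantize(coeffs, analysis_like), analysis_like))
```

With the conventional-style matrix the small high-frequency coefficients are quantized to zero, whereas the matrix with smaller high-frequency steps keeps them, at the cost of a coarser low-frequency step.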

The video compression method in the embodiment of the present application is described in detail above. The video compression device, computer system, and mobile device in the embodiment of the present application will be described below.

FIG. 6 shows a schematic block diagram of a video compression apparatus 600 according to an embodiment of the present application. The apparatus 600 may execute the video compression method in the embodiment of the present application.

As shown in FIG. 6, the apparatus 600 may include:

An obtaining module 610, configured to obtain a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component;

A compression module 620 is configured to compress a first video according to the first quantization matrix.

Optionally, in the embodiment of the present application, the compressed first video is used for video content analysis.

Optionally, in the embodiment of the present application, the obtaining module 610 is configured to:

Select, according to the scene of the first video, the first quantization matrix corresponding to the scene of the first video from a plurality of quantization matrices.
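A minimal sketch of such a selection is a lookup keyed by scene. The scene labels, the 2x2 placeholder matrices, and the fallback to a default entry when a scene has no pre-configured matrix are all assumptions made for illustration; how scenes are identified is not specified here.

```python
def select_quantization_matrix(scene, matrices, default_scene="generic"):
    """Return the pre-configured matrix for `scene`.

    Assumption: unknown scenes fall back to the `default_scene` entry.
    """
    return matrices.get(scene, matrices[default_scene])


# Hypothetical pre-configured table (placeholder values only).
preconfigured = {
    "generic": [[16, 8], [8, 4]],
    "indoor": [[20, 6], [6, 3]],
    "outdoor": [[24, 5], [5, 2]],
}

first_quantization_matrix = select_quantization_matrix("indoor", preconfigured)
print(first_quantization_matrix)
```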

Optionally, in the embodiment of the present application, as shown in FIG. 7, the apparatus 600 further includes:

A configuration module 630 is configured to pre-configure the multiple quantization matrices.

Optionally, in the embodiment of the present application, the configuration module 630 is configured to:

Determine the multiple quantization matrices according to video samples of multiple scenes.

For a specific scene among the multiple scenes, the quantization parameters in an initial quantization matrix are adjusted according to a specific video sample of the specific scene and a video content analysis result difference threshold, so as to obtain a specific quantization matrix corresponding to the specific scene,

wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold.

Optionally, in the embodiment of the present application, the configuration module 630 is configured to adjust the quantization parameters in the first set in order of frequency from low to high.

Optionally, in the embodiment of the present application, the configuration module 630 is configured to adjust the quantization parameters in the second set in order of frequency from low to high or from high to low.

Optionally, in the embodiment of the present application, the configuration module 630 is configured to adjust the quantization parameters in the second set in a predetermined order.

Optionally, in the embodiment of the present application, the M is 4.

Optionally, in the embodiment of the present application, the initial quantization matrix is a standard quantization matrix.

It should be understood that the foregoing video compression device in the embodiment of the present application may be a chip, which may be specifically implemented by a circuit, but the embodiment of the present application does not limit a specific implementation form.

An embodiment of the present invention further provides an encoder, and the encoder includes the foregoing video compression apparatus according to various embodiments of the present invention.

FIG. 8 shows a schematic block diagram of a computer system 800 according to an embodiment of the present application.

As shown in FIG. 8, the computer system 800 may include a processor 810 and a memory 820.

It should be understood that the computer system 800 may also include components generally included in other computer systems, such as input-output devices, communication interfaces, and the like, which is not limited in the embodiments of the present application.

The memory 820 is configured to store computer-executable instructions.

The memory 820 may be any of various types of memory; for example, it may include a high-speed random access memory (RAM) and may also include a non-volatile memory, such as at least one magnetic disk memory, which is not limited in the embodiments of the present application.

The processor 810 is configured to access the memory 820 and execute the computer-executable instructions to perform operations in the video compression method in the embodiment of the present application.

The processor 810 may include a microprocessor, a Field-Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or the like, which is not limited in the embodiments of the present application.

The embodiment of the present application further provides a movable device, and the movable device may include the video compression device or the computer system of the various embodiments of the present application described above.

The video compression device, computer system, and mobile device according to the embodiments of the present application may correspond to the execution subject of the video compression method according to the embodiments of the present application, and the foregoing and other operations and/or functions of the modules therein are respectively used to implement the corresponding processes of the foregoing methods; for the sake of brevity, they are not repeated here.

An embodiment of the present application further provides a computer storage medium storing program code, where the program code may be used to instruct execution of the foregoing video compression method in the embodiment of the present application.

It should be understood that, in the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" in this document generally indicates an "or" relationship between the associated objects.

Those of ordinary skill in the art may realize that the units and algorithm steps of each example described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. In order to clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of functions. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.

Those skilled in the art can clearly understand that, for the convenience and brevity of the description, the specific working processes of the systems, devices, and units described above can refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are only schematic. For example, the division of the units is only a logical function division, and in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling, or communication connection may be indirect coupling or communication connection through some interfaces, devices, or units, and may also be an electrical, mechanical, or other form of connection.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions in the embodiments of the present application.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of software functional unit.

When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method described in the embodiments of the present application. The foregoing storage media include a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

The above is only a specific implementation of this application, but the scope of protection of this application is not limited to this. Any person skilled in the art can easily think of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements should be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims

A video compression method, comprising:

obtaining a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component; and

compressing a first video according to the first quantization matrix.
The method according to claim 1, further comprising:

performing video content analysis based on the compressed first video.
The method according to claim 1 or 2, wherein the acquiring the first quantization matrix comprises:

selecting, according to the scene of the first video, the first quantization matrix corresponding to the scene of the first video from a plurality of quantization matrices.
The method according to claim 3, further comprising:

pre-configuring the plurality of quantization matrices.
The method according to claim 4, wherein the pre-configuring the plurality of quantization matrices comprises:

determining the multiple quantization matrices according to video samples of multiple scenes.
The method according to claim 5, wherein the determining the multiple quantization matrices based on video samples of multiple scenes comprises:

for a specific scene among the multiple scenes, adjusting a quantization parameter in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold, to obtain a specific quantization matrix corresponding to the specific scene,

wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold.
The method according to claim 6, wherein the adjusting a quantization parameter in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold comprises:

for each quantization parameter in a first set of the initial quantization matrix, adjusting in the direction of increasing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges, adjusting the next quantization parameter; wherein the first set includes M quantization parameters in the initial quantization matrix, the M quantization parameters correspond to the lowest M frequencies, and M is a predetermined value; and

for each quantization parameter in a second set of the initial quantization matrix, adjusting in the direction of reducing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges and is greater than the video content analysis result difference threshold, adjusting the next quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold, stopping the adjustment to obtain the specific quantization matrix; wherein the second set includes the quantization parameters in the initial quantization matrix other than the M quantization parameters.
The method according to claim 7, wherein the quantization parameters in the first set are adjusted in order from low to high frequency.
The method according to claim 7 or 8, wherein the quantization parameters in the second set are adjusted in order of frequency from low to high or from high to low.
The method according to claim 7 or 8, wherein the quantization parameters in the second set are adjusted in a predetermined order.
The method according to any one of claims 7 to 10, wherein the M is 4.
The method according to any one of claims 6 to 11, wherein the initial quantization matrix is a standard quantization matrix.
A video compression device, comprising:

an obtaining module, configured to obtain a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component; and

a compression module, configured to compress a first video according to the first quantization matrix.
The apparatus according to claim 13, wherein the compressed first video is used for video content analysis.
The apparatus according to claim 13 or 14, wherein the obtaining module is configured to:

select, according to the scene of the first video, the first quantization matrix corresponding to the scene of the first video from a plurality of quantization matrices.
The apparatus according to claim 15, further comprising:

A configuration module, configured to pre-configure the multiple quantization matrices.
The apparatus according to claim 16, wherein the configuration module is configured to:

determine the multiple quantization matrices according to video samples of multiple scenes.
The apparatus according to claim 17, wherein the configuration module is configured to:

for a specific scene among the multiple scenes, adjust a quantization parameter in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold, to obtain a specific quantization matrix corresponding to the specific scene,

wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold.
The apparatus according to claim 18, wherein the configuration module is configured to:

for each quantization parameter in a first set of the initial quantization matrix, adjust in the direction of increasing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges, adjust the next quantization parameter; wherein the first set includes M quantization parameters in the initial quantization matrix, the M quantization parameters correspond to the lowest M frequencies, and M is a predetermined value; and

for each quantization parameter in a second set of the initial quantization matrix, adjust in the direction of reducing the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample converges and is greater than the video content analysis result difference threshold, adjust the next quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by using the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold, stop the adjustment to obtain the specific quantization matrix; wherein the second set includes the quantization parameters in the initial quantization matrix other than the M quantization parameters.
The apparatus according to claim 19, wherein the configuration module is configured to adjust the quantization parameters in the first set in order of frequency from low to high.
The apparatus according to claim 19 or 20, wherein the configuration module is configured to adjust the quantization parameters in the second set in order of frequency from low to high or from high to low.
The apparatus according to claim 19 or 20, wherein the configuration module is configured to adjust the quantization parameters in the second set in a predetermined order.
The apparatus according to any one of claims 19 to 22, wherein the M is 4.
The apparatus according to any one of claims 18 to 23, wherein the initial quantization matrix is a standard quantization matrix.
A computer system, comprising:

Memory for storing computer-executable instructions;

A processor for accessing the memory and executing the computer-executable instructions to perform operations in the method according to any one of claims 1 to 12.
A movable device, comprising:

The device according to any one of claims 13 to 24; or,

The computer system of claim 25.