CN110771167A - Video compression method, device, computer system and movable equipment - Google Patents


Publication number
CN110771167A
Authority
CN
China
Prior art keywords
quantization
video
content analysis
quantization matrix
analysis result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201880038972.6A
Other languages
Chinese (zh)
Inventor
朱磊
高修峰
林茂疆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Shenzhen Dajiang Innovations Technology Co Ltd
Original Assignee
Shenzhen Dajiang Innovations Technology Co Ltd
Application filed by Shenzhen Dajiang Innovations Technology Co Ltd filed Critical Shenzhen Dajiang Innovations Technology Co Ltd
Publication of CN110771167A publication Critical patent/CN110771167A/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • H04N19/126Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers

Abstract

A video compression method, apparatus, computer system, and movable device are disclosed. The method includes: obtaining a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component; and compressing a first video according to the first quantization matrix. This technical solution can improve the efficiency of video compression.

Description

Video compression method, device, computer system and movable equipment
Copyright declaration
The disclosure of this patent document contains material which is subject to copyright protection. The copyright is owned by the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the official records of the patent and trademark office.
Technical Field
The present application relates to the field of video processing, and more particularly, to a video compression method, apparatus, computer system, and movable device.
Background
Video content analysis analyzes stored video with a visual algorithm so that the analysis result can be used in corresponding applications. For example, video content analysis performed on video stored in the black box of a flight system can be used to analyze the cause of an accident.
To ensure the accuracy of the video content analysis result, the video is required to be of relatively high quality. However, high-quality video is typically stored losslessly or compressed at a low compression rate, which increases storage and transmission resource requirements.
Therefore, a video compression method adapted to video content analysis is needed to improve the efficiency of video compression.
Disclosure of Invention
The embodiments of the present application provide a video compression method, apparatus, computer system, and movable device, which can improve the efficiency of video compression.
In a first aspect, a method for video compression is provided, including: obtaining a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and a frequency corresponding to the quantization parameter of the high-frequency component is higher than a frequency corresponding to the quantization parameter of the low-frequency component; and compressing a first video according to the first quantization matrix.
In a second aspect, an apparatus for video compression is provided, including: an obtaining module, configured to obtain a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and a frequency corresponding to the quantization parameter of the high-frequency component is higher than a frequency corresponding to the quantization parameter of the low-frequency component; and a compression module, configured to compress a first video according to the first quantization matrix.
In a third aspect, a computer system is provided, including: a memory for storing computer-executable instructions; and a processor for accessing the memory and executing the computer-executable instructions to perform the operations in the method of the first aspect described above.
In a fourth aspect, a movable device is provided, including: the video compression apparatus of the second aspect described above; or the computer system of the third aspect described above.
In a fifth aspect, a computer storage medium is provided, in which program code is stored, the program code being operable to instruct execution of the method of the first aspect.
In the above technical solutions, the video is compressed according to a quantization matrix in which the quantization parameter of the high-frequency component is smaller than that of the low-frequency component. The video can thus be compressed while the requirements of video content analysis are still met, so the efficiency of video compression can be improved.
Drawings
Fig. 1 is an architecture diagram of a solution to which an embodiment of the present application is applied.
Fig. 2 is a schematic diagram of data to be processed according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an encoding framework according to an embodiment of the present application.
Fig. 4 is a schematic architectural diagram of a movable device according to an embodiment of the present application.
Fig. 5 is a schematic flow chart of a method of video compression of an embodiment of the present application.
Fig. 6 is a schematic block diagram of an apparatus for video compression according to an embodiment of the present application.
Fig. 7 is a schematic block diagram of an apparatus for video compression according to another embodiment of the present application.
FIG. 8 is a schematic block diagram of a computer system of an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
It should be understood that the specific examples are provided herein only to assist those skilled in the art in better understanding the embodiments of the present application and are not intended to limit the scope of the embodiments of the present application.
It should also be understood that the formula in the embodiment of the present application is only an example, and is not intended to limit the scope of the embodiment of the present application, and the formula may be modified, and the modifications should also fall within the scope of the protection of the present application.
It should also be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic of the processes, and should not constitute any limitation to the implementation process of the embodiments of the present application.
It should also be understood that the various embodiments described in this specification can be implemented individually or in combination, and the examples in this application are not limited thereto.
Unless otherwise defined, all technical and scientific terms used in the examples of this application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Fig. 1 is an architecture diagram of a solution to which an embodiment of the present application is applied.
As shown in FIG. 1, the system 100 can receive data to be processed 102, process it, and generate processed data 108. For example, the system 100 may receive video to be compressed and encode it so as to compress the video. In some embodiments, the components in the system 100 may be implemented by one or more processors, which may be processors in a computing device or in a movable device (e.g., a drone). The processor may be any kind of processor, which is not limited in the embodiments of the present application. In some possible designs, the processor may include an encoder, a decoder, a codec, or the like. One or more memories may also be included in the system 100. The memory may be used to store instructions and data, such as computer-executable instructions that implement the technical solutions of the embodiments of the present application, the data to be processed 102, the processed data 108, and so on. The memory may be any kind of memory, which is also not limited in the embodiments of the present application.
Fig. 2 shows a schematic diagram of data to be processed according to an embodiment of the present application.
As shown in fig. 2, the data to be processed 202 may include a plurality of frames 204. For example, the plurality of frames 204 may represent successive image frames in a video stream. Each frame 204 may include one or more slices or tiles 206. Each slice or tile 206 may include one or more macroblocks or coding units 208. Each macroblock or coding unit 208 may include one or more blocks 210. Each block 210 may include one or more pixels 212. Each pixel 212 may include one or more data sets corresponding to one or more data portions, e.g., a luminance data portion and a chrominance data portion. A data unit may be a frame, a slice, a tile, a coding unit, a macroblock, a block, a pixel, or a group of any of the above, and its size may vary in different embodiments. By way of example, a frame 204 may include 100 slices 206, each slice 206 may include 10 macroblocks 208, each macroblock 208 may include 4 (e.g., 2x2) blocks 210, and each block 210 may include 64 (e.g., 8x8) pixels 212.
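The hierarchy above can be pictured with a few toy container types. The classes and the example sizes (100 slices per frame, 10 macroblocks per slice, 4 blocks per macroblock, 64 pixels per block) mirror the illustration in the text; they are not a real codec API.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    pixels: int = 64  # e.g., an 8x8 block

@dataclass
class Macroblock:
    blocks: list = field(default_factory=lambda: [Block() for _ in range(4)])

@dataclass
class Slice:
    macroblocks: list = field(default_factory=lambda: [Macroblock() for _ in range(10)])

@dataclass
class Frame:
    slices: list = field(default_factory=lambda: [Slice() for _ in range(100)])

def pixels_in_frame(frame):
    # Walk the frame -> slice -> macroblock -> block hierarchy.
    return sum(b.pixels for s in frame.slices for m in s.macroblocks for b in m.blocks)
```

With the example sizes, one frame holds 100 x 10 x 4 x 64 = 256,000 pixels.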
In order to reduce the bandwidth occupied by video storage and transmission, video data needs to be subjected to encoding compression processing. Any suitable encoding technique may be used to encode the data to be encoded. The type of encoding depends on the data being encoded and the specific encoding requirements.
In some embodiments, the encoder may implement one or more different codecs. Each codec may include code, instructions or computer programs implementing a different coding algorithm. An appropriate encoding algorithm may be selected to encode a given piece of data to be encoded based on a variety of factors, including the type and/or source of the data to be encoded, the receiving entity of the encoded data, available computing resources, network environment, business environment, rules and standards, and the like.
For example, the encoder may be configured to encode a series of video frames. A series of steps may be taken to encode the data in each frame. In some embodiments, the encoding step may include prediction, transform, quantization, entropy encoding, and like processing steps.
Prediction includes two types, intra prediction and inter prediction, and its aim is to remove redundant information of the current image block to be coded by using prediction block information. Intra prediction obtains prediction block data using information of the current frame image. Inter prediction obtains prediction block data using information of a reference frame. The inter-prediction process includes: dividing the image block to be coded into a plurality of sub-image blocks; then, for each sub-image block, searching the reference image for the image block that best matches the current sub-image block to serve as the prediction block; subtracting the corresponding pixel values of the sub-image block and the prediction block to obtain a residual; and combining the residuals obtained for the sub-image blocks to obtain the residual of the image block.
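The residual computation for one sub-image block can be sketched as follows, assuming motion estimation has already found the best-matching prediction block in the reference frame; the function name and toy values are illustrative only.

```python
import numpy as np

def inter_residual(current_block, prediction_block):
    # Residual = current pixels minus the prediction block found by
    # motion estimation; widen to int32 so the subtraction can go negative.
    return current_block.astype(np.int32) - prediction_block.astype(np.int32)

# Toy 4x4 example: a nearly perfect prediction leaves a near-zero
# residual, which compresses well after transform and quantization.
cur = np.full((4, 4), 100, dtype=np.uint8)
pred = np.full((4, 4), 98, dtype=np.uint8)
res = inter_residual(cur, pred)
```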
Transformation multiplies the residual block of the image by a transform matrix to remove the correlation of the residual, i.e., to remove redundant information of the image block and thereby improve coding efficiency. The transform of a data block in the image block is usually two-dimensional: at the encoding end, the residual information of the data block is multiplied by an NxM transform matrix and its transpose to obtain the transform coefficients. The transform coefficients are quantized to obtain quantized coefficients, entropy coding is then performed on the quantized coefficients, and finally the bit stream obtained by entropy coding, together with the coding mode information (such as the intra-prediction mode and motion vector information), is stored or sent to the decoding end. At the decoding end of the image, the entropy-coded bit stream is first entropy decoded to obtain the corresponding residual; the prediction image block corresponding to the image block is obtained according to the decoded motion vector, intra-prediction mode, or other information; and the value of each pixel in the current sub-image block is obtained from the prediction image block and the residual of the image block.
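The two-dimensional transform described above (transform matrix times residual times its transpose) can be sketched with a floating-point DCT-II basis; real codecs use integer approximations of the transform matrix, so this is only an illustration with N = M.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (float version, for illustration).
    t = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            c = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
            t[k, i] = c * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    return t

def transform_2d(residual_block):
    # Multiply by the NxN transform matrix and its transpose, as in the text.
    n = residual_block.shape[0]
    t = dct_matrix(n)
    return t @ residual_block @ t.T
```

For a constant residual block the energy concentrates entirely in the DC coefficient, which is exactly the decorrelation the transform step is for.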
Fig. 3 shows a schematic diagram of an encoding framework of an embodiment of the present application.
As shown in fig. 3, when inter prediction is used, the encoding process may be as follows:
in 301, a current frame image is acquired. In 302, a reference frame image is acquired. In 303a, motion estimation is performed using the reference frame image to obtain motion vectors (MVs) of the image blocks of the current frame image. In 304a, motion compensation is performed using the motion vectors obtained by motion estimation to obtain an estimated/predicted value of the current image block. In 305, the estimated/predicted value of the current image block is subtracted from the current image block to obtain a residual. In 306, the residual is transformed to obtain transform coefficients. In 307, the transform coefficients are quantized to obtain quantized coefficients. In 308, entropy coding is performed on the quantized coefficients, and the bit stream obtained by entropy coding, together with the coding mode information, is stored or transmitted to the decoding end. In 309, the quantization result is inverse quantized. In 310, the inverse quantization result is inverse transformed. In 311, a reconstructed pixel is obtained using the inverse transform result and the motion compensation result. In 312, the reconstructed pixels are filtered (in-loop filtering). In 313, the filtered reconstructed pixels are output. Subsequently, the reconstructed image can be used as a reference frame image for inter prediction of other frames.
When intra prediction is used, the flow of encoding can be as follows:
in 301, a current frame image is acquired. In 303b, intra-prediction mode selection is performed on the current frame image. In 304b, intra prediction is performed on the current image block in the current frame. In 305, the predicted value of the current image block is subtracted from the current image block to obtain a residual. In 306, the residual of the image block is transformed to obtain transform coefficients. In 307, the transform coefficients are quantized to obtain quantized coefficients. In 308, entropy coding is performed on the quantized coefficients, and the bit stream obtained by entropy coding, together with the coding mode information, is stored or transmitted to the decoding end. In 309, the quantization result is inverse quantized. In 310, the inverse quantization result is inverse transformed. In 311, a reconstructed pixel is obtained using the inverse transform result and the intra-prediction result. The reconstructed image block may be used for intra prediction of the next image block.
The decoding end performs the operations corresponding to the encoding end. First, residual information is obtained by entropy decoding, inverse quantization, and inverse transformation, and whether the current image block uses intra prediction or inter prediction is determined from the decoded code stream. If intra prediction is used, prediction information is constructed from the reconstructed image blocks in the current frame according to the intra-prediction method. If inter prediction is used, the motion information is parsed, and a reference block is determined in the reconstructed image using the parsed motion information to obtain the prediction information. The prediction information is then superimposed on the residual information, and the reconstructed information is obtained through the filtering operation.
The technical solutions of the embodiments of the present application mainly relate to the quantization step in the encoding process; that is, the compression efficiency of the video is improved through an improvement in the quantization step. For the other steps, reference may be made to the corresponding steps in the encoding process described above.
The technical solution of video compression in the embodiment of the present application may be applied to video content analysis (visual analysis), but the embodiment of the present application does not limit this.
In some designs, a movable device may compress captured video using the technical solutions of the embodiments of the present application. The movable device may be an unmanned aerial vehicle, an unmanned ship, an autonomous vehicle, a robot, an aerial vehicle, or the like, but the embodiments of the present application are not limited thereto.
FIG. 4 is a schematic architectural diagram of a movable device 400 of one embodiment of the present application.
As shown in FIG. 4, the movable device 400 may include a power system 410, a control system 420, a sensing system 430, and a processing system 440.
The power system 410 is used to power the movable device 400.
Taking an unmanned aerial vehicle as an example, its power system may include an electronic speed controller (ESC), propellers, and motors corresponding to the propellers. A motor is connected between the ESC and a propeller, and the motor and the propeller are arranged on the corresponding arm. The ESC is used to receive a driving signal generated by the control system and provide a driving current to the motor according to the driving signal so as to control the rotating speed of the motor. The motor is used to drive the propeller to rotate, thereby providing power for the flight of the unmanned aerial vehicle.
The sensing system 430 may be used to measure attitude information of the movable device 400, i.e., the position and state information of the movable device 400 in space, such as its three-dimensional position, three-dimensional angle, three-dimensional velocity, three-dimensional acceleration, three-dimensional angular velocity, and the like. The sensing system 430 may include, for example, at least one of a gyroscope, an electronic compass, an inertial measurement unit (IMU), a vision sensor, a global positioning system (GPS) receiver, a barometer, an airspeed meter, and the like.
In the present embodiment, the sensing system 430 may also be used to capture images; i.e., the sensing system 430 includes a sensor for capturing images, such as a camera.
The control system 420 is used to control the movement of the movable device 400. The control system 420 may control the movable device 400 according to preset program instructions. For example, the control system 420 may control the movement of the movable device 400 based on the attitude information measured by the sensing system 430. The control system 420 may also control the movable device 400 based on control signals from a remote control. For example, for a drone, the control system 420 may be a flight control system, or a control circuit in the flight controller.
The processing system 440 may process the images acquired by the sensing system 430. For example, the processing system 440 may be an image signal processing (ISP) chip.
The processing system 440 may be the system 100 of fig. 1, or the processing system 440 may comprise the system 100 of fig. 1.
It should be understood that the above division and naming of the components of the movable device 400 are merely exemplary and should not be construed as limiting the embodiments of the present application.
It should also be understood that the movable device 400 may also include other components not shown in FIG. 4, which is not limited by the embodiments of the present application.
Fig. 5 shows a schematic flow chart of a method 500 of video compression according to an embodiment of the present application. The method 500 may be performed by the system 100 shown in FIG. 1, or by the movable device 400 shown in FIG. 4. In particular, when performed by the movable device 400, the method may be performed by the processing system 440 of FIG. 4.
In 510, a first quantization matrix is obtained, wherein the quantization parameter of the high-frequency component in the first quantization matrix is smaller than the quantization parameter of the low-frequency component, and the frequency corresponding to the quantization parameter of the high-frequency component is higher than the frequency corresponding to the quantization parameter of the low-frequency component.
Quantization achieves video compression by dividing the transform coefficients obtained from the transform by the corresponding quantization steps. The quantization step size is indicated by a quantization parameter in the quantization matrix. The quantization matrix contains a quantization parameter for each frequency, and the quantization parameters of different frequencies can differ so as to selectively discard energy in particular regions. The larger the quantization parameter, the larger the quantization step size and the higher the compression rate.
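The divide-and-round step described here can be sketched as follows; this is a simplified model that ignores the scaling and offset details of real codecs, and the function names are illustrative.

```python
import numpy as np

def quantize(coeffs, q_matrix):
    # Each transform coefficient is divided by the quantization step
    # indicated by the corresponding entry of the quantization matrix;
    # the rounding is where the (lossy) compression happens.
    return np.round(coeffs / q_matrix).astype(np.int32)

def dequantize(levels, q_matrix):
    # Decoder side: multiply back; the rounding error is not recovered.
    return levels * q_matrix
```

A larger entry in `q_matrix` maps a wider range of coefficient values to the same level, which is exactly why a larger quantization parameter means a higher compression rate.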
In the embodiment of the present application, the quantization parameter of the high frequency component in the first quantization matrix is smaller than the quantization parameter of the low frequency component. That is to say, in the technical solution of the embodiment of the present application, more low-frequency information is lost, and high-frequency information is retained.
During video content analysis, the visual algorithm needs high-frequency region information, such as the edges, textures, and structures of visual objects, to be clearly preserved. Low-frequency regions, such as blue sky, white clouds, and large flat walls, are of little visual significance because they provide little information. Therefore, in view of these characteristics of visual algorithms, the embodiments of the present application adopt an asymmetric quantization strategy that discards more low-frequency information and retains high-frequency information. A lower bit rate can thereby be achieved while the restoration fidelity and effectiveness of the visual algorithm are ensured.
In 520, a first video is compressed according to the first quantization matrix.
In the embodiments of the present application, the first video is quantized using the first quantization matrix; specifically, in the quantization process, each frequency component (transform coefficient) is divided by the quantization step size indicated by the corresponding quantization parameter. Because the quantization parameter of the high-frequency component in the first quantization matrix is smaller than that of the low-frequency component, the quantization step used for the high-frequency components is small while that used for the low-frequency components is large. After quantization, therefore, the high-frequency components have a small compression rate and small compression loss, while the low-frequency components have a large compression rate and large compression loss. With this asymmetric quantization strategy, a high overall compression rate can be obtained while the quality requirements of video content analysis are still met. The video compression method can therefore improve the efficiency of video compression.
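A hypothetical "first quantization matrix" in the spirit of this asymmetric strategy might look as follows. Entries near the top-left (low frequencies) get large steps and entries toward the bottom-right (high frequencies) get small steps, the opposite of a standard perceptual matrix; the concrete values are invented for illustration and are not taken from the patent.

```python
import numpy as np

row = np.arange(4)
freq = row[:, None] + row[None, :]           # crude per-position frequency index
q_asym = (40 - 4 * freq).astype(np.float64)  # step shrinks as frequency grows

def compress_block(coeffs, q_matrix):
    # Quantize-dequantize round trip: what survives after compression.
    return np.round(coeffs / q_matrix) * q_matrix
```

With this matrix, a low-frequency and a high-frequency coefficient of equal magnitude fare very differently: the low-frequency one is wiped out by the large step while the high-frequency one is mostly preserved, which is the selective loss the text describes.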
Alternatively, different quantization matrices may be configured for different scenes. That is, on the basis of the basic setting that the quantization parameter of the high frequency component is smaller than the quantization parameter of the low frequency component, the quantization matrices of different scenes may adopt different quantization parameter settings.
Specifically, a plurality of quantization matrices may be preconfigured, one for each scene. In this case, the first quantization matrix corresponding to the scene of the first video may be selected from the plurality of quantization matrices according to the scene of the first video.
For example, a quantization matrix may be configured for each scene of video content analysis; when compressing a video, the corresponding quantization matrix is looked up according to the scene of the video and used for the video compression.
It should be understood that the correspondence between the scene and the quantization matrix may be one-to-one or one-to-many, which is not limited in the present application.
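Selecting a preconfigured matrix by scene could be as simple as a table lookup with a fallback; the scene labels and matrix values below are invented for illustration only.

```python
import numpy as np

# Hypothetical preconfigured scene -> quantization-matrix table.
SCENE_MATRICES = {
    "road": np.array([[32.0, 16.0], [16.0, 8.0]]),
    "sky":  np.array([[48.0, 24.0], [24.0, 12.0]]),
}
DEFAULT_MATRIX = np.array([[24.0, 12.0], [12.0, 6.0]])

def select_quantization_matrix(scene):
    # Look up the first quantization matrix for the video's scene;
    # fall back to a default matrix when the scene is not configured.
    return SCENE_MATRICES.get(scene, DEFAULT_MATRIX)
```

Because the mapping is an ordinary dictionary, it accommodates both one-to-one and one-to-many scene-to-matrix correspondences (several scene labels may simply reference the same matrix object).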
Optionally, in an embodiment of the present application, the plurality of quantization matrices may be determined according to video samples of a plurality of scenes.
Specifically, for each scene, a quantization matrix corresponding to the scene may be trained in advance according to a video sample of the scene. The training process may be to gradually adjust quantization parameters in the quantization matrix to finally obtain a quantization matrix meeting the requirements.
Optionally, in an embodiment of the present application, for a specific scene in the multiple scenes, a quantization parameter in an initial quantization matrix is adjusted according to a specific video sample of the specific scene and a video content analysis result difference threshold, so as to obtain a specific quantization matrix corresponding to the specific scene, where a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the specific video sample that is not compressed is not greater than the video content analysis result difference threshold.
The video content analysis result difference threshold is the difference in video content analysis results that can be tolerated when analyzing the video content. That is, if the difference between the analysis result obtained from the compressed video and the analysis result obtained from the uncompressed video is within this threshold, the analysis result obtained from the compressed video meets the requirements. Accordingly, when determining the specific quantization matrix corresponding to a specific scene, a quantization parameter in the initial quantization matrix is adjusted, starting from the initial quantization matrix; the specific video sample is compressed using the adjusted quantization matrix and then analyzed; and the resulting video content analysis result is compared with that of the uncompressed specific video sample. If the difference between the two analysis results is greater than the video content analysis result difference threshold, the adjustment continues; if it is not greater than the threshold, the adjustment ends, and the adjusted quantization matrix is the specific quantization matrix corresponding to the specific scene.
The initial quantization matrix may be a standard quantization matrix, or may be other quantization matrices, for example, a predefined quantization matrix, a previously used quantization matrix, and the like, which is not limited in this embodiment of the present application.
Alternatively, the quantization parameters in the initial quantization matrix may be adjusted as follows:
for each quantization parameter in the first set in the initial quantization matrix, the parameter is adjusted in the direction of increasing it; when the difference between the video content analysis result corresponding to the specific video sample compressed with the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample has converged, the next quantization parameter is adjusted. The first set comprises the M quantization parameters in the initial quantization matrix corresponding to the lowest M frequencies, M being a predetermined value;
for each quantization parameter in the second set in the initial quantization matrix, the parameter is adjusted in the direction of decreasing it; if the difference has converged but is still greater than the video content analysis result difference threshold, the next quantization parameter is adjusted; once the difference is not greater than the video content analysis result difference threshold, the adjustment stops, and the specific quantization matrix is obtained. The second set comprises the quantization parameters of the initial quantization matrix other than the M quantization parameters.
In particular, different adjustment directions may be employed for different quantization parameters in the initial quantization matrix. In this embodiment, the quantization parameters of the low-frequency part (the first set) are adjusted in the direction of increasing the quantization parameter, and those of the high-frequency part (the second set) in the direction of decreasing it. For example, M may be 4: the quantization parameters of the DC component and the first three AC principal components are increased, while those of the remaining AC components are decreased. During adjustment, the quantization parameters in the first set are adjusted first; since they are increased, the video content analysis result degrades, i.e., the difference grows, and once the difference converges (i.e., no longer changes substantially), the next quantization parameter is adjusted.
Alternatively, the quantization parameters in the first set may be adjusted in order of frequency from low to high.
After the quantization parameters in the first set have been adjusted, those in the second set are adjusted. Since these are decreased, the video content analysis result improves, i.e., the difference shrinks. When the difference for a parameter converges, the next parameter is adjusted, until the difference is no longer greater than the video content analysis result difference threshold. The quantization matrix so adjusted is the specific quantization matrix corresponding to the specific scene.
Alternatively, the quantization parameters in the second set may be adjusted in order of frequency from low to high, in order of frequency from high to low, or in a predetermined order. Optionally, the predetermined order may be associated with the degree to which the video content analysis depends on each frequency component, for example, adjusting in order of dependency from high to low, but the embodiment of the present application is not limited thereto.
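The "order of frequency from low to high" used above can be made concrete with the standard zig-zag scan of an 8×8 transform block, the scan used by JPEG-style codecs; a sketch, with M = 4 as in the example above (DC plus the first three AC components):

```python
def zigzag_order(n=8):
    """Positions of an n x n transform block in zig-zag scan order,
    i.e., roughly from the lowest to the highest spatial frequency."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],                       # anti-diagonal index
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

M = 4                      # DC component plus the first three AC components
order = zigzag_order()
first_set = order[:M]      # parameters adjusted upward, lowest frequency first
second_set = order[M:]     # remaining (higher-frequency) parameters, adjusted downward
print(first_set)           # [(0, 0), (0, 1), (1, 0), (2, 0)]
```

The same ordering, reversed or reshuffled, gives the high-to-low and predetermined orders mentioned for the second set.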
Alternatively, the initial quantization matrix may be adjusted by means of a scaling matrix. For example, the adjustment can be made according to the following formula:
Q' = Scl * Qr
where Qr is the initial quantization matrix, Scl is the scaling matrix, and Q' is the quantization matrix ultimately used for quantization.
Scl contains one scaling factor for each quantization parameter, and each quantization parameter is scaled by its own factor (the multiplication is element-wise) to adjust its quantization strength. In this case, the process of adjusting a quantization parameter is the process of adjusting its corresponding scaling factor; once the scaling factor for every quantization parameter, i.e., the scaling matrix Scl, is determined, the quantization matrix Q' is determined.
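A minimal sketch of the element-wise formula Q' = Scl * Qr. The rounding and the clamp to a minimum step of 1 are illustrative assumptions, as is the 2×2 example data (the top-left, low-frequency corner of a quantization table):

```python
def apply_scaling(Qr, Scl):
    """Q' = Scl * Qr element-wise: scale each quantization parameter by its
    own factor. Results are rounded and clamped to at least 1, since a
    quantization step below 1 is meaningless."""
    return [[max(1, round(q * s)) for q, s in zip(qrow, srow)]
            for qrow, srow in zip(Qr, Scl)]

Qr = [[16, 11], [12, 14]]          # illustrative low-frequency corner of a table
Scl = [[1.5, 1.2], [1.2, 0.8]]     # > 1 for low frequencies, < 1 for high
print(apply_scaling(Qr, Scl))      # [[24, 13], [14, 11]]
```

With all factors equal to 1, Q' = Qr, which is exactly the starting point of the tuning procedure described next.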
For example, assuming that the tolerable video content analysis result difference threshold is T, the scaling matrix Scl can be obtained as follows.
acquiring a video sample of the scene targeted by the video content analysis;
analyzing the video content of the original uncompressed video by using a visual algorithm to obtain an original analysis result;
compressing the original video with Q' = Qr (i.e., with all scaling factors initially equal to 1), performing video content analysis on the decompressed video with the visual algorithm to obtain an analysis result, and calculating the fidelity (difference) g between the current analysis result and the original analysis result; if g < T, no adjustment is needed; otherwise, the following adjustment continues;
adjusting the scaling factors in Scl one by one (each scaling factor may have an initial value of 1), starting from the lowest frequency, until g converges or g < T. If g < T, all adjustment stops; if g converges, i.e., the difference between the video content analysis results before and after two successive adjustments is small, for example less than 0.01T, the adjustment of the current frequency stops and the adjustment of the next frequency begins.
For example, the following adjustment directions can be adopted for the above adjustment:
the first component (the DC component): adjusted in the direction of amplifying the quantization parameter, i.e., scaling factor > 1;
the second, third, and fourth components (the AC principal components): adjusted in the direction of amplifying the quantization parameter, i.e., scaling factor > 1; if g shows large distortion when a component is first adjusted, the quantization parameter adjustment of that component can be abandoned;
the remaining components: adjusted in the direction of reducing the quantization parameter, i.e., scaling factor < 1.
Finally, through the above adjustment, a scaling matrix Scl satisfying g < T is obtained, and with it the quantization matrix Q'.
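The per-frequency search in the steps above can be sketched as a coordinate-descent loop. Everything inside `evaluate` — compressing the sample with Q' = Scl * Qr, decoding, re-running the visual algorithm, and computing g — is application-specific, so it is passed in as a callback here; the step size, the convergence tolerance (0.01·T, following the example above), and the iteration cap are illustrative assumptions:

```python
def tune_scaling(order, M, T, evaluate, step=0.1, eps=None, max_iter=50):
    """Coordinate-descent search for the scaling matrix Scl.

    order    -- matrix positions from lowest to highest frequency
    M        -- size of the first (low-frequency) set
    T        -- tolerable analysis-result difference threshold
    evaluate -- callback: given Scl, compress the sample video with
                Q' = Scl * Qr, run video content analysis, and return g
    """
    eps = 0.01 * T if eps is None else eps        # convergence tolerance on g
    n = max(r for r, _ in order) + 1
    Scl = [[1.0] * n for _ in range(n)]           # every scaling factor starts at 1
    g = evaluate(Scl)
    if g < T:
        return Scl                                # initial matrix already satisfies T
    for idx, (r, c) in enumerate(order):
        delta = step if idx < M else -step        # enlarge low-freq, shrink high-freq
        prev = g
        for _ in range(max_iter):
            Scl[r][c] += delta
            g = evaluate(Scl)
            if idx >= M and g < T:
                return Scl                        # threshold met: stop all adjustment
            if abs(prev - g) < eps:
                break                             # g has converged: next frequency
            prev = g
    return Scl
```

In practice `evaluate` dominates the cost, since every probe is a full compress/decompress/analyze cycle; the loop structure itself is trivial.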
After the first video is compressed according to the first quantization matrix, video content analysis can be subsequently performed based on the compressed first video.
Performing video compression with such a quantization matrix retains the high-frequency information that is significant for video content analysis, such as edges, structures, and textures, while suppressing low-frequency energy, achieving a compression ratio and compression quality suited to offline analysis.
Therefore, according to the technical solution of the embodiment of the present application, the video is compressed according to a quantization matrix in which the quantization parameter of the high-frequency component is smaller than that of the low-frequency component, so that the video can be compressed more aggressively while the requirement of video content analysis is still met, thereby improving the efficiency of video compression.
The method of video compression of the embodiments of the present application is described above in detail, and the apparatus, computer system, and mobile device of video compression of the embodiments of the present application will be described below.
Fig. 6 shows a schematic block diagram of an apparatus 600 for video compression according to an embodiment of the present application. The apparatus 600 may perform the method of video compression of the embodiment of the present application described above.
As shown in fig. 6, the apparatus 600 may include:
an obtaining module 610, configured to obtain a first quantization matrix, where a quantization parameter of a high frequency component in the first quantization matrix is smaller than a quantization parameter of a low frequency component, and a frequency corresponding to the quantization parameter of the high frequency component is higher than a frequency corresponding to the quantization parameter of the low frequency component;
a compressing module 620, configured to compress the first video according to the first quantization matrix.
Optionally, in this embodiment of the present application, the compressed first video is used for performing video content analysis.
Optionally, in this embodiment of the present application, the obtaining module 610 is configured to:
select, from a plurality of quantization matrices, the first quantization matrix corresponding to the scene of the first video according to the scene of the first video.
Optionally, in this embodiment of the present application, as shown in fig. 7, the apparatus 600 further includes:
a configuration module 630 for pre-configuring the plurality of quantization matrices.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to:
determining the plurality of quantization matrices from video samples of a plurality of scenes.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to:
for a specific scene in the plurality of scenes, adjusting quantization parameters in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold to obtain a specific quantization matrix corresponding to the specific scene,
wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the specific video sample that is not compressed is not greater than the video content analysis result difference threshold.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to:
for each quantization parameter in the first set in the initial quantization matrix, adjusting in a direction to increase the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample after being compressed by adopting the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged, adjusting the next quantization parameter; wherein the first set comprises M quantization parameters in the initial quantization matrix, the M quantization parameters corresponding to the lowest M frequencies, M being a predetermined value;
for each quantization parameter in the second set in the initial quantization matrix, adjusting in a direction to decrease the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged and is greater than the video content analysis result difference threshold, adjusting the next quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold, stopping adjustment to obtain the specific quantization matrix; wherein the second set includes quantization parameters of the initial quantization matrix other than the M quantization parameters.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to: and adjusting the quantization parameters in the first set from low to high in frequency.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to: and adjusting the quantization parameters in the second set from low to high or from high to low in frequency.
Optionally, in this embodiment of the present application, the configuration module 630 is configured to: and adjusting the quantization parameters in the second set according to a preset sequence.
Optionally, in an embodiment of the present application, M is 4.
Optionally, in this embodiment of the present application, the initial quantization matrix is a standard quantization matrix.
It should be understood that the apparatus for video compression in the embodiment of the present application may be a chip, which may be specifically implemented by a circuit, but the embodiment of the present application does not limit a specific implementation form.
An embodiment of the present application further provides an encoder, which includes the apparatus for video compression of the various embodiments of the present application.
Fig. 8 shows a schematic block diagram of a computer system 800 of an embodiment of the present application.
As shown in fig. 8, the computer system 800 may include a processor 810 and a memory 820.
It should be understood that the computer system 800 may also include other components commonly included in computer systems, such as input/output devices, communication interfaces, etc., which are not limited by the embodiments of the present application.
The memory 820 is used to store computer executable instructions.
The memory 820 may be any of various types of memory, and may include a non-volatile memory, for example at least one disk memory, which is not limited in this embodiment of the present application.
The processor 810 is configured to access the memory 820 and execute the computer-executable instructions to perform the operations of the method of video compression of the embodiment of the present application.
The processor 810 may include a microprocessor, a Field-Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and the like, which is not limited in this embodiment.
An embodiment of the present application further provides a mobile device, which may include the apparatus for video compression or the computer system according to the various embodiments of the present application.
The apparatus for video compression, the computer system, and the mobile device of the embodiments of the present application may correspond to the execution entity of the method of video compression of the embodiments of the present application, and the above and other operations and/or functions of each module therein are respectively for implementing the corresponding processes of each of the foregoing methods, which are not described herein again for brevity.
The embodiment of the present application further provides a computer storage medium, in which a program code is stored, where the program code may be used to instruct to execute the method for video compression according to the embodiment of the present application.
It should be understood that, in the embodiments of the present application, the term "and/or" is merely an association relation describing associated objects, indicating that three relations may exist. For example, A and/or B may represent: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiments of the present application.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present application that in essence contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (26)

1. A method of video compression, comprising:
acquiring a first quantization matrix, wherein a quantization parameter of a high-frequency component in the first quantization matrix is smaller than a quantization parameter of a low-frequency component, and a frequency corresponding to the quantization parameter of the high-frequency component is higher than a frequency corresponding to the quantization parameter of the low-frequency component;
compressing the first video according to the first quantization matrix.
2. The method of claim 1, further comprising:
and performing video content analysis based on the compressed first video.
3. The method of claim 1 or 2, wherein the obtaining the first quantization matrix comprises:
selecting, from a plurality of quantization matrices, the first quantization matrix corresponding to the scene of the first video according to the scene of the first video.
4. The method of claim 3, further comprising:
preconfiguring the plurality of quantization matrices.
5. The method of claim 4, wherein the pre-configuring the plurality of quantization matrices comprises:
determining the plurality of quantization matrices from video samples of a plurality of scenes.
6. The method of claim 5, wherein determining the plurality of quantization matrices based on the video samples of the plurality of scenes comprises:
for a specific scene in the plurality of scenes, adjusting quantization parameters in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold to obtain a specific quantization matrix corresponding to the specific scene,
wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the specific video sample that is not compressed is not greater than the video content analysis result difference threshold.
7. The method of claim 6, wherein the adjusting quantization parameters in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold comprises:
for each quantization parameter in the first set in the initial quantization matrix, adjusting in a direction to increase the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample after being compressed by adopting the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged, adjusting the next quantization parameter; wherein the first set comprises M quantization parameters in the initial quantization matrix, the M quantization parameters corresponding to the lowest M frequencies, M being a predetermined value;
for each quantization parameter in the second set in the initial quantization matrix, adjusting in a direction to decrease the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged and is greater than the video content analysis result difference threshold, adjusting the next quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold, stopping adjustment to obtain the specific quantization matrix; wherein the second set includes quantization parameters of the initial quantization matrix other than the M quantization parameters.
8. The method of claim 7, wherein the quantization parameters in the first set are adjusted in order of frequency from low to high.
9. The method according to claim 7 or 8, wherein the quantization parameters in the second set are adjusted in order of frequency from low to high or from high to low.
10. The method according to claim 7 or 8, wherein the quantization parameters in the second set are adjusted in a predetermined order.
11. The method of any one of claims 7 to 10, wherein M is 4.
12. The method according to any of claims 6 to 11, wherein the initial quantization matrix is a standard quantization matrix.
13. An apparatus for video compression, comprising:
an obtaining module, configured to obtain a first quantization matrix, where a quantization parameter of a high frequency component in the first quantization matrix is smaller than a quantization parameter of a low frequency component, and a frequency corresponding to the quantization parameter of the high frequency component is higher than a frequency corresponding to the quantization parameter of the low frequency component;
a compression module to compress the first video according to the first quantization matrix.
14. The apparatus of claim 13, wherein the compressed first video is used for video content analysis.
15. The apparatus of claim 13 or 14, wherein the obtaining module is configured to:
select, from a plurality of quantization matrices, the first quantization matrix corresponding to the scene of the first video according to the scene of the first video.
16. The apparatus of claim 15, further comprising:
a configuration module to pre-configure the plurality of quantization matrices.
17. The apparatus of claim 16, wherein the configuration module is configured to:
determining the plurality of quantization matrices from video samples of a plurality of scenes.
18. The apparatus of claim 17, wherein the configuration module is configured to:
for a specific scene in the plurality of scenes, adjusting quantization parameters in an initial quantization matrix according to a specific video sample of the specific scene and a video content analysis result difference threshold to obtain a specific quantization matrix corresponding to the specific scene,
wherein a difference between a video content analysis result corresponding to the specific video sample compressed by using the specific quantization matrix and a video content analysis result corresponding to the specific video sample that is not compressed is not greater than the video content analysis result difference threshold.
19. The apparatus of claim 18, wherein the configuration module is configured to:
for each quantization parameter in the first set in the initial quantization matrix, adjusting in a direction to increase the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample after being compressed by adopting the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged, adjusting the next quantization parameter; wherein the first set comprises M quantization parameters in the initial quantization matrix, the M quantization parameters corresponding to the lowest M frequencies, M being a predetermined value;
for each quantization parameter in the second set in the initial quantization matrix, adjusting in a direction to decrease the quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is converged and is greater than the video content analysis result difference threshold, adjusting the next quantization parameter; if the difference between the video content analysis result corresponding to the specific video sample compressed by the adjusted quantization matrix and the video content analysis result corresponding to the uncompressed specific video sample is not greater than the video content analysis result difference threshold, stopping adjustment to obtain the specific quantization matrix; wherein the second set includes quantization parameters of the initial quantization matrix other than the M quantization parameters.
20. The apparatus of claim 19, wherein the configuration module is configured to: and adjusting the quantization parameters in the first set from low to high in frequency.
21. The apparatus of claim 19 or 20, wherein the configuration module is configured to: and adjusting the quantization parameters in the second set from low to high or from high to low in frequency.
22. The apparatus of claim 19 or 20, wherein the configuration module is configured to: and adjusting the quantization parameters in the second set according to a preset sequence.
23. The apparatus of any one of claims 19 to 22, wherein M is 4.
24. The apparatus according to any of claims 18 to 23, wherein the initial quantization matrix is a standard quantization matrix.
25. A computer system, comprising:
a memory for storing computer executable instructions;
a processor for accessing the memory and executing the computer-executable instructions to perform operations in the method of any one of claims 1 to 12.
26. A mobile device, comprising:
the apparatus of any one of claims 13 to 24; or,
the computer system of claim 25.
CN201880038972.6A 2018-07-27 2018-07-27 Video compression method, device, computer system and movable equipment Pending CN110771167A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/097343 WO2020019279A1 (en) 2018-07-27 2018-07-27 Video compression method and apparatus, computer system, and mobile device

Publications (1)

Publication Number Publication Date
CN110771167A true CN110771167A (en) 2020-02-07

Family

ID=69180715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880038972.6A Pending CN110771167A (en) 2018-07-27 2018-07-27 Video compression method, device, computer system and movable equipment

Country Status (2)

Country Link
CN (1) CN110771167A (en)
WO (1) WO2020019279A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437301A (en) * 2020-10-13 2021-03-02 北京大学 Code rate control method and device for visual analysis, storage medium and terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101695132A (en) * 2004-01-20 2010-04-14 松下电器产业株式会社 Picture coding method, picture decoding method, picture coding apparatus, picture decoding apparatus, and program thereof
US20130188692A1 (en) * 2012-01-25 2013-07-25 Yi-Jen Chiu Systems, methods, and computer program products for transform coefficient sub-sampling
CN103444180A (en) * 2011-03-09 2013-12-11 日本电气株式会社 Video encoding device, video decoding device, video encoding method, and video decoding method
CN106063265A (en) * 2014-02-26 2016-10-26 杜比实验室特许公司 Luminance based coding tools for video compression



Also Published As

Publication number Publication date
WO2020019279A1 (en) 2020-01-30

Similar Documents

Publication Publication Date Title
EP2805499B1 (en) Video decoder, video encoder, video decoding method, and video encoding method
CN110612553B (en) Encoding spherical video data
US20190238848A1 (en) Method and apparatus for calculating quantization parameters to encode and decode an immersive video
KR20130136170A (en) Apparatus and method for image processing for 3d image
US20210014486A1 (en) Image transmission
EP3104615B1 (en) Moving image encoding device and moving image encoding method
US8792549B2 (en) Decoder-derived geometric transformations for motion compensated inter prediction
US11265575B2 (en) Image and video processing apparatuses and methods
US8442338B2 (en) Visually optimized quantization
CN110771167A (en) Video compression method, device, computer system and movable equipment
US20140044167A1 (en) Video encoding apparatus and method using rate distortion optimization
WO2020053688A1 (en) Rate distortion optimization for adaptive subband coding of regional adaptive haar transform (raht)
CN114391259A (en) Information processing method, terminal device and storage medium
WO2021263251A1 (en) State transition for dependent quantization in video coding
US20220351422A1 (en) Inference Processing of Data
CN116325732A (en) Decoding and encoding method, decoder, encoder and encoding and decoding system of point cloud
Angelino et al. A sensor aided H.264 encoder tested on aerial imagery for SFM
US11711540B2 (en) Method for encoding video using effective differential motion vector transmission method in omnidirectional camera, and method and device
US20240064331A1 (en) Image decoding apparatus and method and image encoding apparatus and method
US11159826B2 (en) Method for encoding and decoding images, encoding and decoding device, and corresponding computer programs
WO2024078403A1 (en) Image processing method and apparatus, and device
US20240129546A1 (en) Artificial intelligence-based image encoding and decoding apparatus, and image encoding and decoding method thereby
CN112154667B (en) Encoding and decoding of video
EP2753091A1 (en) Method for coding and decoding a video and corresponding devices
WO2022213122A1 (en) State transition for trellis quantization in video coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20220603