KR101868270B1 - Content-aware video encoding method, controller and system based on single-pass consistent quality control - Google Patents

Content-aware video encoding method, controller and system based on single-pass consistent quality control

Info

Publication number
KR101868270B1
Authority
KR
South Korea
Prior art keywords
current frame
frame
quantization parameter
distortion
optimal
Prior art date
Application number
KR1020170026356A
Other languages
Korean (ko)
Inventor
김기원
경종민
Original Assignee
재단법인 다차원 스마트 아이티 융합시스템 연구단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 재단법인 다차원 스마트 아이티 융합시스템 연구단 filed Critical 재단법인 다차원 스마트 아이티 융합시스템 연구단
Priority to KR1020170026356A priority Critical patent/KR101868270B1/en
Application granted granted Critical
Publication of KR101868270B1 publication Critical patent/KR101868270B1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142Detection of scene cut or scene change
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164Feedback from the receiver or from the transmission channel
    • H04N19/166Feedback from the receiver or from the transmission channel concerning the amount of transmission errors, e.g. bit error rate [BER]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

According to an embodiment, a method for content-aware video encoding based on single-pass consistent quality control includes: a step of detecting a screen change for a current frame of a group of pictures, the group of pictures including a plurality of frames; a step of determining an optimal frame type of the current frame based on a result of detecting the screen change; a step of setting an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; a step of obtaining model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter by using a pre-built model parameter lookup table; a step of calculating a predicted encoding distortion of the current frame based on the obtained model parameters and a screen descriptor of the previous frame received from a video encoder by using a pre-built distortion prediction model for the current frame; and a step of obtaining an optimal quantization parameter in the current frame minimizing the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame.

Description

TECHNICAL FIELD [0001] The present invention relates to a content-aware video encoding method, a controller, and a system based on single-pass consistent picture quality control.

The following description relates to a content-aware video encoding method, a controller and a system based on a single-pass encoding structure, and more particularly, to a method and apparatus for determining the quantization parameter (QP) of each frame so as to minimize the inter-frame distortion variance.

Recent advances in video technology have led to a variety of battery-powered miniature cameras that capture, record, and stream image / video of a user's personal activities. Although such a video camera system requires high video quality, a trade-off relationship between the encoding bit-rate and the distortion of the video frame becomes a very important problem due to the limitations on storage capacity, channel bandwidth and battery life.

Many techniques have been developed to control the encoding bit-rate and/or distortion while satisfying the constraints of the encoding system, in order to improve rate-distortion (R-D) performance. In particular, since the size of the quantization step greatly affects the R-D performance in the video encoding process, techniques that control the bit-rate and distortion by controlling the quantization parameter, which determines the size of the quantization step, have been developed.

Specifically, with respect to R-D control, video encoding is classified into variable bit-rate (VBR) encoding and constant bit-rate (CBR) encoding. VBR encoding allocates a different bit-rate to each frame, while CBR encoding allocates a uniform bit-rate to all frames. Due to storage capacity or channel bandwidth constraints, most conventional techniques employ CBR encoding. However, despite the differing screen characteristics, uniform bit-rate allocation for all frames often causes severe distortion variation. Such increased inter-frame distortion variation causes flicker and is undesirable because it degrades subjective video quality as perceived by the human visual system.

Thus, consistent image quality control methods for reducing inter-frame distortion variance in CBR encoding have been proposed. Consistent image quality control focuses on minimizing the difference between the encoded distortion level and the target distortion level for each frame, i.e., minimizing the distortion variance over all frames. To achieve consistent image quality control, different bit-rates may be assigned to each frame depending on the screen characteristics by selecting an appropriate quantization parameter value for each frame. However, this increases the overall bit-rate and, furthermore, has the disadvantage that buffer overflow frequently causes frame degradation.

To solve the buffer overflow problem, most commercial hardware video encoders attempt to reduce the frame rate according to the degree of buffer fullness. However, reducing the frame rate according to buffer fullness cannot compensate for the frame degradation, which often degrades perceptual video quality.

Constrained Variable Bit-Rate (CVBR) encoding assigns a different bit-rate to each frame as long as the bit-rate constraint is satisfied. In the case of CVBR encoding, the existing consistent quality control scheme determines the optimal quantization parameter for each frame based on the result of a first-pass encoding, and uses the resulting quantization parameters for the second-pass encoding. However, in addition to increased computational complexity and power consumption, the resulting longer encoding delay makes multi-pass encoding unsuitable for real-time video streaming applications.

Thus, in order to overcome the problems of multi-pass encoding, consistent picture quality management based on single-pass encoding has been proposed. Unlike multi-pass encoding, single-pass encoding requires accurate prediction of encoding distortion because it cannot determine the quantization parameter of the current frame using the encoding result of the current frame.

Since the influence of screen complexity on the R-D performance is significant, the screen descriptor indicating the screen complexity is a very important factor in encoding distortion prediction. Conventional techniques have proposed several screen descriptors for various applications such as perceptual coding, image/video quality evaluation, and object recognition, as well as consistent image quality management. However, conventional screen descriptors have the drawback that they are not suitable for real-time video streaming applications due to their high computational complexity. Furthermore, when a sudden screen change occurs, inaccurate distortion prediction and propagation of the prediction error result in large distortion variation.

Therefore, it is necessary to provide a single-pass consistent image quality control technique that uses a lightweight screen descriptor, an accurate distortion prediction model, and a powerful screen change detection method.

One embodiment of the present invention proposes a content-aware video encoding method, a controller, and a system based on a single-pass consistent picture quality control using a lightweight screen descriptor, an accurate distortion prediction model, and a strong screen change detection method.

In particular, one embodiment proposes a screen descriptor that can be obtained without additional cost, calculated from DCT (Discrete Cosine Transform) coefficients so as to quantitatively express the screen complexity of a frame.

In addition, one embodiment proposes a distortion prediction model considering screen complexity, by defining a relationship between the screen descriptor and intra-frame distortion based on a distortion-quantization (D-Q) relation derived from a Cauchy distribution.

In addition, one embodiment proposes a screen change detection method that is robust against intensity changes and improves accuracy by using the ratio of prediction modes of macroblocks (MBs) within one frame.

According to an exemplary embodiment, a content-aware video encoding method based on single-pass consistent image quality control includes: performing screen change detection on a current frame of a group of pictures (the group of pictures includes a plurality of frames); determining an optimal frame type of the current frame based on a result of performing the screen change detection; setting an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; obtaining model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter using a pre-built model parameter lookup table; calculating a predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters, using a distortion prediction model constructed beforehand for the current frame; and obtaining an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame.

According to one aspect, the screen descriptor of the previous frame may be calculated based on the horizontal slope AC coefficient and the vertical slope AC coefficient among the DCT (Discrete Cosine Transform) coefficients in the previous frame, so as to quantitatively express the screen complexity of the previous frame.

According to another aspect, the distortion prediction model can be constructed in advance to define a relationship between the screen descriptor and the predicted encoding distortion using the model parameters.

According to another aspect, performing the scene change detection on the current frame may include performing the scene change detection on the current frame based on a ratio between the number of macroblocks encoded in the intra mode in the previous frame, received from the video encoder, and the number of macroblocks encoded in the skip mode in the previous frame.

According to another aspect, performing the scene change detection on the current frame may include: recognizing that a scene change occurs in the current frame if the ratio between the number of macroblocks encoded in the intra mode in the previous frame and the number of macroblocks encoded in the skip mode in the previous frame is greater than or equal to a preset threshold; or recognizing that a scene change does not occur in the current frame if the ratio between the number of macroblocks encoded in the intra mode in the previous frame and the number of macroblocks encoded in the skip mode in the previous frame is smaller than the threshold. Determining the optimal frame type of the current frame may include: determining the optimal frame type of the current frame as an I frame if it is recognized that a scene change has occurred in the current frame; or determining the optimal frame type of the current frame as a P frame if it is recognized that a scene change has not occurred in the current frame.
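The threshold test and frame-type decision described above can be sketched as follows. The function and variable names, the numeric threshold used in the example, and the guard against a zero skip count are illustrative assumptions, not the patent's exact procedure.

```python
# Hypothetical sketch of the intra/skip-ratio scene-change test and the
# resulting frame-type decision. Names and the zero-skip guard are
# illustrative assumptions.

I_FRAME, P_FRAME = 1, 0  # the description expresses I frames as 1, P frames as 0

def decide_frame_type(n_intra: int, n_skip: int, threshold: float) -> int:
    """Decide the current frame's type from the previous frame's statistics.

    n_intra -- macroblocks of the previous frame encoded in intra mode
    n_skip  -- macroblocks of the previous frame encoded in skip mode
    """
    ratio = n_intra / max(n_skip, 1)  # guard against a zero skip count
    # A ratio at or above the threshold is taken as a scene change -> I frame.
    return I_FRAME if ratio >= threshold else P_FRAME
```

For instance, a previous frame with many intra-coded and few skip-coded macroblocks yields a large ratio and forces an I frame for the current frame.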

According to another aspect of the present invention, performing the screen change detection on the current frame includes: determining whether the current frame is the first frame of the group of pictures; and performing the screen change detection on the current frame if, as a result of the determination, the current frame is not the first frame of the group of pictures.

According to another aspect, the model parameter lookup table may be pre-built to define a relationship between the optimal frame types, quantization parameters, and model parameters in sample frames.

According to another aspect, obtaining the optimal quantization parameter in the current frame may comprise: increasing or decreasing the initial quantization parameter such that the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized; and obtaining the increased or decreased initial quantization parameter as the optimal quantization parameter in the current frame.

According to another aspect, the steps of obtaining the model parameters in the current frame, calculating the predicted encoding distortion of the current frame, and obtaining the optimal quantization parameter in the current frame may be repeatedly performed at least once until the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized. If the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is not minimized, the increased or decreased initial quantization parameter may be used as the initial quantization parameter in the next iteration of obtaining the model parameters in the current frame.
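The iterative loop described above — fetch model parameters for the current QP, predict the distortion, adjust the QP, and repeat — can be sketched as below. The lookup-table shape, the predictor signature, the QP range, and the termination tolerance are assumptions for illustration, not the patent's definitions.

```python
# Minimal sketch of the iterative QP search, assuming a generic distortion
# predictor `predict_distortion(qp, frame_type, params)` and a lookup table
# keyed by (frame_type, qp). Both are illustrative stand-ins.

def find_optimal_qp(qp_init, frame_type, lut, predict_distortion,
                    target_distortion, qp_min=0, qp_max=51, tol=1e-3):
    qp, best_qp = qp_init, qp_init
    best_diff = float("inf")
    for _ in range(qp_max - qp_min + 1):          # bounded iteration
        params = lut[(frame_type, qp)]            # model parameters for (F, QP)
        diff = predict_distortion(qp, frame_type, params) - target_distortion
        if abs(diff) < best_diff:
            best_diff, best_qp = abs(diff), qp
        if best_diff <= tol:
            break
        # Increase QP if the predicted distortion is below target, else
        # decrease it, and reuse the adjusted QP for the next lookup.
        step = 1 if diff < 0 else -1
        if not qp_min <= qp + step <= qp_max:
            break
        qp += step
    return best_qp
```

With a predictor in which distortion grows monotonically with QP, the loop walks the QP toward the value whose predicted distortion is closest to the target.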

According to another aspect, the content-aware video encoding method based on single-pass consistent picture quality control further comprises: calculating a target encoding bit-rate in the current frame based on a constraint on the average number of encoded bits per frame; and calculating a target encoding distortion of the current frame based on the target encoding bit-rate in the current frame, wherein obtaining the optimal quantization parameter in the current frame comprises: estimating a predicted encoding bit-rate in the current frame based on the obtained optimal quantization parameter; and increasing the optimal quantization parameter until the predicted encoding bit-rate in the current frame is less than or equal to the target encoding bit-rate in the current frame.
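The CVBR post-check described above — raise the distortion-driven QP until the predicted bit-rate fits the per-frame target — can be sketched as follows; `predict_bitrate` is an assumed stand-in for the patent's bit-rate predictor, and the QP ceiling is illustrative.

```python
# Hedged sketch of the CVBR bit-rate check: after the distortion-driven QP
# is found, increase it until the predicted bit-rate meets the target.
# `predict_bitrate(qp, frame_type)` is an assumed stand-in.

def enforce_bitrate(qp, frame_type, predict_bitrate, target_bits, qp_max=51):
    while qp < qp_max and predict_bitrate(qp, frame_type) > target_bits:
        qp += 1  # a larger QP coarsens quantization and lowers the bit-rate
    return qp
```

If the frame already fits the target, the QP is returned unchanged.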

According to another aspect, the content-aware video encoding method based on single-pass consistent picture quality control comprises encoding the current frame based on the optimal frame type of the current frame and the optimal quantization parameter in the current frame.

According to another aspect, the step of encoding the current frame may further comprise: calculating an update parameter using the encoding distortion of the current frame, obtained as a result of encoding the current frame, and the predicted encoding distortion of the current frame; and using the update parameter in calculating the predicted encoding distortion for the next frame of the current frame.
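The patent derives an update parameter from the actual and predicted encoding distortion of the current frame and feeds it into the prediction for the next frame. The multiplicative correction below is an illustrative assumption about one simple form such feedback can take; the patent does not specify this exact formula in the excerpt.

```python
# Assumed multiplicative-correction sketch of the "update parameter":
# the ratio of measured to predicted distortion scales the next prediction.

def update_factor(actual_distortion: float, predicted_distortion: float) -> float:
    if predicted_distortion <= 0.0:
        return 1.0  # no usable prediction; leave the model unchanged
    return actual_distortion / predicted_distortion

def corrected_prediction(raw_prediction: float, factor: float) -> float:
    # Apply the factor computed on frame i to the raw prediction for frame i+1.
    return raw_prediction * factor
```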

According to one embodiment, there is provided a computer program stored on a medium for executing, in combination with a computer implementing an electronic device, a content-aware video encoding method based on single-pass consistent quality control, the method comprising: performing screen change detection on a current frame of a group of pictures including a plurality of frames; determining an optimal frame type of the current frame based on a result of performing the screen change detection; setting an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; obtaining model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter using a pre-built model parameter lookup table; calculating a predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters, using a distortion prediction model constructed beforehand for the current frame; and obtaining an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame.

According to one embodiment, a content-aware video encoding controller based on single-pass consistent picture quality control includes: a screen change detection unit that performs screen change detection on a current frame of a group of pictures (the group of pictures includes a plurality of frames) and determines an optimal frame type of the current frame based on a result of performing the screen change detection; a distortion prediction/quantization parameter setting unit that sets an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; and a model parameter obtaining unit that obtains model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter using a pre-built model parameter lookup table, wherein the distortion prediction/quantization parameter setting unit calculates the predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters, using a distortion prediction model constructed beforehand for the current frame, and obtains an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame.

According to one embodiment, a content-aware video encoding system based on single-pass consistent quality control comprises: a video encoder; and a content-aware video encoding controller for controlling the video encoder, wherein the content-aware video encoding controller includes: a screen change detection unit that performs screen change detection on a current frame of a group of pictures (the group of pictures includes a plurality of frames) and determines an optimal frame type of the current frame based on a result of performing the screen change detection; a distortion prediction/quantization parameter setting unit that sets an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; and a model parameter obtaining unit that obtains model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter using a pre-built model parameter lookup table, wherein the distortion prediction/quantization parameter setting unit calculates the predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters, using a distortion prediction model constructed beforehand for the current frame, and obtains an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame, and wherein the video encoder encodes the current frame based on the optimal frame type of the current frame and the optimal quantization parameter in the current frame.

One embodiment can propose a content-aware video encoding method, a controller and a system based on a single-pass consistent picture quality control using a lightweight screen descriptor, an accurate distortion prediction model, and a powerful screen change detection method.

In particular, one embodiment may propose a screen descriptor that can be obtained at no additional cost by computing the screen descriptor based on DCT coefficients to quantitatively represent intra-frame screen complexity.

In addition, one embodiment can propose a distortion prediction model considering screen complexity, by defining a relationship between the screen descriptor and intra-frame distortion based on a distortion-quantization relation derived from a Cauchy distribution.

In addition, one embodiment can propose a screen change detection method that is robust to intensity changes and improves accuracy by using the ratio of prediction modes of the macroblocks in one frame.

Therefore, one embodiment can propose a technique for reducing the distortion variation while satisfying the bit-rate constraint through the image quality control method applicable to the VBR and CVBR.

FIG. 1 is a diagram illustrating the correlation between the screen descriptor and the Sobel operator value according to an exemplary embodiment of the present invention.
FIG. 2 is a graph showing the standard deviation of the screen descriptor and the Sobel operator value in a frame for 15 video sequences according to an exemplary embodiment.
FIGS. 3 to 5 are graphs showing, for five video sequences according to an embodiment, the model parameters of each of the I frame and the P frame for each quantization parameter, together with the screen descriptor.
FIG. 6 is a diagram illustrating the model parameters of the distortion prediction model according to an embodiment.
FIGS. 7 and 8 are diagrams showing a comparison between the distortion estimated by the distortion prediction model according to an embodiment and the actually measured distortion.
FIG. 9 is a diagram for explaining the ratio of prediction modes of intra-frame macroblocks used in screen change detection according to an embodiment.
FIG. 10 is a diagram illustrating the type of a frame after a screen change in single-pass encoding according to an exemplary embodiment.
FIG. 11 is a block diagram illustrating a content-aware video encoding system in accordance with one embodiment.
FIG. 12 is a diagram for explaining a process of calculating the predicted encoding distortion during the content-aware video encoding process according to an embodiment.
FIG. 13 is a flowchart illustrating a content-aware video encoding method according to an embodiment.
FIG. 14 is a diagram illustrating a content-aware quantization parameter determination algorithm according to an embodiment.
FIG. 15 is a flowchart showing a specific example in which the content-aware video encoding method shown in FIG. 13 is performed in VBR encoding.
FIG. 16 is a flowchart showing a specific example in which the content-aware video encoding method shown in FIG. 13 is performed in CVBR encoding.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not restricted or limited by these embodiments. In addition, the same reference numerals shown in the drawings denote the same members.

Also, the terminologies used herein are terms selected to properly describe preferred embodiments of the present invention, and their meaning may vary depending on the intention of a user or operator, or the custom in the field to which the present invention belongs. Therefore, the definitions of these terms should be based on the contents throughout this specification.

The content-aware video encoding method in accordance with an embodiment minimizes variation in the encoding distortion while adhering, under the given capacity of the encoding buffer, to a given average encoded bit-rate \(R_{avg}\) (in bits per frame) and a given target distortion \(D_T\).

In this case, two criteria can be considered to represent the distortion variation: minimizing the average distortion (MINAVE) and minimizing the variance of distortion (MINVAR). Since MINVAR is a more suitable and intuitive criterion than MINAVE in terms of consistent video quality, the content-aware video encoding method according to an exemplary embodiment formulates the problem, in the case of VBR encoding, as searching, based on MINVAR, for the optimal quantization parameter \(Q_i^*\) as well as the optimal frame type \(F_i^*\) of each frame.

<Formula 1>

\[
\{Q^{*}, F^{*}\} = \underset{Q,\,F}{\arg\min}\; \sigma_{D}^{2}(Q, F)
\]

<Formula 2>

\[
\sigma_{D}^{2}(Q, F) = \frac{1}{N} \sum_{i=1}^{N} \bigl( D_{i}(Q_{i}, F_{i}) - D_{T} \bigr)^{2}
\]

In Equation 2, \(N\) represents the number of frames, \(D_T\) represents the given target distortion, \(D_i(Q_i, F_i)\) represents the encoding distortion of the i-th frame encoded with quantization parameter \(Q_i\) and frame type \(F_i\), and \(\sigma_D^2\) represents the variance of the encoding distortion around the target.

In Equation 1, \(Q^* = \{Q_1^*, \ldots, Q_N^*\}\) is the set of optimal quantization parameters for all frames. Likewise, \(F^* = \{F_1^*, \ldots, F_N^*\}\) is the set of optimal frame types for all frames. Below, \(F_i\) is expressed as 1 when the frame is an I frame and 0 when it is a P frame.
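As a numerical illustration of the MINVAR objective of Equation 2 above, the following sketch computes the mean squared deviation of per-frame encoding distortions from the target; the distortion values themselves are made up for the example.

```python
# Numerical illustration of the MINVAR objective: the mean squared
# deviation of per-frame encoding distortions from the target distortion.
# The per-frame distortion values are invented for the example.

def minvar_objective(distortions, target):
    n = len(distortions)
    return sum((d - target) ** 2 for d in distortions) / n

uniform   = [30.0, 30.0, 30.0, 30.0]   # consistent quality
fluctuant = [20.0, 40.0, 25.0, 35.0]   # same mean, visible flicker
assert minvar_objective(uniform, 30.0) < minvar_objective(fluctuant, 30.0)
```

Two sequences with the same average distortion can thus differ sharply under MINVAR, which is exactly the flicker that consistent quality control suppresses.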

The content-aware video encoding method according to an embodiment adds, in the case of CVBR encoding, an encoding bit-rate constraint such as Equation 3.

<Formula 3>

\[
\frac{1}{N} \sum_{i=1}^{N} R_{i}(Q_{i}, F_{i}) \le R_{avg}
\]

In Equation 3, \(R_i(Q_i, F_i)\) represents the encoding bit-rate of the i-th frame, and \(W\) and \(H\) represent the frame width and the frame height in pixels, respectively.

By solving Equations 2 and 3, the optimal quantization parameter \(Q_i^*\) and the optimal frame type \(F_i^*\) that minimize the variation of the encoding distortion of the i-th frame can be obtained. However, finding the exact solution of Equation 1 by solving Equations 2 and 3 requires searching every combination of quantization parameter and frame type over all frames, which entails an enormous computational overhead.

Thus, the content-aware video encoding method according to one embodiment simplifies the solution by localizing the global problem of Equations 2 and 3 into per-frame local problems, searching for the optimal quantization parameter \(Q_i^*\) and the optimal frame type \(F_i^*\) of the i-th frame as shown in Equation 4.

<Formula 4>

\[
\{Q_{i}^{*}, F_{i}^{*}\} = \underset{Q_{i},\,F_{i}}{\arg\min}\; \bigl| \hat{D}_{i}(Q_{i}, F_{i}) - D_{T,i} \bigr|
\]

In Equation 4, \(D_{T,i}\) represents the target encoding distortion of the i-th frame, \(\hat{D}_i(Q_i, F_i)\) represents the predicted encoding distortion of the i-th frame, and \(|\hat{D}_i - D_{T,i}|\) represents the difference between the predicted encoding distortion and the target encoding distortion of the i-th frame.

Further, in the CVBR encoding, the bit-rate constraint is added as shown in Equation 5 below.

<Formula 5>

\[
\hat{R}_{i}(Q_{i}^{*}, F_{i}^{*}) \le R_{T,i}
\]

In Equation 5, \(R_{T,i}\) represents the target encoding bit-rate of the i-th frame, and \(\hat{R}_i(Q_i^*, F_i^*)\) represents the predicted encoding bit-rate of the i-th frame.

The terms used in Equations 1 to 5, together with the terms used in the following expressions, are summarized in Table 1.

Terms — Explanation
\(D_T\) — Given target distortion
\(R_{avg}\) — Given average encoded bit-rate
\(Q^*\) — Set of optimal quantization parameters
\(F^*\) — Set of optimal frame types
\(D_i\) — Encoding distortion of the i-th frame
\(R_i\) — Encoding bit-rate of the i-th frame
\(Q_i^*\) — Optimal quantization parameter of the i-th frame
\(F_i^*\) — Optimal frame type of the i-th frame
\(\hat{D}_i\) — Predicted encoding distortion of the i-th frame
\(\hat{R}_i\) — Predicted encoding bit-rate of the i-th frame
\(N\) — Number of frames
\(W\) — Frame width in pixels
\(H\) — Frame height in pixels
\(D_{T,i}\) — Target encoding distortion of the i-th frame
\(R_{T,i}\) — Target encoding bit-rate of the i-th frame
\(Q_i\) — Quantization parameter of the i-th frame
\(F_i\) — Frame type of the i-th frame
\(N_{MB}\) — Number of macroblocks in the frame
\(SD_{i,j}\) — Screen descriptor of the j-th macroblock of the i-th frame
\(S_h\) — Horizontal Sobel operator of an N×N block
\(S_v\) — Vertical Sobel operator of an N×N block
\(p(k, l)\) — Pixel value at position (k, l) of the frame
\(|S|\) — Magnitude of the Sobel gradient of an N×N block
\(C(u, v)\) — DCT coefficient at position (u, v) of an N×N block
\(SD\) — Screen descriptor of an N×N block
\(D_{i,j}\) — Encoding distortion of the j-th macroblock of the i-th frame
\(q_i\) — Quantization step size of the i-th frame
\(N_{intra}\) — Number of macroblocks encoded in the intra mode in a frame
\(N_{skip}\) — Number of macroblocks encoded in the skip mode in a frame
\(T_{SC}\) — Threshold for screen change detection

FIG. 1 is a diagram illustrating the correlation between the screen descriptor and the Sobel operator value according to an exemplary embodiment, and FIG. 2 is a diagram illustrating the screen descriptor and the Sobel operator value per frame for 15 video sequences according to an exemplary embodiment.

Referring to FIGS. 1 and 2, the screen complexity is the degree of variation among neighboring pixel values in a frame and can be expressed as a slope magnitude through a slope operator. In general, the Sobel operator, which can accurately measure the slope information of an image among the various operators, is widely used for measuring screen complexity. The horizontal and vertical Sobel operators of an N * N block are defined as Equations 6 and 7.

<Formula 6>

Figure 112017020511876-pat00068

Equation (7)

Figure 112017020511876-pat00069

In Equations 6 and 7,

Figure 112017020511876-pat00070
Represents a horizontal Sobel operator,
Figure 112017020511876-pat00071
Represents a vertical Sobel operator,
Figure 112017020511876-pat00072
Represents the pixel value at the position (k, l) of the frame. From Equations 6 and 7, the magnitude of the Sobel slope of the N * N block
Figure 112017020511876-pat00073
Is defined as Eq. (8).

<Formula 8>

Figure 112017020511876-pat00074
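As an illustration of Equations 6 to 8, the sketch below applies the standard 3 * 3 Sobel kernels to an N * N block and sums the resulting slope magnitudes. It is a minimal pure-Python sketch; the patent's exact operator definitions appear only as images, so the kernels and function names here are assumptions.

```python
SOBEL_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal Sobel kernel (assumed)
SOBEL_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical Sobel kernel (assumed)

def sobel_magnitude(block):
    """Sum of Sobel slope magnitudes over the interior pixels of an N*N block."""
    n = len(block)
    total = 0.0
    for k in range(1, n - 1):
        for l in range(1, n - 1):
            # Convolve the 3*3 neighborhood of pixel (k, l) with both kernels.
            gh = sum(SOBEL_H[a][b] * block[k - 1 + a][l - 1 + b]
                     for a in range(3) for b in range(3))
            gv = sum(SOBEL_V[a][b] * block[k - 1 + a][l - 1 + b]
                     for a in range(3) for b in range(3))
            total += (gh * gh + gv * gv) ** 0.5
    return total
```

A flat block yields a magnitude of zero, while any edge inside the block yields a positive value.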

The slope information in the spatial domain can be expressed by transform coefficients in the frequency domain. In particular, among the various transforms, the DCT is often used in image / video encoders. The DCT coefficients consist of a DC coefficient, a horizontal slope AC coefficient, a vertical slope AC coefficient, and other AC coefficients. According to the definition of the N * N DCT, the horizontal slope AC coefficient and the vertical slope AC coefficient are defined as Equations 9 and 10.

Equation (9)

Figure 112017020511876-pat00075

<Formula 10>

Figure 112017020511876-pat00076

In Equations 9 and 10,

Figure 112017020511876-pat00077
Represents the horizontal slope AC coefficient of the N * N block DCT,
Figure 112017020511876-pat00078
Represents the vertical slope AC coefficient of the N * N block DCT. Such
Figure 112017020511876-pat00079
And
Figure 112017020511876-pat00080
Can be obtained from the video encoder at no additional cost.

In the content-aware video encoding method according to an exemplary embodiment, the screen descriptor is defined as Equation 11 based on the horizontal slope AC coefficient and the vertical slope AC coefficient among the DCT coefficients.

<Formula 11>

Figure 112017020511876-pat00081

In Equation 11,

Figure 112017020511876-pat00082
Represents the screen descriptor of an N * N block. To verify how well this screen descriptor
Figure 112017020511876-pat00083
Represents the slope information, the per-frame averages of
Figure 112017020511876-pat00084
And
Figure 112017020511876-pat00085
Were calculated for the five video sequences of Aspen, Pedestrian, Sunflower, Bluesky, and Tractor. Referring to FIG. 1, the screen descriptor proposed by the content-aware video encoding method according to an exemplary embodiment is very similar to the magnitude of the Sobel slope, with a correlation coefficient (degree of similarity)
Figure 112017020511876-pat00086
Of 0.981.
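Equation 11 itself is shown only as an image; the sketch below therefore assumes one plausible form of the descriptor, the magnitude of the two slope AC coefficients C(0, 1) and C(1, 0) of a naive 2-D DCT-II. The function names and the combining rule are illustrative assumptions, not the patent's verbatim definition.

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an N*N block, returning coefficients C[u][v]."""
    n = len(block)
    def c(w):
        return math.sqrt(1.0 / n) if w == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for k in range(n):
                for l in range(n):
                    s += (block[k][l]
                          * math.cos((2 * k + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * l + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

def screen_descriptor(block):
    """Descriptor built from the horizontal (C[0][1]) and vertical (C[1][0])
    slope AC coefficients; combining them by magnitude is an assumption."""
    coeffs = dct2(block)
    return math.hypot(coeffs[0][1], coeffs[1][0])
```

A uniform block gives a descriptor of zero, and a block with a strong horizontal or vertical ramp gives a large value, mirroring the Sobel-slope behavior described above.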

Furthermore, the screen complexity

Figure 112017020511876-pat00087
Is defined as the standard deviation of
Figure 112017020511876-pat00088
. By comparing it with the standard deviation of the screen descriptor, it is possible to know how well the screen descriptor proposed by the content-aware video encoding method according to the embodiment quantitatively expresses the screen complexity. For the 15 video sequences of Bridge-far, Mother, Akiyo, Water-fall, Silent, Soccer, Coastguard, Bridge-close, Tempete, Harbor, Football, Bus, Paris, Flower, and Mobile, the standard deviations of
Figure 112017020511876-pat00089
And
Figure 112017020511876-pat00090
Are shown in FIG. 2. Referring to FIG. 2, the relation between the screen descriptor proposed by the content-aware video encoding method according to an embodiment and the Sobel operator value is quasi-linear.

As described above, since the screen descriptor proposed by the content-aware video encoding method according to the embodiment is calculated using only some of the DCT coefficients (the horizontal slope AC coefficient and the vertical slope AC coefficient), it can be obtained at a low computational cost and can be utilized as an indicator that sufficiently reflects the screen complexity.

Accordingly, the screen descriptor according to one embodiment may be used independently in classifying video sequences by screen complexity, as well as in the distortion prediction model described later. For example, based on the screen descriptor, the fifteen video sequences of Bridge-far, Mother, Akiyo, Water-fall, Silent, Soccer, Coastguard, Bridge-close, Tempete, Harbor, Football, Bus, Paris, Flower, and Mobile can be classified into the three screen-complexity groups shown in Table 2 according to the conditions of Equation (12).

<Formula 12>

Figure 112017020511876-pat00091

Figure 112017020511876-pat00092

Figure 112017020511876-pat00093

In Equation 12,

Figure 112017020511876-pat00094
Represents a threshold value for classifying low screen complexity,
Figure 112017020511876-pat00095
Represents a threshold value for classifying the screen complexity as high. For example,
Figure 112017020511876-pat00096
Is set to 100, and
Figure 112017020511876-pat00097
Is set to 150. Also,
Figure 112017020511876-pat00098
Denotes the average of
Figure 112017020511876-pat00099
In the frame. Table 2 below shows the 15 video sequences of Bridge-far, Mother, Akiyo, Water-fall, Silent, Soccer, Coastguard, Bridge-close, Tempete, Harbor, Football, Bus, Paris, Flower, and Mobile classified according to Equation (12).

Group | Video sequences
Low | Bridge-far, Mother, Akiyo, Water-fall, Silent, Soccer, Coastguard, Bridge-close
Middle | Tempete, Harbor, Football, Bus, Paris
High | Flower, Mobile
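The three-way classification of Equation 12 and Table 2 can be sketched as follows, using the example thresholds of 100 and 150 mentioned in the text; the function name, constant names, and boundary handling are assumptions.

```python
# Example thresholds from the text (low / high screen-complexity thresholds).
T_LOW, T_HIGH = 100, 150

def classify_complexity(mean_descriptor):
    """Map a frame's average screen descriptor to a complexity group.
    The strictness of the boundary comparisons is an assumption."""
    if mean_descriptor < T_LOW:
        return "low"
    if mean_descriptor < T_HIGH:
        return "middle"
    return "high"
```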

Thus, the screen descriptor of the j-th macroblock of the i-th frame

Figure 112017020511876-pat00100
Is defined as Eq. (13).

Equation (13)

Figure 112017020511876-pat00101

In Equation 13, n and m denote the position indices of the 4 * 4 DCT blocks within the j-th macroblock. In video encoding, the residual macroblock, which is the difference between the current macroblock and the macroblock predicted by inter / intra prediction, can be transformed via the DCT. Therefore,

Figure 112017020511876-pat00102
For the calculation, the DCT coefficients
Figure 112017020511876-pat00103
Of the residual macroblock can be used.

Such

Figure 112017020511876-pat00104
May be used in constructing, in advance, the distortion prediction model used in the content-aware video encoding method according to an embodiment. A detailed description thereof will be given below.

FIGS. 3 to 5 are diagrams illustrating, for five video sequences according to one embodiment, the relation between the screen descriptor and the model parameter

Figure 112017020511876-pat00105
In the I frame and the P frame for each quantization parameter, FIG. 6 is a diagram illustrating the model parameters
Figure 112017020511876-pat00106
And
Figure 112017020511876-pat00107
According to various quantization parameters in each of the I frame and the P frame according to an embodiment, and FIGS. 7 to 8 are diagrams showing a comparison between the distortion estimated by the distortion prediction model according to an embodiment and the actually measured distortion.

Referring to FIGS. 3 to 8, the DQ relation based on the Cauchy distribution is defined as Equation 14.

<Formula 14>

Figure 112017020511876-pat00108

In Equation 14,

Figure 112017020511876-pat00109
And
Figure 112017020511876-pat00110
Indicate content-dependent model parameters,
Figure 112017020511876-pat00111
Represents the encoding distortion in terms of the Mean Square Error (MSE) for the j-th macroblock of the i-th frame,
Figure 112017020511876-pat00112
Is the quantization step size of the i-th frame and is determined by the quantization parameter
Figure 112017020511876-pat00113
. In H.264 / AVC, the relation between
Figure 112017020511876-pat00114
And
Figure 112017020511876-pat00115
Is expressed as shown in Equation (15).

<Formula 15>

Figure 112017020511876-pat00116

In Equation 15,

Figure 112017020511876-pat00117
And
Figure 112017020511876-pat00118
Represent constants. From Equations 14 and 15, the relation between
Figure 112017020511876-pat00119
And
Figure 112017020511876-pat00120
Is defined as Eq. (16).

<Formula 16>

Figure 112017020511876-pat00121

In Equation 16,

Figure 112017020511876-pat00122
And
Figure 112017020511876-pat00123
Represents a model parameter. Equation 16, which indicates the encoding distortion of the j-th macroblock of the i-th frame, maintains high fitting accuracy even when
Figure 112017020511876-pat00124
Is fixed to a constant value for each frame type. For example, when, for the five video sequences of Aspen, Pedestrian, Sunflower, Bluesky, and Tractor,
Figure 112017020511876-pat00125
Is set to 0.139 for the I frame and 0.116 for the P frame, the fitting accuracy of Equation 16 in terms of
Figure 112017020511876-pat00126
Is as shown in Table 3.

Video sequence | I frame | P frame
Aspen | 0.999 | 0.998
Bluesky | 0.996 | 0.994
Pedestrian | 0.989 | 0.985
Sunflower | 0.985 | 0.988
Tractor | 0.998 | 0.997
Average | 0.993 | 0.992
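As an illustration of the DQ relation, the sketch below assumes the common power-law form D = alpha * Qstep ** beta for the Cauchy-distribution-based model of Equation 14, and the widely used H.264 / AVC approximation Qstep = 2 ** ((QP - 4) / 6) for Equation 15. Both forms are assumptions, since the patent's formulas appear only as images.

```python
def qstep(qp):
    """H.264/AVC quantization step size: doubles every 6 QP, with
    Qstep(4) = 1.0 (a common approximation of the Equation 15 relation)."""
    return 2.0 ** ((qp - 4) / 6.0)

def cauchy_distortion(qp, alpha, beta):
    """Cauchy-density-based D-Q model, assumed here as D = alpha * Qstep**beta,
    with content-dependent parameters alpha and beta."""
    return alpha * qstep(qp) ** beta
```

With beta > 0 the predicted distortion increases monotonically with the quantization parameter, which matches the qualitative behavior the surrounding text describes.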

Especially,

Figure 112017020511876-pat00127
Is closely related to the screen complexity; since
Figure 112017020511876-pat00128
And
Figure 112017020511876-pat00129
Are constants,
Figure 112017020511876-pat00130
Has a higher correlation with the screen complexity. For the five video sequences in Table 3, the relation between the model parameter
Figure 112017020511876-pat00131
And the screen descriptor
Figure 112017020511876-pat00132
In the I and P frames with quantization parameters of 24, 32, and 40 is shown in FIGS. 3 to 5. Therefore, the relation between
Figure 112017020511876-pat00133
And the screen descriptor
Figure 112017020511876-pat00134
Is defined as Equation (17).

<Formula 17>

Figure 112017020511876-pat00135

In Equation 17,

Figure 112017020511876-pat00136
And
Figure 112017020511876-pat00137
Represent model parameters that depend on the frame type and the quantization parameter. Such
Figure 112017020511876-pat00138
And
Figure 112017020511876-pat00139
Can be obtained by curve fitting Equation 17 for each quantization parameter over the five video sequences of Table 3. At this time, as shown in FIG. 6, for each quantization parameter, the highest average fitting accuracy
Figure 112017020511876-pat00140
Is achieved by a particular pair of
Figure 112017020511876-pat00141
And
Figure 112017020511876-pat00142
, Which can be adopted.

From Equations 16 and 17, the relation between

Figure 112017020511876-pat00143
And
Figure 112017020511876-pat00144
Is derived as shown in Eq. (18).

<Formula 18>

Figure 112017020511876-pat00145

In Equation 18,

Figure 112017020511876-pat00146
Represents the model parameter. From Equation 18, the encoding distortion of the i-th frame
Figure 112017020511876-pat00147
Over the encoding distortions
Figure 112017020511876-pat00148
Of all the macroblocks in the frame is defined as Equation 19.

<Formula 19>

Figure 112017020511876-pat00149

In Equation 19,

Figure 112017020511876-pat00150
Represents the number of macroblocks in a frame.

As described above, the content-aware video encoding method according to an embodiment can construct, from the encoding distortion of the j-th macroblock of the i-th frame defined using the screen descriptor as in Equation 18, a content-characteristic-based distortion model that defines the frame distortion

Figure 112017020511876-pat00151
. Since this distortion model is defined using the screen descriptor indicating the screen complexity, the intra-frame distortion can be computed in consideration of not only the quantization parameter but also the screen complexity.

Accordingly, the content-aware video encoding method according to one embodiment can construct, from this distortion model, a distortion prediction model for the predicted encoding distortion

Figure 112017020511876-pat00152
. A detailed description thereof will be given below.

The fitting accuracy of the distortion prediction model constructed by Equations 18 to 19 for the ten video sequences of Akiyo, Bus, Crew, Football, Aspen, Station2, RushFieldCuts, OldTownCross, Sunflower, and DucksTakeOff is shown in Table 4. Here, Model 1 and Model 2 are models that use the SATD of the DCT and the SATD of the HT, respectively, as the screen descriptor and that do not consider the quantization parameter in the relation between the screen descriptor and the model parameters of the DQ model. As shown in Table 4, the distortion prediction model according to one embodiment has the highest average fitting accuracy between the estimated distortion and the actually measured distortion (e.g., an average

Figure 112017020511876-pat00153
Of 0.991).

Resolution | Video sequence | Model 1 | Model 2 | Distortion prediction model
CIF | Akiyo | 0.937 | 0.958 | 0.993
CIF | Bus | 0.950 | 0.916 | 0.996
CIF | Crew | 0.940 | 0.992 | 0.984
CIF | Football | 0.940 | 0.905 | 0.993
HD | Aspen | 0.920 | 0.959 | 0.997
HD | Station2 | 0.870 | 0.982 | 0.997
HD | RushFieldCuts | 0.982 | 0.912 | 0.992
HD | OldTownCross | 0.967 | 0.945 | 0.971
HD | Sunflower | 0.784 | 0.956 | 0.984
HD | DucksTakeOff | 0.991 | 0.953 | 0.998
 | Average | 0.930 | 0.948 | 0.991

Referring to FIGS. 7 to 8, which compare the distortion estimated by the distortion prediction model according to an embodiment with the actually measured distortion based on Table 4, the distortion prediction model according to an embodiment shows a higher average fitting accuracy than the conventional Models 1 and 2.

Hereinafter, the distortion prediction model according to an embodiment is described as being used to obtain the optimal quantization parameter, but it is not limited thereto and can be applied to various techniques such as rate-distortion-optimized macroblock mode decision. A detailed description thereof is omitted because it goes beyond the technical idea of the present invention.

FIG. 9 is a diagram for explaining the ratio of the prediction modes of the macroblocks in a frame, which is used in screen change detection according to an embodiment.

Due to the nature of video encoding using reference frames, a sudden screen change causes significant degradation of R-D performance. Therefore, in order to prevent degradation from prediction error due to the screen change, the first frame after the screen change must be encoded as an I frame. Detecting screen changes during video encoding is thus a necessary process for preventing serious deterioration of the R-D performance.

Accordingly, the content-aware video encoding method according to an exemplary embodiment of the present invention uses the macroblock mode counts such as

Figure 112017020511876-pat00154
To efficiently detect a screen change at a low computational cost.

When an abrupt screen change occurs, the pixels of a macroblock in the current frame have little correlation with the corresponding pixels of the corresponding macroblock in the previous frame. It therefore becomes more efficient to encode the macroblocks in intra mode than in inter mode or skip mode, resulting in a high proportion of intra-mode macroblocks in the frame.

Referring to FIG. 9 (a), for the first frame after a screen change captured under high-intensity light, the ratio of the macroblocks encoded in intra mode among those encoded in one of the skip, inter, and intra modes, i.e., the intra-encoded MB ratio (IMR,

Figure 112017020511876-pat00155
), Is more than 98%. However, referring to FIG. 9 (b), the IMR for the first frame after a screen change captured under low-intensity light is reduced by the increased proportion of skip-mode macroblocks; as the quantization parameter increases, more macroblocks are encoded in skip mode.

Accordingly, the content-aware video encoding method according to an exemplary embodiment may use the modified intra-encoded MB ratio (MIMR), as shown in Equation 20, to remove the effect of the quantization parameter and the increased skip mode on the IMR.

<Formula 20>

Figure 112017020511876-pat00156

In Equation 20, the MIMR represents the ratio between the number of macroblocks encoded in intra mode and the number of macroblocks encoded in skip mode, and

Figure 112017020511876-pat00157
Represents a threshold value for screen change detection. The MIMR can be used as a robust screen change detection indicator because it is independent of the quantization parameter for both high-intensity and low-intensity video sequences. At this time,
Figure 112017020511876-pat00158
Can be adaptively determined according to the MIMR analyzed for high-intensity and low-intensity video sequences.

For example, as shown in Table 5, for 512 concatenations of high-intensity and low-intensity video sequences, the minimum MIMR of the frames with a screen change is 0.955 and 0.728, respectively, while the maximum MIMR of the frames without a screen change is 0.476 and 0.655, respectively. Thus, for accurate detection,

Figure 112017020511876-pat00159
Should be lower than the minimum MIMR with a screen change and higher than the maximum MIMR without a screen change, and may be set to 0.7.

Concatenated video | Screen-change frames | Average MIMR | Minimum | Maximum
High intensity | w/ screen change | 0.983 | 0.955 | 0.998
High intensity | w/o screen change | 0.291 | 0.077 | 0.476
Low intensity | w/ screen change | 0.867 | 0.728 | 0.949
Low intensity | w/o screen change | 0.325 | 0.093 | 0.655

As described above, the MIMR, which represents the ratio between the number of macroblocks encoded in intra mode and the number of macroblocks encoded in skip mode, is a suitable indicator for detecting screen changes regardless of the intensity of the light. Thus, the content-aware video encoding method according to an exemplary embodiment performs screen change detection based on the MIMR.
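Equation 20 is likewise shown only as an image. One reading consistent with removing the effect of the increased skip mode from the IMR is to exclude skip-mode macroblocks from the denominator, as in the sketch below; the exact formula and all names are assumptions.

```python
def imr(n_intra, n_total):
    """Intra-encoded MB ratio: share of intra-mode macroblocks in the frame."""
    return n_intra / n_total

def mimr(n_intra, n_skip, n_total):
    """Modified IMR: skip-mode macroblocks are excluded from the denominator.
    This exact form is an assumption about the image-only Equation 20."""
    return n_intra / (n_total - n_skip)

def scene_change(n_intra, n_skip, n_total, threshold=0.7):
    """Declare a screen change when the MIMR exceeds the threshold
    (0.7 is the example value given in the text)."""
    return mimr(n_intra, n_skip, n_total) > threshold
```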

FIG. 10 is a diagram illustrating the types of the frames after a screen change in single-pass encoding according to an exemplary embodiment.

Referring to FIG. 10, in order to suppress error propagation due to a screen change, the first frame after the screen change needs to be encoded as an I frame. Although the first frame after the screen change has a high proportion of macroblocks encoded in intra mode, the remaining macroblocks encoded in inter or skip mode may still propagate errors caused by the screen change, so the R-D performance may be significantly degraded.

Accordingly, the content-aware video encoding method according to the embodiment can determine the frame type of the frames after a screen change by using the MIMR calculated as in Equation (20). More specifically, as shown in FIG. 10 (a), when the first frame after the screen change is encoded as an I frame, all of its macroblocks are already encoded in intra mode, so the error propagation caused by the screen change is suppressed. In this case, the content-aware video encoding method according to the embodiment does not need to detect the screen change using the MIMR calculated as in Equation 20, and the next frame can be encoded as a P frame.

However, as shown in FIG. 10 (b), if the first frame after the screen change is encoded as a P frame, the next frame (the second frame after the screen change) should be encoded as an I frame in order to suppress the propagation of errors caused by the screen change. Accordingly, in the content-aware video encoding method according to the embodiment, when the first frame after the screen change has been encoded as a P frame, screen change detection is performed using the MIMR calculated as in Equation (20) so that the following frame is encoded as an I frame.

That is, the content-aware video encoding method according to the embodiment compares the MIMR calculated as shown in Equation (20) with the threshold

Figure 112017020511876-pat00160
. If the MIMR is greater than the threshold, it is determined that a screen change occurred, and the encoding frame type of the current frame is determined to be an I frame as shown in FIG. 10 (b); otherwise, it is recognized that no screen change occurred, and the encoding frame type of the current frame can be determined to be a P frame as shown in FIG. 10 (a).
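The frame-type decision of FIG. 10 can be sketched as a simple rule: skip detection when the previous frame was already an I frame, otherwise force an I frame when the MIMR exceeds the threshold. This is a sketch under the stated assumptions, not the patent's verbatim algorithm.

```python
def decide_frame_type(prev_frame_type, mimr_value, threshold=0.7):
    """Single-pass frame-type decision sketch (FIG. 10): if the previous frame
    was an I frame, detection is skipped and a P frame is used; otherwise a
    screen change (MIMR above threshold) forces an I frame."""
    if prev_frame_type == "I":
        return "P"
    return "I" if mimr_value > threshold else "P"
```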

FIG. 11 is a block diagram illustrating a content-aware video encoding system according to an embodiment, FIG. 12 is a diagram for explaining a process of calculating the predicted encoding distortion during the content-aware video encoding process according to an embodiment, FIG. 13 is a flowchart illustrating a content-aware video encoding method according to an exemplary embodiment of the present invention, and FIG. 14 illustrates a content-aware quantization parameter determination algorithm according to an exemplary embodiment.

Referring to FIGS. 11 through 14, the content-aware video encoding system 1100 according to an embodiment includes a video encoder 1110 and a content-aware video encoding controller 1120. Hereinafter, the content-aware video encoding controller 1120 is described as being embodied as a hardware module combined with the video encoder 1110; however, the present invention is not limited thereto, and the controller may also be implemented as a computer program installed in a processor included in the video encoder, or embodied in the form of a computer program stored in a recording medium. The content-aware video encoding controller 1120 includes a screen change detection unit 1121, a distortion prediction / quantization parameter setting unit 1122, and a model parameter obtaining unit 1123, which perform the content-aware video encoding method described above in connection with the video encoder 1110 and control the video encoder 1110.

In single-pass video encoding, the features of the frame currently being encoded cannot be used until the video encoding of that frame is completed. That is, extracting the characteristics of the current frame prior to video encoding necessarily incurs significant computational overhead and is not suitable for real-time video streaming applications.

Accordingly, the content-aware video encoding controller 1120 according to an exemplary embodiment calculates the indicators such as the screen descriptor described with reference to FIGS. 1 to 10, the number of macroblocks encoded in intra mode, and the number of macroblocks encoded in skip mode from the previous frame of the current frame instead of from the current frame itself. This is based on the assumption that the screen characteristics change gradually and thus are similar between the current frame and the previous frame; the content-aware video encoding method is applied on this assumption when no rapid screen change has occurred, while rapid screen changes are handled by the screen change detection described on the basis of Equation 20.

Hereinafter, the current frame is described as an i-th frame, and the previous frame is described as an (i-1) th frame.

The screen change detection unit 1121 performs screen change detection on the current frame of a group of pictures including a plurality of frames (1320), and determines the optimal frame type of the current frame based on the result of the screen change detection (1330).

Specifically, the screen change detection unit 1121 receives the number of macroblocks encoded in intra mode in the previous frame

Figure 112017020511876-pat00161
And the number of macroblocks encoded in skip mode in the previous frame
Figure 112017020511876-pat00162
From the video encoder 1110, calculates the ratio MIMR between the number of macroblocks encoded in intra mode in the previous frame and the number of macroblocks encoded in skip mode in the previous frame as shown in Equation 20, performs screen change detection based on the MIMR (1320), and determines the optimal frame type of the current frame according to the detection result (1330).

For example, if the ratio MIMR between the number of macroblocks encoded in intra mode in the previous frame and the number of macroblocks encoded in skip mode in the previous frame is greater than the threshold

Figure 112017020511876-pat00163
The screen change detection unit 1121 recognizes that a screen change occurred in the current frame and accordingly determines the optimal frame type
Figure 112017020511876-pat00164
As an I frame (1311). In this case, the screen change detection unit 1121 obtains the optimal quantization parameter
Figure 112017020511876-pat00165
(1312) based on the optimal quantization parameter of the previous frame. A detailed description thereof will be given below.

For another example, if the ratio MIMR between the number of macroblocks encoded in intra mode in the previous frame and the number of macroblocks encoded in skip mode in the previous frame is less than or equal to the threshold

Figure 112017020511876-pat00166
The screen change detection unit 1121 recognizes that no screen change occurred in the current frame, and accordingly the optimal frame type
Figure 112017020511876-pat00167
Can be determined as a P frame.

In this case, instead of unconditionally performing the screen change detection on the current frame, the screen change detection unit 1121 may adaptively perform the screen change detection only when the current frame is not the first frame of the image group. That is, the screen change detection unit 1121 determines whether the current frame is the first frame of the image group (1310), and performs the screen change detection on the current frame only when it is determined that the current frame is not the first frame of the image group (1320).

If the current frame is the first frame of the image group, the screen change detection unit 1121 determines the optimal frame type

Figure 112017020511876-pat00168
As an I frame (1311), and determines an optimum quantization parameter
Figure 112017020511876-pat00169
(1312) based on the optimal quantization parameter of the previous frame. Because the distortion of an I frame is generally lower than that of a P frame under the same quantization parameter, in step 1312 the screen change detection unit 1121 may set the optimal quantization parameter
Figure 112017020511876-pat00170
To a value obtained by adding 1 to the optimal quantization parameter
Figure 112017020511876-pat00171
Of the previous frame.

The optimal frame type of the current frame thus determined may be transmitted to the model parameter obtaining unit 1123 and the video encoder 1110, respectively. In addition, if the current frame is the first frame of the image group, or if the ratio MIMR between the number of macroblocks encoded in intra mode in the previous frame and the number of macroblocks encoded in skip mode in the previous frame is greater than the threshold

Figure 112017020511876-pat00172
, The optimal frame type of the current frame determined in step 1311
Figure 112017020511876-pat00173
And the optimal quantization parameter of the current frame obtained in step 1312
Figure 112017020511876-pat00174
May be transmitted to the video encoder 1110.

The distortion prediction / quantization parameter setting unit 1122 sets the initial quantization parameter of the current frame based on the optimal quantization parameter of the previous frame

Figure 112017020511876-pat00175
(1340). For example, when the current frame is the first P frame in the image group, the distortion prediction / quantization parameter setting unit 1122 may set, as the initial quantization parameter, a value obtained by decrementing the optimal quantization parameter
Figure 112017020511876-pat00176
By 1. This is because the distortion of an I frame is lower than that of a P frame at the cost of a generally higher bit-rate, so achieving the target encoding distortion
Figure 112017020511876-pat00177
In the first P frame requires a quantization parameter lower than the quantization parameter of the previous I frame. On the other hand, if the current frame is not the first P frame in the image group, the distortion prediction / quantization parameter setting unit 1122 may set the optimal quantization parameter
Figure 112017020511876-pat00178
Of the previous frame as the initial quantization parameter.
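The initial-quantization-parameter rule of step 1340 can be sketched as follows (function and parameter names are assumptions):

```python
def initial_qp(prev_optimal_qp, is_first_p_after_i):
    """Initial QP choice of step 1340: the first P frame after an I frame
    starts one QP lower (I-frame distortion is lower at the same QP);
    otherwise the previous frame's optimal QP is reused."""
    return prev_optimal_qp - 1 if is_first_p_after_i else prev_optimal_qp
```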

The model parameter obtaining unit 1123 obtains the model parameters of the current frame (1350) by using the model parameter lookup tables 1210 and 1220 constructed in advance, based on the optimal frame type

Figure 112017020511876-pat00179
And the initial quantization parameter.

Here, the model parameter lookup tables 1210 and 1220 store the model parameters

Figure 112017020511876-pat00180
And
Figure 112017020511876-pat00181
In correspondence with each optimal frame type and each quantization parameter. That is, the model parameter lookup tables consist of the first lookup table 1210, in which the model parameter
Figure 112017020511876-pat00182
Is calculated in advance for each quantization parameter and each frame type, and the second lookup table 1220, in which the model parameter used in Equations 18 to 19
Figure 112017020511876-pat00183
Is calculated in advance for each quantization parameter and each frame type.

Accordingly, the model parameter obtaining unit 1123 applies the optimal frame type of the current frame transmitted from the screen change detection unit 1121

Figure 112017020511876-pat00184
And the initial quantization parameter transmitted from the distortion prediction / quantization parameter setting unit 1122 to the model parameter lookup tables 1210 and 1220 as shown in FIG. 12, thereby obtaining
Figure 112017020511876-pat00185
And
Figure 112017020511876-pat00186
, On the basis of which the model parameter used in Equations 18 to 19
Figure 112017020511876-pat00187
Is obtained (1230).

The model parameters thus obtained

Figure 112017020511876-pat00188
And
Figure 112017020511876-pat00189
May be transmitted to the distortion prediction / quantization parameter setting unit 1122.

In response, the distortion prediction / quantization parameter setting unit 1122 uses the distortion prediction model constructed in advance, together with the screen descriptor of the previous frame received from the video encoder 1110

Figure 112017020511876-pat00190
And the model parameters
Figure 112017020511876-pat00191
And
Figure 112017020511876-pat00192
, To obtain the predicted encoding distortion of the current frame
Figure 112017020511876-pat00193
(1360).

Specifically, based on the distortion model expressed by Equations 18 to 19, the distortion prediction / quantization parameter setting unit 1122 can construct in advance, as shown in Equation 21, a distortion prediction model that defines the relation among the model parameters

Figure 112017020511876-pat00194
And
Figure 112017020511876-pat00195
, The screen descriptor
Figure 112017020511876-pat00196
, And the predicted encoding distortion
Figure 112017020511876-pat00197
.

<Formula 21>

Figure 112017020511876-pat00198

In equation 21,

Figure 112017020511876-pat00199
Is an update parameter for compensating the prediction error of the distortion prediction model due to the real-time change of the screen complexity, and is defined as Equation (22).

<Formula 22>

Figure 112017020511876-pat00200

In Equation 22,

Figure 112017020511876-pat00201
Represents an adjustable weight according to the target distortion.

Here, the screen descriptor of the previous frame used by the distortion prediction / quantization parameter setting unit 1122 can be calculated based on the horizontal slope AC coefficient and the vertical slope AC coefficient among the DCT coefficients, as expressed in Equation 11 and Equation 13.

Therefore, the distortion prediction / quantization parameter setting unit 1122 receives the screen descriptor

Figure 112017020511876-pat00202
Calculated from
Figure 112017020511876-pat00203
(1240), obtains the predicted encoding distortion of the current frame
Figure 112017020511876-pat00204
Using the distortion prediction model as shown in Equation (21) (1250), and, based on the predicted encoding distortion of the current frame
Figure 112017020511876-pat00205
, Obtains the optimal quantization parameter of the current frame
Figure 112017020511876-pat00206
(1370).

More specifically, in step 1370, the distortion prediction/quantization parameter setting unit 1122 can obtain the optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion

Figure 112017020511876-pat00207
and the target encoding distortion of the current frame
Figure 112017020511876-pat00208
, namely
Figure 112017020511876-pat00209
. For example, the distortion prediction/quantization parameter setting unit 1122 may increase or decrease the initial quantization parameter so that the difference between the predicted encoding distortion
Figure 112017020511876-pat00210
and the target encoding distortion of the current frame
Figure 112017020511876-pat00211
is minimized, thereby obtaining the increased or decreased initial quantization parameter as the optimal quantization parameter in the current frame
Figure 112017020511876-pat00212
.

Steps 1350 to 1370 may be performed repeatedly for at least one generation. That is, the optimal quantization parameter obtained through steps 1350 to 1370 of the first generation is used as the initial quantization parameter in step 1350 of the second generation, i.e., the next generation, and steps 1360 and 1370 of the second generation are then performed in succession. This repetition of steps 1350 to 1370 can be carried out by the content-aware quantization parameter determination algorithm 1 of FIG. 14.
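The iterative refinement of steps 1350 to 1370 might look like the following sketch. Here `predict_distortion`, the unit step size, and the stopping rule are assumptions, not the patent's exact algorithm:

```python
def find_optimal_qp(initial_qp, target_distortion, predict_distortion,
                    qp_min=0, qp_max=51, max_generations=8):
    """Hedged sketch of the generation-by-generation QP refinement.

    `predict_distortion(qp)` stands in for the distortion prediction model
    of Equation 21 (shown only as an image in the source). Each generation
    nudges the QP one step toward the target distortion and keeps the QP
    with the smallest distortion gap seen so far.
    """
    qp = initial_qp
    best_qp = qp
    best_gap = abs(predict_distortion(qp) - target_distortion)
    for _ in range(max_generations):
        # A larger QP increases distortion; a smaller QP decreases it.
        if predict_distortion(qp) < target_distortion and qp < qp_max:
            qp += 1
        elif predict_distortion(qp) > target_distortion and qp > qp_min:
            qp -= 1
        else:
            break  # already on target (or at a QP bound)
        gap = abs(predict_distortion(qp) - target_distortion)
        if gap < best_gap:
            best_qp, best_gap = qp, gap
        else:
            break  # the gap stopped shrinking; keep the best QP so far
    return best_qp
```

With a toy linear model `predict_distortion = lambda q: 2.0 * q` and a target of 60, starting from QP 25 the loop walks up to QP 30, where the predicted distortion matches the target exactly.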

As described above, the content-aware video encoding controller 1120 according to an exemplary embodiment of the present invention may obtain the optimal quantization parameter

Figure 112017020511876-pat00213
through an iterative procedure, so that the video encoder 1110 can encode the current frame with less deviation from the target distortion than conventional techniques.

The distortion prediction/quantization parameter setting unit 1122 may transmit the obtained optimal quantization parameter

Figure 112017020511876-pat00214
to the video encoder 1110.

The video encoder 1110 may encode the current frame (1380) based on the optimal frame type of the current frame and the optimal quantization parameter in the current frame received from the content-aware video encoding controller 1120, and may output the encoded result.

Although the content-aware video encoding controller 1120 has been described as including the screen change detection unit 1121, the distortion prediction/quantization parameter setting unit 1122, and the model parameter obtaining unit 1123, it is not limited thereto and may include more or fewer components to perform the content-aware video encoding method described above.

FIG. 15 is a flowchart showing a specific example in which the content-aware video encoding method shown in FIG. 13 is performed in VBR encoding.

Referring to FIG. 15, the content-aware video encoding method in VBR (variable bit-rate) encoding is performed as follows, based on the process described above with reference to FIG. 13. Since VBR encoding generally places more emphasis on picture quality, the constraint on the bit-rate is relatively weak. Thus, for every frame in VBR encoding, a fixed target encoding distortion of the i-th frame

Figure 112017020511876-pat00215
(the target encoding distortion of the current frame)
Figure 112017020511876-pat00216
is used.

First, the content recognition video encoding controller according to an embodiment determines whether the current frame is the first frame of the video group (1510).

As a result of the determination, if the current frame is the first frame of the video group, the content-aware video encoding controller performs I frame coding on the current frame (1520). Specifically, if the current frame is the first frame of the video group, the content-aware video encoding controller determines the optimal frame type of the current frame as an I frame (1521) and determines the optimal quantization parameter of the current frame (1522). For example, in step 1522, the content-aware video encoding controller may determine the optimal quantization parameter of the current frame

Figure 112017020511876-pat00217
as
Figure 112017020511876-pat00218
.

On the other hand, if it is determined that the current frame is not the first frame of the video group, the content-aware video encoding controller performs screen change detection on the current frame (1530). Specifically, in step 1530, the content-aware video encoding controller can perform the screen change detection based on the ratio between the number of macroblocks encoded in the intra mode in the previous frame received from the video encoder and the number of macroblocks encoded in the skip mode in the previous frame. For example, if this ratio is greater than or equal to a threshold, the content-aware video encoding controller can recognize that a screen change occurs in the current frame; if the ratio is less than the threshold, the controller can recognize that no screen change occurs in the current frame.

Accordingly, if a screen change occurs in the current frame (recognizing that a screen change has occurred), the content recognition video encoding controller may perform I frame coding on the current frame (1520).

On the other hand, if the screen change has not occurred in the current frame (recognizing that no screen change has occurred), the content recognition video encoding controller may perform P frame coding on the current frame (1540).
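The ratio test and the resulting frame-type decision above can be sketched as follows. The threshold value and the handling of a zero skip count are assumptions, since the patent does not fix them in this passage:

```python
def detect_scene_change(intra_mb_count, skip_mb_count, threshold=2.0):
    """Hedged sketch of the screen (scene) change detection.

    The controller compares the ratio of intra-coded to skip-coded
    macroblocks in the previous frame against a threshold; the value
    2.0 and the all-intra fallback are assumptions.
    """
    if skip_mb_count == 0:
        return True  # all-intra previous frame: treat as a scene change
    return intra_mb_count / skip_mb_count >= threshold


def choose_frame_type(intra_mb_count, skip_mb_count):
    # An I frame is chosen when a scene change is recognized;
    # otherwise the frame is predicted from its predecessor (P frame).
    return "I" if detect_scene_change(intra_mb_count, skip_mb_count) else "P"
```

Many intra macroblocks relative to skipped ones indicate that motion prediction failed in the previous frame, which is why the ratio serves as a single-pass scene-change signal.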

Specifically, when it is recognized that no screen change occurs in the current frame, the content-aware video encoding controller determines the optimal frame type of the current frame as a P frame (1541), and may obtain the optimal quantization parameter of the current frame using the content-aware quantization parameter determination algorithm 1 described with reference to FIGS. 11 to 14 (1542).

More specifically, in step 1542, the content-aware video encoding controller sets an initial quantization parameter in the current frame based on the optimal quantization parameter in the previous frame, obtains model parameters in the current frame based on the optimal frame type and the initial quantization parameter using the pre-built model parameter lookup table, calculates the predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters using the pre-built distortion prediction model, and then obtains the optimal quantization parameter in the current frame based on the predicted encoding distortion of the current frame. The operation of the content-aware quantization parameter determination algorithm 1 has been described in detail with reference to FIGS. 11 to 14 and is not repeated here.

The content aware video encoding controller then encodes the current frame (1550) based on the optimal quantization parameter and optimal frame type in the current frame.

The content-aware video encoding controller then calculates (1560) an update parameter used in the distortion prediction model, based on the encoding distortion of the current frame obtained as a result of encoding the current frame and the predicted encoding distortion of the current frame.

Accordingly, the content-aware video encoding controller may use the update parameter calculated in step 1560 in the process of calculating the predicted encoding distortion for the next frame. That is, the content-aware video encoding controller updates the distortion prediction model defined by Equation 21 based on the update parameter calculated in step 1560, so that the updated distortion prediction model can be utilized in the process of acquiring the optimal quantization parameter for the next frame.
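The update-parameter feedback described above might be sketched as below. Since Equation 22 appears only as an image in this text, this exponentially weighted correction of the prediction error is an assumed shape, not the patent's formula:

```python
def update_compensation(prev_update, actual_distortion, predicted_distortion,
                        weight=0.5):
    """Hedged sketch of the update parameter of Equations 21-22.

    The prediction error of the last encoded frame is blended into a
    running correction term; `weight` plays the role of the adjustable
    weight tied to the target distortion. Both the blending rule and the
    default weight are assumptions.
    """
    prediction_error = actual_distortion - predicted_distortion
    return (1.0 - weight) * prev_update + weight * prediction_error
```

The correction term would then be added into the next frame's predicted distortion, compensating for real-time changes in screen complexity that the static model parameters cannot track.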

FIG. 16 is a flowchart showing a specific example in which the content-aware video encoding method shown in FIG. 13 is performed in CVBR encoding.

Referring to FIG. 16, the content-aware video encoding method in CVBR encoding, a constrained variable bit-rate encoding, is performed as follows, based on the process described above with reference to FIG. 13. Here, CVBR encoding must search for the solution of Equation 4 under the bit-rate constraint of Equation 5. Therefore, in order to minimize the distortion fluctuation in CVBR encoding while considering the bit-rate constraint, the target encoding distortion of the i-th frame (the target encoding distortion of the current frame)

Figure 112017020511876-pat00219
and the target encoding bit-rate of the i-th frame (the target encoding bit-rate of the current frame)
Figure 112017020511876-pat00220
should be determined.

Due to changes in screen complexity between consecutive frames, the number of bits allocated to each frame may be varied so as to reduce distortion fluctuation while satisfying Equation 5. Using the screen descriptor of the j-th macroblock of the i-th frame

Figure 112017020511876-pat00221
, the target encoding bit-rate of the i-th frame can be derived as Equation 23.

<Formula 23>

Figure 112017020511876-pat00222

In Equation 23,

Figure 112017020511876-pat00223
represents the constraint on the average number of bits encoded per frame.

Further, using the RD relation, an individual target encoding distortion for each frame that satisfies the bit-rate constraint, i.e., the target encoding distortion of the i-th frame (the target encoding distortion of the current frame), can be derived as Equation 24.

<Formula 24>

Figure 112017020511876-pat00224

First, the content-aware video encoding controller according to an embodiment calculates (1610) the target encoding bit-rate of the current frame based on the constraint on the average number of encoded bits per frame, and the target encoding distortion of the current frame based on the target encoding bit-rate of the current frame, as shown in Equations 23 and 24.
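A complexity-proportional bit allocation consistent with the description around Equation 23 might look like this sketch. The proportional rule is an assumption, since the equation itself appears only as an image in the source:

```python
def target_bitrates(descriptors, avg_bits_per_frame):
    """Hedged sketch of per-frame bit allocation under Equation 23's
    average-bits-per-frame constraint.

    Each frame's budget is assumed to scale with its screen descriptor,
    so more complex frames receive more bits, while the total budget
    still equals `avg_bits_per_frame` times the number of frames.
    """
    total_complexity = sum(descriptors)
    budget = avg_bits_per_frame * len(descriptors)  # total bit budget
    return [budget * d / total_complexity for d in descriptors]
```

Note the allocation preserves the average: summing the returned targets always gives back the total budget, so Equation 5's constraint stays satisfied on average.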

Then, the content recognition video encoding controller determines whether the current frame is the first frame of the image group (1620).

As a result of the determination, if the current frame is the first frame of the video group, the content-aware video encoding controller performs I frame coding on the current frame (1630). Specifically, if the current frame is the first frame of the video group, the content-aware video encoding controller determines the optimal frame type of the current frame as an I frame (1631) and determines the optimal quantization parameter of the current frame (1632). For example, in step 1632, the content-aware video encoding controller may determine the optimal quantization parameter of the current frame

Figure 112017020511876-pat00225
as
Figure 112017020511876-pat00226
.

On the other hand, if it is determined that the current frame is not the first frame of the video group, the content-aware video encoding controller performs screen change detection on the current frame (1640). Specifically, in step 1640, the content-aware video encoding controller can perform the screen change detection based on the ratio between the number of macroblocks encoded in the intra mode in the previous frame received from the video encoder and the number of macroblocks encoded in the skip mode in the previous frame. For example, if this ratio is greater than or equal to a threshold, the content-aware video encoding controller can recognize that a screen change occurs in the current frame; if the ratio is less than the threshold, the controller can recognize that no screen change occurs in the current frame.

Accordingly, when a screen change occurs in the current frame (when it is recognized that a screen change has occurred), the content recognition video encoding controller may perform I frame coding on the current frame (1630).

On the other hand, if the screen change has not occurred in the current frame (recognizing that no screen change has occurred), the content recognition video encoding controller may perform P frame coding on the current frame (1650).

Specifically, when it is recognized that no screen change occurs in the current frame, the content-aware video encoding controller determines the optimal frame type of the current frame as a P frame (1651), and may obtain the optimal quantization parameter of the current frame using the content-aware quantization parameter determination algorithm 1 described with reference to FIGS. 11 to 14 (1652).

More specifically, in step 1652, the content-aware video encoding controller sets an initial quantization parameter in the current frame based on the optimal quantization parameter in the previous frame, obtains model parameters in the current frame based on the optimal frame type and the initial quantization parameter using the pre-built model parameter lookup table, calculates the predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters using the pre-built distortion prediction model, and then obtains the optimal quantization parameter in the current frame based on the predicted encoding distortion of the current frame. The operation of the content-aware quantization parameter determination algorithm 1 has been described in detail with reference to FIGS. 11 to 14 and is not repeated here.

The content-aware video encoding controller then estimates the predicted encoding bit-rate in the current frame

Figure 112017020511876-pat00227
based on the optimal quantization parameter in the current frame, and determines (1660) whether it violates the bit-rate constraint of Equation 5.

More specifically, in step 1660 the content-aware video encoding controller determines whether the predicted encoding bit-rate in the current frame is greater than the target encoding bit-rate in the current frame. If it is less than or equal to the target encoding bit-rate, the current frame may be encoded (1670) based on the optimal quantization parameter and the optimal frame type in the current frame.

The content-aware video encoding controller then calculates (1680) an update parameter used in the distortion prediction model, based on the encoding distortion of the current frame obtained as a result of encoding the current frame and the predicted encoding distortion of the current frame.

Accordingly, the content-aware video encoding controller can use the update parameter calculated in step 1680 in the process of calculating the predicted encoding distortion for the next frame. That is, the content-aware video encoding controller updates the distortion prediction model defined by Equation 21 based on the update parameter calculated in step 1680, so that the updated distortion prediction model can be utilized in the process of acquiring the optimal quantization parameter for the next frame.

On the other hand, if it is determined in step 1660 that the predicted encoding bit-rate in the current frame is greater than the target encoding bit-rate in the current frame, the content-aware video encoding controller may increase (1690) the optimal quantization parameter until the predicted encoding bit-rate is less than or equal to the target encoding bit-rate in the current frame.
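Steps 1660 and 1690 together form a simple constraint loop, which might be sketched as follows. Here `predict_bits` and the unit QP step are assumptions:

```python
def enforce_bitrate(qp, target_bits, predict_bits, qp_max=51):
    """Hedged sketch of the CVBR bit-rate enforcement (steps 1660-1690).

    `predict_bits(qp)` stands in for the encoder's bit-rate prediction
    at a given quantization parameter. The QP is coarsened one step at a
    time until the predicted bit-rate no longer exceeds the per-frame
    target; the step size of 1 is an assumption.
    """
    while predict_bits(qp) > target_bits and qp < qp_max:
        qp += 1  # larger QP -> coarser quantization -> fewer bits
    return qp
```

Because a larger QP monotonically reduces the bit-rate, this loop always terminates, either when the constraint is met or when the QP reaches its maximum.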

The apparatus described above may be implemented as a hardware component, a software component, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as being used singly, but those skilled in the art will recognize that it may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and constructed for the embodiments, or they may be known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments. For example, appropriate results may be achieved even if the described techniques are performed in a different order, and/or components of the described systems, structures, devices, or circuits are combined or coupled in a form different from that described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

Claims (15)

A content-aware video encoding method based on single-pass consistent picture quality control, the method comprising:
Performing a screen change detection on a current frame of a group of pictures (the group of pictures including a plurality of frames);
Determining an optimal frame type of the current frame based on a result of performing the screen change detection;
Setting an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame;
Obtaining model parameters in the current frame based on the optimal frame type and the initial quantization parameter of the current frame using a pre-built model parameter lookup table;
Calculating a predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters using a distortion prediction model constructed beforehand for the current frame; And
Obtaining an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame,
Wherein obtaining the optimal quantization parameter in the current frame comprises:
Increasing or decreasing the initial quantization parameter such that the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized; And
Obtaining the increased or decreased initial quantization parameter as the optimal quantization parameter in the current frame,
Wherein obtaining the model parameters in the current frame, calculating the predicted encoding distortion of the current frame, and obtaining the optimal quantization parameter in the current frame
are performed repeatedly for at least one generation until the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized, and
Wherein increasing or decreasing the initial quantization parameter further comprises:
Using, when the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is not minimized, the increased or decreased initial quantization parameter as the initial quantization parameter in the step of obtaining model parameters in the current frame of the next generation.
The method according to claim 1,
Wherein the screen descriptor of the previous frame is calculated based on a horizontal-slope AC coefficient and a vertical-slope AC coefficient of the discrete cosine transform coefficients in the previous frame, so as to quantitatively represent the screen complexity of the previous frame.
The method according to claim 1,
Wherein the distortion prediction model is constructed in advance to define, using the model parameters, the relationship between the screen descriptor and the predicted encoding distortion.
The method according to claim 1,
Wherein the step of performing screen change detection on the current frame comprises:
Performing screen change detection on the current frame based on a ratio between the number of macroblocks encoded in the intra mode in the previous frame received from the video encoder and the number of macroblocks encoded in the skip mode in the previous frame.
5. The method of claim 4,
Wherein the step of performing screen change detection on the current frame comprises:
Recognizing that a screen change occurs in the current frame when the ratio between the number of macroblocks encoded in the intra mode in the previous frame and the number of macroblocks encoded in the skip mode in the previous frame is greater than or equal to a threshold value; or
Recognizing that no screen change occurs in the current frame when the ratio is smaller than the threshold value,
Wherein determining the optimal frame type of the current frame comprises:
Determining an optimal frame type of the current frame as an I frame when it is recognized that a change in the screen occurs in the current frame; or
Determining the optimal frame type of the current frame as a P frame when it is recognized that no screen change occurs in the current frame.
The method according to claim 1,
Wherein the step of performing screen change detection on the current frame comprises:
Determining whether the current frame is the first frame of the image group; And
Performing screen change detection on the current frame when it is determined that the current frame is not the first frame of the video group.
The method according to claim 1,
Wherein the model parameter lookup table is constructed in advance to define the relationship among optimal frame types, quantization parameters, and model parameters in sample frames.
delete

delete

The method according to claim 1,
Calculating a target encoding bit-rate in the current frame based on a constraint of an average number of encoded bits per frame; And
Calculating a target encoding distortion of the current frame based on a target encoding bit-rate in the current frame
Further comprising:
Wherein obtaining the optimal quantization parameter in the current frame comprises:
Estimating a predicted encoding bit-rate in the current frame based on an optimal quantization parameter in the current frame; And
Increasing the optimal quantization parameter until the predicted encoding bit-rate in the current frame is less than or equal to the target encoding bit-rate in the current frame.
The method according to claim 1,
Further comprising:
Encoding the current frame based on the optimal frame type of the current frame and the optimal quantization parameter in the current frame.
12. The method of claim 11,
Wherein encoding the current frame comprises:
Calculating an update parameter used in the distortion prediction model based on the encoding distortion of the current frame obtained as a result of encoding the current frame and the predicted encoding distortion of the current frame; And
Using the update parameter in calculating the predicted encoding distortion for the next frame of the current frame.
A computer program stored in a recording medium for executing a content recognition video encoding method based on single-pass consistent image quality control in combination with a computer embodying an electronic device,
The content-aware video encoding method includes:
Performing a screen change detection on a current frame of a group of pictures (the group of pictures including a plurality of frames);
Determining an optimal frame type of the current frame based on a result of performing the screen change detection;
Setting an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame;
Obtaining model parameters in the current frame based on the optimal frame type and the initial quantization parameter of the current frame using a pre-built model parameter lookup table;
Calculating a predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters using a distortion prediction model constructed beforehand for the current frame; And
Obtaining an optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame,
Wherein obtaining the optimal quantization parameter in the current frame comprises:
Increasing or decreasing the initial quantization parameter such that the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized; And
Obtaining the increased or decreased initial quantization parameter as the optimal quantization parameter in the current frame,
Wherein obtaining the model parameters in the current frame, calculating the predicted encoding distortion of the current frame, and obtaining the optimal quantization parameter in the current frame
are performed repeatedly for at least one generation until the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized, and
Wherein increasing or decreasing the initial quantization parameter further comprises:
Using, when the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is not minimized, the increased or decreased initial quantization parameter as the initial quantization parameter in the step of obtaining model parameters in the current frame of the next generation.
A content-aware video encoding controller based on single-pass consistent picture quality control,
A group of pictures (group of pictures), which includes a plurality of frames, is subjected to screen change detection on a current frame, and based on a result of performing the screen change detection, an optimal frame type of the current frame A screen change detecting unit for determining the screen change;
A distortion prediction / quantization parameter setting unit that sets an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; And
A model parameter obtaining unit that acquires model parameters in the current frame based on the optimal frame type of the current frame and the initial quantization parameter, using a pre-built model parameter lookup table,
Wherein the distortion prediction/quantization parameter setting unit
calculates the predicted encoding distortion of the current frame based on the screen descriptor of the previous frame received from the video encoder and the obtained model parameters, using the distortion prediction model previously constructed for the current frame, and obtains the optimal quantization parameter in the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame,
by increasing or decreasing the initial quantization parameter so that the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized and obtaining the increased or decreased initial quantization parameter as the optimal quantization parameter,
Wherein acquiring the model parameters, calculating the predicted encoding distortion of the current frame, and obtaining the optimal quantization parameter in the current frame, performed by the distortion prediction/quantization parameter setting unit,
are performed repeatedly for at least one generation until the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized, and
Wherein, in increasing or decreasing the initial quantization parameter, the distortion prediction/quantization parameter setting unit
uses, when the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is not minimized, the increased or decreased initial quantization parameter as the initial quantization parameter for obtaining model parameters in the current frame of the next generation.
A content-aware video encoding system based on single-pass consistent picture quality control,
A video encoder; and
A content-aware video encoding controller,
The content-aware video encoding controller includes:
A group of pictures (group of pictures), which includes a plurality of frames, is subjected to screen change detection on a current frame, and based on a result of performing the screen change detection, an optimal frame type of the current frame A screen change detecting unit for determining the screen change;
A distortion prediction / quantization parameter setting unit that sets an initial quantization parameter in the current frame based on an optimal quantization parameter in a previous frame of the current frame; And
Acquiring model parameters in the current frame based on the optimum frame type of the current frame and the initial quantization parameter using a pre-built model parameter lookup table,
Lt; / RTI &gt;
Wherein the distortion prediction/quantization parameter setting unit
calculates a predicted encoding distortion of the current frame, using a distortion prediction model constructed in advance for the current frame, based on a screen descriptor of the previous frame received from the video encoder and the obtained model parameters, to obtain an optimal quantization parameter for the current frame that minimizes the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame,
and increases or decreases the initial quantization parameter so that the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized, setting the increased or decreased initial quantization parameter as the optimal quantization parameter,
Wherein the calculating of the predicted encoding distortion of the current frame and the obtaining of the optimal quantization parameter, performed by the distortion prediction/quantization parameter setting unit, are repeated for at least one generation until the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is minimized,
And wherein the increasing or decreasing of the initial quantization parameter comprises: if the difference between the predicted encoding distortion of the current frame and the target encoding distortion of the current frame is not minimized, using the increased or decreased initial quantization parameter as the initial quantization parameter in the step of obtaining model parameters for the current frame in the next generation,
And wherein the video encoder
encodes the current frame based on the optimal frame type of the current frame and the optimal quantization parameter for the current frame delivered from the content-aware video encoding controller.
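The claimed division of labor between controller and encoder (controller detects screen changes to choose the frame type and supplies the optimal QP; encoder encodes with both and returns a screen descriptor that feeds the next frame's prediction) can be sketched as a per-frame loop. This sketch is illustrative only: the `controller`/`encoder` interfaces and their method names are assumptions, not the patent's actual API.

```python
def encode_sequence(frames, controller, encoder):
    """Illustrative per-frame loop of the claimed system.

    For each frame of the GOP: the controller performs screen change
    detection to pick the optimal frame type and derives the optimal QP
    from the previous frame's QP and screen descriptor; the encoder then
    encodes the frame with both, feeding its descriptor back.
    """
    prev_qp, prev_descriptor = None, None
    bitstream = []
    for frame in frames:
        # Screen change detection determines the optimal frame type
        # (e.g. an intra frame on a scene change).
        frame_type = controller.detect_scene_change(frame)
        # Optimal QP for the current frame, seeded by the previous
        # frame's optimal QP and screen descriptor.
        qp = controller.optimal_qp(frame_type, prev_qp, prev_descriptor)
        encoded, descriptor = encoder.encode(frame, frame_type, qp)
        bitstream.append(encoded)
        prev_qp, prev_descriptor = qp, descriptor  # feedback for next frame
    return bitstream
```

Because the controller only consumes statistics already produced while encoding the previous frame, the loop makes one pass over the sequence, which is the "single-pass" property the title refers to.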
KR1020170026356A 2017-02-28 2017-02-28 Content-aware video encoding method, controller and system based on single-pass consistent quality control KR101868270B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020170026356A KR101868270B1 (en) 2017-02-28 2017-02-28 Content-aware video encoding method, controller and system based on single-pass consistent quality control

Publications (1)

Publication Number Publication Date
KR101868270B1 true KR101868270B1 (en) 2018-06-15

Family

ID=62628742

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020170026356A KR101868270B1 (en) 2017-02-28 2017-02-28 Content-aware video encoding method, controller and system based on single-pass consistent quality control

Country Status (1)

Country Link
KR (1) KR101868270B1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030082818A (en) * 2002-04-18 2003-10-23 삼성전자주식회사 Apparatus and method for performing variable bit rate control in real time
JP2010541386A (en) * 2007-09-28 2010-12-24 ドルビー・ラボラトリーズ・ライセンシング・コーポレーション Video compression and video transmission techniques
US20130089137A1 (en) * 2011-10-06 2013-04-11 Synopsys, Inc. Rate distortion optimization in image and video encoding
US20150215621A1 (en) * 2014-01-30 2015-07-30 Qualcomm Incorporated Rate control using complexity in video coding

Similar Documents

Publication Publication Date Title
CN109862359B (en) Code rate control method and device based on layered B frame and electronic equipment
KR100468726B1 (en) Apparatus and method for performing variable bit rate control in real time
US9615085B2 (en) Method and system for structural similarity based rate-distortion optimization for perceptual video coding
US8179981B2 (en) Method and apparatus for determining bit allocation for groups of pixel blocks in a picture according to attention importance level
US8331449B2 (en) Fast encoding method and system using adaptive intra prediction
US20100111180A1 (en) Scene change detection
US20090097546A1 (en) System and method for enhanced video communication using real-time scene-change detection for control of moving-picture encoding data rate
JP2006140758A (en) Method, apparatus and program for encoding moving image
JP2002010259A (en) Image encoding apparatus and its method and recording medium recording image encoding program
EP2041984A1 (en) Method and apparatus for adapting a default encoding of a digital video signal during a scene change period
US8050320B2 (en) Statistical adaptive video rate control
Jing et al. A novel intra-rate estimation method for H.264 rate control
US8792562B2 (en) Moving image encoding apparatus and method for controlling the same
JP5649296B2 (en) Image encoding device
Zhou et al. Complexity-based intra frame rate control by jointing inter-frame correlation for high efficiency video coding
JP4179917B2 (en) Video encoding apparatus and method
KR101868270B1 (en) Content-aware video encoding method, controller and system based on single-pass consistent quality control
Wu et al. A region of interest rate-control scheme for encoding traffic surveillance videos
KR101242560B1 (en) Device and method for adjusting search range
US20120328007A1 (en) System and method for open loop spatial prediction in a video encoder
KR20130032807A (en) Method and apparatus for encoding a moving picture
Li et al. Low-delay window-based rate control scheme for video quality optimization in video encoder
KR20090037288A (en) Method for real-time scene-change detection for rate control of video encoder, method for enhancing qulity of video telecommunication using the same, and system for the video telecommunication
Wu et al. A content-adaptive distortion-quantization model for intra coding in H.264/AVC
US8064526B2 (en) Systems, methods, and apparatus for real-time encoding

Legal Events

Date Code Title Description
E701 Decision to grant or registration of patent right
GRNT Written decision to grant