CN108810545B - Method, apparatus, computer readable medium and electronic device for video encoding - Google Patents


Info

Publication number
CN108810545B
Authority
CN
China
Prior art keywords
frame
quantization parameter
video
current frame
coding
Prior art date
Legal status
Active
Application number
CN201810726552.9A
Other languages
Chinese (zh)
Other versions
CN108810545A (en)
Inventor
张昊
李明娟
王剑光
翟海昌
汪亮
廖念波
Current Assignee
Tencent Technology Shenzhen Co Ltd
Central South University
Original Assignee
Tencent Technology Shenzhen Co Ltd
Central South University
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd and Central South University
Priority to CN201810726552.9A
Publication of CN108810545A
Application granted
Publication of CN108810545B
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/395Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving distributed video coding [DVC], e.g. Wyner-Ziv video coding or Slepian-Wolf video coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of the invention provide a method, an apparatus, a computer-readable medium and an electronic device for video coding. The video sequence is encoded in n passes, and the method comprises: performing the (n-1)-th encoding pass on the video sequence to generate an (n-1)-th-pass analysis file and obtain (n-1)-th-pass encoding information for the video frames in the video sequence; obtaining a quantization parameter for the rate-control stage of the n-th encoding pass for each video frame; and performing the n-th encoding pass on the video sequence based on the (n-1)-th-pass analysis file, correcting the quantization parameter used in the rate-control stage of the n-th pass according to the (n-1)-th-pass encoding information of the video frame, where n is a positive integer greater than or equal to 2. With this scheme, the quantization parameter used in the rate-control stage of the current encoding pass is corrected using the encoding information of the previous pass, which reduces quality fluctuation of the video sequence.

Description

Method, apparatus, computer readable medium and electronic device for video encoding
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for video encoding, a computer-readable medium, and an electronic device.
Background
Video coding is widely used in live streaming and video on demand, and the technology is mature. With large film sources, scenes that are dark or nearly black are common in television series and movies; for example, runs of 3 seconds or more of low-brightness or dark frames often occur in film footage. Although these low-brightness frames do not by themselves harm the viewing experience, a frame in the video sequence affects subsequent frames when it serves as their reference frame.
This problem already exists in conventional, non-distributed video coding, and analysis of distributed encoding data shows that it is exacerbated in distributed coding.
In distributed coding, the complete video sequence is cut into multiple short sequences, which breaks the integrity and continuity of the video. Frame-type changes at the slice points and/or the loss of encoding information during encoding affect subsequent video quality in a gradually accumulating way; the effect is particularly pronounced for the low-brightness or dark scenes mentioned above and can cause large quality fluctuations of the video sequence in these scenes.
Therefore, a new method, apparatus, computer-readable medium, and electronic device for video encoding are needed.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present invention and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.
Disclosure of Invention
Embodiments of the present invention provide a method and an apparatus for video encoding, a computer-readable medium, and an electronic device, which can reduce quality fluctuation of a video sequence.
Additional features and advantages of the invention will be set forth in the detailed description which follows, or may be learned by practice of the invention.
According to an aspect of the embodiments of the present invention, there is provided a method for video coding in which a video sequence is encoded in n passes. The method comprises: performing the (n-1)-th encoding pass on the video sequence to generate an (n-1)-th-pass analysis file, and obtaining (n-1)-th-pass encoding information for the video frames in the video sequence; obtaining a quantization parameter for the rate-control stage of the n-th encoding pass for each video frame; and performing the n-th encoding pass on the video sequence based on the (n-1)-th-pass analysis file, correcting the quantization parameter of the rate-control stage of the n-th pass according to the (n-1)-th-pass encoding information of the video frame, where n is a positive integer greater than or equal to 2.
In some embodiments of the present invention, based on the foregoing scheme, the (n-1)-th-pass encoding information comprises the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the video frame. Correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to the (n-1)-th-pass encoding information then includes: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a first threshold and the current frame is not a scene-change frame, correcting the quantization parameter of the current frame according to that peak signal-to-noise ratio.
In some embodiments of the present invention, based on the foregoing scheme, correcting the quantization parameter of the current frame according to the (n-1)-th-pass peak signal-to-noise ratio of its luminance and chrominance components includes: if that peak signal-to-noise ratio falls in any one of the 1st to m1-th intervals, correcting the quantization parameter of the current frame with the adjustment factor corresponding to that interval, where m1 is a positive integer greater than or equal to 1.
In some embodiments of the invention, based on the foregoing scheme, m1 is equal to 5, and correcting the quantization parameter of the current frame with the corresponding adjustment factor includes: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is in the 1st interval, correcting the quantization parameter of the current frame with a first adjustment factor; or, if it is in the 2nd interval, correcting the quantization parameter with a second adjustment factor; or, if it is in the 3rd interval, with a third adjustment factor; or, if it is in the 4th interval, with a fourth adjustment factor; or, if it is in the 5th interval, with a fifth adjustment factor.
In some embodiments of the present invention, based on the foregoing scheme, if the video sequence is a segmented video sequence, the first to fifth adjustment factors are 1.2, 1.8, 2.2, 3 and 4.5, respectively.
In some embodiments of the present invention, based on the foregoing scheme, the (n-1)-th-pass encoding information comprises the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the video frame, and correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to that information includes: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a second threshold and falls in the m2-th interval, and the current frame is a first inter-predicted frame, obtaining the quantization parameter of the forward reference frame of the current frame and correcting the quantization parameter of the current frame according to it, where m2 is a positive integer greater than or equal to 1.
In some embodiments of the present invention, based on the foregoing scheme, the quantization parameter of the first inter-predicted frame is corrected with the formula q' = q - (ref0Qp + a) * b, where ref0Qp is the quantization parameter of the forward reference frame, a and b are constants, q is the quantization parameter of the first inter-predicted frame, and q' is its corrected quantization parameter.
In some embodiments of the present invention, based on the foregoing scheme, the (n-1)-th-pass encoding information comprises the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the video frame, and correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to that information includes: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than the second threshold and falls in the m2-th interval, and the current frame is a second inter-predicted frame, obtaining the quantization parameter of the forward reference frame and/or the quantization parameter of the backward reference frame of the current frame, and correcting the quantization parameter of the current frame according to them, where m2 is a positive integer greater than or equal to 1.
In some embodiments of the present invention, based on the foregoing scheme, the quantization parameter of the second inter-predicted frame is corrected with the formula q' = q - ((ref0Qp + c1) + (ref1Qp + c2)) * d, where ref0Qp is the quantization parameter of the forward reference frame, ref1Qp is the quantization parameter of the backward reference frame, c1, c2 and d are constants, q is the quantization parameter of the second inter-predicted frame, and q' is its corrected quantization parameter.
According to an aspect of the embodiments of the present invention, there is provided an apparatus for video coding in which a video sequence is encoded in n passes. The apparatus comprises: a first encoding module configured to perform the (n-1)-th encoding pass on the video sequence to generate an (n-1)-th-pass analysis file and obtain (n-1)-th-pass encoding information for the video frames in the video sequence; a quantization parameter obtaining module configured to obtain a quantization parameter for the rate-control stage of the n-th encoding pass for each video frame; and a second encoding module configured to perform the n-th encoding pass on the video sequence based on the (n-1)-th-pass analysis file and to correct the quantization parameter of the rate-control stage of the n-th pass according to the (n-1)-th-pass encoding information of the video frame, where n is a positive integer greater than or equal to 2.
In some embodiments of the present invention, based on the foregoing scheme, the (n-1)-th-pass encoding information comprises the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the video frame, and the second encoding module comprises: a first quantization parameter correction unit configured to correct the quantization parameter of the current frame according to the (n-1)-th-pass peak signal-to-noise ratio of its luminance and chrominance components if that peak signal-to-noise ratio is greater than a first threshold and the current frame is not a scene-change frame.
In some embodiments of the present invention, based on the foregoing scheme, the first quantization parameter correction unit includes: a first quantization parameter correction subunit configured to correct the quantization parameter of the current frame with the corresponding adjustment factor if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than the first threshold, the current frame is not a scene-change frame, and that peak signal-to-noise ratio falls in any one of the 1st to m1-th intervals, where m1 is a positive integer greater than or equal to 1.
In some embodiments of the present invention, based on the foregoing scheme, the (n-1)-th-pass encoding information comprises the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the video frame, and the second encoding module comprises: a second quantization parameter correction unit configured to obtain the quantization parameter of the forward reference frame of the current frame and correct the quantization parameter of the current frame according to it, if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a second threshold and falls in the m2-th interval and the current frame is a first inter-predicted frame; and/or a third quantization parameter correction unit configured to obtain the quantization parameter of the forward reference frame and/or the quantization parameter of the backward reference frame of the current frame and correct the quantization parameter of the current frame according to them, if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than the second threshold and falls in the m2-th interval and the current frame is a second inter-predicted frame; where m2 is a positive integer greater than or equal to 1.
According to an aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which program, when executed by a processor, implements the method for video encoding as described in the above embodiments.
According to an aspect of an embodiment of the present invention, there is provided an electronic apparatus including: one or more processors; a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for video encoding as described in the above embodiments.
In the technical solutions provided by some embodiments of the present invention, the (n-1)-th-pass encoding information of the video frames is obtained by performing the (n-1)-th encoding pass on the video sequence, and the quantization parameter of the rate-control stage of the n-th encoding pass is corrected based on that information. This reduces quality fluctuation of the video sequence and gives the user a better viewing experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
fig. 1 shows a schematic diagram of an exemplary system architecture of a method for video encoding or an apparatus for video encoding to which embodiments of the present invention may be applied;
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device to implement an embodiment of the invention;
fig. 3 schematically shows a flow diagram of a method for video encoding according to an embodiment of the invention;
FIG. 4 is a diagram illustrating a processing procedure of step S310 shown in FIG. 3 in one embodiment;
FIG. 5 is a diagram illustrating a processing procedure of step S330 shown in FIG. 3 in one embodiment;
fig. 6 schematically shows a flow chart of a method for video encoding according to a further embodiment of the present invention;
fig. 7 schematically shows a flow chart of a method for video encoding according to a further embodiment of the present invention;
fig. 8 schematically shows a block diagram of an apparatus for video encoding according to an embodiment of the present invention;
fig. 9 schematically shows a block diagram of an apparatus for video encoding according to still another embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the invention.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 shows a schematic diagram of an exemplary system architecture 100 of a method for video encoding or an apparatus for video encoding to which embodiments of the present invention may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having display screens including, but not limited to, smart phones, tablets, portable and desktop computers, digital cinema projectors, and the like.
The server 105 may be a server that provides various services. For example, a user sends a request for a live video or a video on demand to the server 105 using the terminal device 103 (which may also be the terminal device 101 or 102). The server 105 may retrieve the matched search result from the database based on the related information carried in the live video or the video on demand, and feed the search result, for example, the corresponding video information back to the terminal device 103, so that the user may view the corresponding video based on the content displayed on the terminal device 103.
Also for example, terminal device 103 (which may also be terminal device 101 or 102) may be a digital cinema projector in a movie theater, through which a user may send video playback instructions to server 105. The server 105 may retrieve a matching movie video from the database based on the video playback instruction and return the movie video to the digital movie projector, and play the returned movie video through the digital movie projector.
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device to implement an embodiment of the invention.
It should be noted that the computer system 200 of the electronic device shown in fig. 2 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiment of the present invention.
As shown in fig. 2, the computer system 200 includes a Central Processing Unit (CPU) 201 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for system operation are also stored. The CPU 201, ROM202, and RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output section 207 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card, a modem, or the like. The communication section 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 210 as necessary, so that a computer program read out therefrom is mounted into the storage section 208 as necessary.
In particular, according to an embodiment of the present invention, the processes described below with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 209 and/or installed from the removable medium 211. The computer program, when executed by a Central Processing Unit (CPU) 201, performs various functions defined in the methods and/or apparatus of the present application.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules and/or units and/or sub-units described in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, and the described modules and/or units and/or sub-units may also be disposed in a processor. Wherein the names of such modules and/or units and/or sub-units in some cases do not constitute a limitation on the modules and/or units and/or sub-units themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method as described in the embodiments below. For example, the electronic device may implement the steps shown in fig. 3 or fig. 4 or fig. 5 or fig. 6 or fig. 7.
Fig. 3 schematically shows a flow diagram of a method for video encoding according to an embodiment of the present invention.
According to the method for video coding provided by the embodiment of the invention, a video sequence may be encoded in n passes, where n is a positive integer greater than or equal to 2.
For example, if n is 2, the current video sequence is encoded in two passes: the sequence is first encoded in a first pass (1pass), and then, based on the analysis file produced by the 1pass encoding, a second pass (2pass) is performed on the sequence. With two-pass coding, the 1pass encoding generally keeps only the required useful information and stores it in a file, so that this information can be read back during the 2pass encoding.
Two-pass coding can hit the desired average bitrate accurately. "2pass" means the video sequence is encoded twice: in the first pass a video encoder such as x264 analyses the whole sequence and produces a stats file and an mbtree file (mbtree is used by default), and the second pass allocates a reasonable bitrate with these two files as reference. When a specific bitrate or file size is required, 2pass or multi-pass coding has to be used.
As another example, if n is 3, the current video sequence is encoded in three passes: the first pass (1pass) is performed on the sequence; then a second pass (2pass) is performed based on the analysis file of the 1pass encoding; and then a third pass (3pass) is performed based on the analysis file of the 2pass encoding. By analogy, the current video sequence may be encoded in any number of passes, and the specific number of passes may be chosen according to the application scenario, which is not limited by the present invention. That is, beyond 2pass there are multi-pass coding modes that keep analysing on top of the previous analysis; in theory this makes bitrate allocation more reasonable, but in practice 2pass is usually sufficient.
In the following examples, two-pass coding is used for the video sequence, but the present invention is not limited thereto.
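To make the two-pass structure concrete, the following Python sketch shows how a first pass might write per-frame statistics to an analysis file that the second pass then reads back. The helper logic and the JSON file format are illustrative assumptions, not the actual x264 stats/mbtree format.

```python
import json

def measure_psnr(frame):
    # Placeholder for the encoder's reconstructed-vs-source YUV PSNR of this frame.
    return 40.0

def correct_qp(qp, info):
    # Placeholder for the QP correction described later in this text.
    return qp

def first_pass(frames, stats_path):
    # 1pass: analyse only, collecting per-frame statistics.
    # A real encoder would also decide frame types (I/P/B), record complexity, etc.
    stats = [{"index": i, "frame_type": "P", "yuv_psnr": measure_psnr(f)}
             for i, f in enumerate(frames)]
    with open(stats_path, "w") as fp:
        json.dump(stats, fp)          # this file plays the role of the analysis file

def second_pass(frames, stats_path, base_qp=26):
    # 2pass: read the analysis file and let it guide rate control frame by frame.
    with open(stats_path) as fp:
        stats = json.load(fp)
    return [(info["index"], correct_qp(base_qp, info))
            for frame, info in zip(frames, stats)]
```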
As shown in fig. 3, the method for video encoding according to the embodiment of the present invention may include the following steps. The method according to the embodiment of the present invention may be executed by a terminal device, may also be executed by a server, or may be executed by the terminal device and the server interactively, for example, the server 105 in fig. 1 may be executed, but the present invention is not limited thereto.
In step S310, the (n-1)-th encoding pass is performed on the video sequence to generate an (n-1)-th-pass analysis file, and the (n-1)-th-pass encoding information of the video frames in the video sequence is obtained.
In an exemplary embodiment, the (n-1)-th-pass encoding information may include the peak signal-to-noise ratio of the luminance and chrominance components of the video frame in the (n-1)-th pass (i.e. the (n-1)-th-pass YUV PSNR of the video frame, hereinafter referred to as the (n-1)pass PSNR).
PSNR (Peak Signal-to-Noise Ratio) is an objective measure for evaluating an image. After compression, the output image generally differs from the original to some extent, and the PSNR value is commonly consulted to judge whether a given processing step is satisfactory. PSNR is expressed in dB; a larger value indicates less distortion. It is the most common and most widely used objective image-quality metric, and it is based on the error between corresponding pixels, i.e. on error-sensitivity-based image quality evaluation.
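For reference, a minimal PSNR computation for an 8-bit image plane might look like this; NumPy is an implementation choice here, not something the text prescribes.

```python
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two 8-bit image planes."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")      # identical planes: no distortion
    return 10.0 * np.log10(peak * peak / mse)

# A per-frame YUV PSNR could then be a weighted combination of the Y, U and V plane
# values; the exact combination used by the encoder is not specified in this text.
```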
In step S320, the quantization parameter (QP, hereinafter referred to as q) of the rate-control stage of the n-th encoding pass is obtained for the video frame.
In the embodiment of the present invention, before the n-th encoding pass of the video sequence, the video encoder may automatically assign an initial quantization parameter q to each video frame in advance.
In step S330, the n-th encoding pass is performed on the video sequence based on the (n-1)-th-pass analysis file, and the quantization parameter of the rate-control stage of the n-th pass is corrected according to the (n-1)-th-pass encoding information of the video frame.
In the embodiment of the invention, the (n-1)-th-pass encoding information of each video frame is used to correct the quantization parameter value used in this encoding pass, and encoding then proceeds with the corrected value.
In an exemplary embodiment, correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to the (n-1)-th-pass encoding information may include: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a first threshold and the current frame is not a scene-change frame, correcting the quantization parameter of the current frame according to that peak signal-to-noise ratio.
For example, the first threshold may be 61. However, the present invention is not limited thereto, and the value of the first threshold may be determined according to whether the video sequence is segmented or not and according to a specific application scenario.
In the embodiment of the present invention, video frames whose (n-1)pass PSNR is larger than the first threshold may be referred to as low-brightness frames. For example, a low-brightness frame may be a video frame whose (n-1)-th-pass YUV PSNR value is greater than 61.
The basic principle of video coding is briefly explained here. Video image data is highly correlated, i.e. it contains a large amount of redundant information, which can be divided into spatial redundancy and temporal redundancy. Compression removes this redundancy and the correlation between data, using intra-frame compression, inter-frame compression and entropy coding. A video file generally involves three parameters: frame rate, resolution and bitrate. The frame rate is the number of pictures displayed per second; it affects, and is proportional to, the smoothness of the picture. The bitrate is the amount of data per second after the displayed pictures have been compressed; it affects, and is proportional to, the file size. Resolution is the width and height of the (rectangular) picture, i.e. the picture size. The frame rate multiplied by the resolution and by the number of bytes per pixel gives the amount of data per second before compression, in bytes. The compression ratio equals the amount of data per second before compression divided by the bitrate; for the same source and the same video coding algorithm, a higher compression ratio means worse picture quality.
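As a worked example of these relations, with illustrative numbers that are not taken from the text:

```python
# Illustrative numbers only: 1080p at 25 fps, 8-bit 4:2:0 sampling (1.5 bytes/pixel),
# compressed to a 4 Mbit/s stream.
width, height, fps = 1920, 1080, 25
bytes_per_pixel = 1.5                                            # 4:2:0: 12 bits per pixel on average
raw_bytes_per_second = width * height * fps * bytes_per_pixel    # about 77.8 MB/s before compression
bitrate_bytes_per_second = 4_000_000 / 8                         # 4 Mbit/s target
compression_ratio = raw_bytes_per_second / bitrate_bytes_per_second
print(round(compression_ratio))                                  # about 156
```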
Video quality can be expressed subjectively or objectively: the subjective measure is the perceived sharpness usually referred to, and the objective measures are the quantization parameter, the compression ratio or the bitrate. For the same video source and the same compression algorithm, the quantization parameter, the compression ratio and the bitrate are directly related.
The bitrate of a video directly affects its coding quality. To transmit video data effectively and meet the service requirements of networked video under limited channel bandwidth and transmission delay, rate control is generally applied to video coding. Rate control means choosing appropriate coding parameters so that the bitrate produced by the encoder stays as close as possible to the chosen target bitrate while preserving the recovered video quality.
In an exemplary embodiment, correcting the quantization parameter of the current frame according to the (n-1)-th-pass peak signal-to-noise ratio of its luminance and chrominance components may include: if that peak signal-to-noise ratio falls in any one of the 1st to m1-th intervals, correcting the quantization parameter of the current frame with the corresponding adjustment factor, where m1 is a positive integer greater than or equal to 1.
In the present example, m1 is equal to 5. However, the present invention is not limited thereto, and the specific interval division may be selected according to the actual application scenario, and is only used for illustration here.
In an exemplary embodiment, if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame falls in any one of the 1st to m1-th intervals, correcting the quantization parameter of the current frame with the corresponding adjustment factor may include:
if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is in the 1st interval, correcting the quantization parameter of the current frame with a first adjustment factor qFactor1; or
if it is in the 2nd interval, correcting the quantization parameter with a second adjustment factor qFactor2; or
if it is in the 3rd interval, correcting the quantization parameter with a third adjustment factor qFactor3; or
if it is in the 4th interval, correcting the quantization parameter with a fourth adjustment factor qFactor4; or
if it is in the 5th interval, correcting the quantization parameter with a fifth adjustment factor qFactor5.
By evaluating and collecting statistics of the YUV PSNR values of low-brightness video frames (low-brightness frames for short), the embodiment of the invention divides low-brightness-scene frames into 5 intervals and, in each interval, adjusts the quantization parameter with a corresponding adjustment factor obtained from empirical values over a large amount of data. This effectively improves the video quality of low-brightness scenes and reduces quality fluctuation; the effect is especially noticeable for distributed coding, and the method can be applied to video-on-demand and live-streaming platforms.
In an exemplary embodiment, if the video sequence is a segmented video sequence, the first to fifth adjustment factors are 1.2, 1.8, 2.2, 3 and 4.5, respectively. A sketch of this interval-based correction is given below.
It should be noted that the first to fifth adjustment factors are empirical values obtained from large-scale statistical analysis of segmented video sequences in distributed coding; if the coding mode of the video sequence changes, for example to non-distributed coding, the first to fifth adjustment factors may change accordingly. In addition, although distributed video coding is used as the example above, the method provided by the embodiment of the present invention may also be applied to non-distributed coding.
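The sketch referenced above is given here. The interval boundaries are assumptions chosen for illustration (the text only fixes the first threshold at 61 dB), and since this excerpt does not state exactly how a factor is applied to q, the sketch simply subtracts it, which lowers the QP (i.e. raises quality) more strongly for darker frames.

```python
FIRST_THRESHOLD = 61.0
# (assumed lower bound in dB, adjustment factor); the factors 1.2/1.8/2.2/3/4.5
# are the values given above for segmented video sequences.
INTERVALS = [(61.0, 1.2), (68.0, 1.8), (75.0, 2.2), (82.0, 3.0), (90.0, 4.5)]

def correct_low_luma_qp(q: float, prev_pass_psnr: float, is_scene_change: bool) -> float:
    # Only low-brightness, non-scene-change frames are corrected.
    if prev_pass_psnr <= FIRST_THRESHOLD or is_scene_change:
        return q
    factor = 0.0
    for lower_bound, f in INTERVALS:
        if prev_pass_psnr >= lower_bound:
            factor = f            # keep the factor of the highest interval reached
    return max(q - factor, 0.0)   # assumed application of the factor; clamped at 0
```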
Video transcoding refers to converting a video from one format to another, where a format is characterized by its bitrate, frame rate, resolution and coding algorithm. With the development of high-definition movies, film sources have grown from a few GB to tens of GB, and the time needed to transcode such a source has also increased dramatically. In addition, as the number of user terminal types that transcoding must serve grows, transcoding tasks become increasingly diverse. The mainstream approach today is distributed transcoding, in which several transcoding servers transcode a video file simultaneously: the source video is segmented, each segment is sent to a transcoding server, and after transcoding the segments are merged back into a single video file, which is then returned. This allows parallel transcoding, keeps the time cost low, and copes with concurrent transcoding tasks. For example, a distributed video transcoding scheme based on the Hadoop platform stores video files in HDFS (Hadoop Distributed File System), performs distributed transcoding with the MapReduce programming framework, and uses FFMPEG as the transcoding tool, integrating video storage and transcoding so as to reduce network bandwidth and transcoding time.
Such a system may comprise a WebServer and a Hadoop cluster. The WebServer handles user requests, including access to and transcoding of videos. The NameNode in the Hadoop cluster receives the user requests forwarded by the WebServer and schedules the DataNodes in the cluster to store or transcode the videos. In HDFS, files are stored in blocks: a file consists of one or more data blocks whose size can be adjusted, so a video file has to be segmented (i.e. split into blocks) when it is stored.
In distributed coding, the slice points of a segmented video sequence are themselves video frames. On the one hand, the frame type (I frame, P frame or B frame) of a video frame located at a slice point may change; on the other hand, even if the frame type at the slice point does not change, video quality is affected because distributed encoding breaks the continuity of the video. For example, if the current frame is an inter-predicted frame it must reference a forward video frame, and slicing may cause that encoding information to be lost. Quality fluctuation may therefore be more severe in distributed coding, due to frame-type changes at slice points and/or missing encoding information.
In an exemplary embodiment, correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to the (n-1)-th-pass encoding information may include: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a second threshold and falls in the m2-th interval, and the current frame is a first inter-predicted frame, obtaining the quantization parameter of the forward reference frame of the current frame and correcting the quantization parameter of the current frame according to it, where m2 is a positive integer greater than or equal to 1.
In an exemplary embodiment, m1 may be equal to m2, e.g. m1 = m2 = 5. However, the present invention is not limited thereto.
In an exemplary embodiment, the quantization parameter of the first inter-predicted frame may be corrected using the following formula:
q' = q - (ref0Qp + a) * b    (1)
In formula (1), ref0Qp is the quantization parameter of the forward reference frame of the first inter-predicted frame; a and b are constants, for example a may be 4 and b may be 2, although the invention is not limited thereto; q is the quantization parameter of the first inter-predicted frame; and q' is the corrected quantization parameter of the first inter-predicted frame.
In an exemplary embodiment, correcting the quantization parameter of the rate-control stage of the n-th encoding pass according to the (n-1)-th-pass encoding information may include: if the (n-1)-th-pass peak signal-to-noise ratio of the luminance and chrominance components of the current frame is greater than a second threshold and falls in the m2-th interval, and the current frame is a second inter-predicted frame, obtaining the quantization parameter of the forward reference frame and/or the quantization parameter of the backward reference frame of the current frame, and correcting the quantization parameter of the current frame according to them, where m2 is a positive integer greater than or equal to 1.
In an exemplary embodiment, the second threshold is greater than or equal to the first threshold; for example, the second threshold may be 97. In the embodiment of the invention, on the one hand the quantization parameter of a low-brightness frame can be optimized with the corresponding adjustment factor; on the other hand, for even darker frames, up to video frames of nearly black scenes, the quantization parameter of the current frame can be further optimized using the quantization parameters of its forward and/or backward reference frames.
In an exemplary embodiment, the quantization parameter of the second inter-predicted frame may be corrected using the following formula:
q' = q - ((ref0Qp + c1) + (ref1Qp + c2)) * d    (2)
In formula (2), ref0Qp is the quantization parameter of the forward reference frame of the second inter-predicted frame; ref1Qp is the quantization parameter of the backward reference frame of the second inter-predicted frame; c1, c2 and d are constants, for example c1 and c2 may be equal with both set to 4, and d may be 2, although the invention is not limited thereto; q is the quantization parameter of the second inter-predicted frame; and q' is the corrected quantization parameter of the second inter-predicted frame.
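Transcribed directly into code, formulas (1) and (2) might look as follows; the clamp to a non-negative QP is an added assumption, since the text does not say how out-of-range results are handled.

```python
def correct_p_frame_qp(q: float, ref0_qp: float, a: float = 4.0, b: float = 2.0) -> float:
    # Formula (1): q' = q - (ref0Qp + a) * b, for the first inter-predicted frame
    # (forward reference only).
    return max(q - (ref0_qp + a) * b, 0.0)   # clamp added as an assumption

def correct_b_frame_qp(q: float, ref0_qp: float, ref1_qp: float,
                       c1: float = 4.0, c2: float = 4.0, d: float = 2.0) -> float:
    # Formula (2): q' = q - ((ref0Qp + c1) + (ref1Qp + c2)) * d, for the second
    # inter-predicted frame, using both forward and backward reference QPs.
    return max(q - ((ref0_qp + c1) + (ref1_qp + c2)) * d, 0.0)
```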
In the embodiment of the present invention, since coding standards such as H.261, H.263, H.264, MPEG-2 and MPEG-4 all compress video using frame-layer rate-control algorithms, the applicable range of the invention covers these standards.
The method for video coding provided by the embodiment of the invention builds on a multi-pass (including two-pass) coding mode: the encoding information kept from the previous pass over the video sequence is used to optimize the quantization parameter in the rate-control stage of the current pass, compensating for quality fluctuation of the video frames and thus reducing both quality loss and fluctuation. For local special scenes such as low-brightness video frames in distributed coding in particular, the quality fluctuation can be adjusted effectively.
The method for video coding provided by the above embodiments is illustrated below, taking two-pass coding as an example, in conjunction with fig. 4-7.
In the embodiment of the invention, before the quantization parameter is adjusted in the rate-control stage of the second encoding pass, the first-pass encoding information of the 1pass encoding (such as the onepass PSNR) is first obtained and stored in the analysis file, and this information is then read in the second-pass (2pass) stage of the video file. The specific operations are shown in the flow charts of fig. 4 and fig. 5.
Fig. 4 is a schematic diagram illustrating a processing procedure of step S310 shown in fig. 3 in an embodiment. The method steps of the embodiment of the present invention may be executed by the terminal device, may also be executed by the server, or may be executed by the terminal device and the server interactively, for example, the server 105 in fig. 1 may be executed, but the present invention is not limited thereto.
As shown in fig. 4, in the embodiment of the present invention, the step S310 may further include the following steps.
In step S311, it is determined whether a video sequence exists; if so, go to step S312; otherwise, the process jumps to step S315 and the operation ends.
In step S312, it is determined whether the current encoding of the video sequence is the first pass (1pass encoding); if so, go to step S313; otherwise, go to step S314.
In step S313, the first encoding pass (1pass encoding) is performed on the video frames in the video sequence, and their first-pass encoding information is collected.
In step S314, error information is output.
In step S315, the process ends.
Fig. 5 is a schematic diagram illustrating a processing procedure of step S330 shown in fig. 3 in an embodiment. The method steps of the embodiment of the present invention may be executed by the terminal device, may also be executed by the server, or may be executed by the terminal device and the server interactively, for example, the server 105 in fig. 1 may be executed, but the present invention is not limited thereto.
As shown in fig. 5, in the embodiment of the present invention, the step S330 may further include the following steps.
In step S331, it is determined whether a video sequence exists; if the video sequence exists, go to step S332; otherwise, the process goes to step S336 to end the present operation.
In step S332, determining whether the video sequence is currently the second pass coding (2 pass coding); if the second path of coding (2 pass coding) is available, step S333 is entered; otherwise, the process proceeds to step S335.
In the embodiment of the invention, 2pass means 2-time coding, and the information in 1pass is recorded by the file analysis file. The 2-time encoding is equivalent to 2-time conversion, and although the conversion time is prolonged, the compressed video has better image quality, better picture details and smaller volume. The 2pass mainly aims at non-real-time video coding such as files, the first path of coding is to scan the whole video file and record some statistical information, and the second path of coding is to code according to the statistical information recorded in the front, so that the coding quality can be improved.
In step S333, it is determined whether the first path analysis file exists; if the first path analysis file exists, the process proceeds to step S334; otherwise, the process goes to step S336 and the operation ends.
In step S334, the first path analysis file is read, and a second path encoding is performed on the video frames in the video sequence based on the first path analysis file.
In step S335, error information is output.
In step S336, the process ends.
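Correspondingly, a minimal sketch of steps S331 to S336 is given below, assuming the same illustrative JSON analysis file as in the previous sketch; encode_frame_2pass stands in for a hypothetical second-pass encoder call and is not part of the patent.

import json
import os

def second_pass(video_frames, analysis_path, encode_frame_2pass):
    # Step S333: if the first-pass analysis file is missing, report an error (S335/S336).
    if not os.path.exists(analysis_path):
        raise FileNotFoundError("first-pass analysis file not found")
    # Step S334: read the analysis file and encode each frame using the 1pass information.
    with open(analysis_path) as f:
        stats = {entry["frame"]: entry["onePassPSNR"] for entry in json.load(f)}
    for index, frame in enumerate(video_frames):
        encode_frame_2pass(frame, one_pass_psnr=stats.get(index))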
The following describes an example of the scheme for correcting the quantization parameter in the second coding rate control stage by using fig. 6 and fig. 7, respectively.
It should be noted that the schemes in fig. 6 and fig. 7 described below may be combined to optimize the quantization parameter q, or only one of them may be used on its own; in the following embodiment, combining the schemes of fig. 6 and fig. 7 to optimize the quantization parameter q is taken as an example.
Fig. 6 schematically shows a flow chart of a method for video encoding according to a further embodiment of the present invention.
In the embodiment of the present invention, a manual classification of video frames based on the first path coding information (the YUV PSNR value) is obtained by analyzing a number of video sequences and scenes; for example, it is assumed that a YUV PSNR value of 0-60dB corresponds to a normal video scene, 61-89dB to a dark video scene, and 90-100dB to a video scene that is close to night. The specific flowchart is shown in fig. 6.
It should be noted that the above numerical values are only used for illustration, and can be adjusted accordingly in practical applications.
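To make this classification concrete, the following minimal Python sketch maps a first-pass YUV PSNR value to one of the three scene categories; the function name and return labels are illustrative only, and the thresholds are the example values quoted above.

def classify_scene(yuv_psnr):
    # Illustrative thresholds from the example above; real deployments would tune them.
    if yuv_psnr <= 60:
        return "normal scene"       # 0-60 dB
    if yuv_psnr <= 89:
        return "dark scene"         # 61-89 dB
    return "near-night scene"       # 90-100 dB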
In the embodiment of the invention, when rate-controlled coding is performed in the second path of coding, the first path coding information is first obtained, and the quantization parameter is then regulated when the rate control stage formally starts.
As shown in fig. 6, the method for video encoding according to the embodiment of the present invention may include the following steps. The method steps of the embodiment of the present invention may be executed by the terminal device, by the server, or by the terminal device and the server interactively; for example, they may be executed by the server 105 in fig. 1, but the present invention is not limited thereto.
In step S601, determining whether a current frame in the video sequence is an IDR (Instantaneous Decoding Refresh) frame; if the current frame is an IDR frame, jumping to step S612; if the current frame is not an IDR frame, step S602 is performed.
In the embodiment of the present invention, the frame types of the video frame may include an I frame, a P frame, and a B frame, where an IDR frame belongs to one of the I frames.
An I frame is an intra-coded frame: it is a self-contained frame carrying all of its information, can be decoded independently without referring to other images, and is always the first frame of a video sequence. Both I frames and IDR frames use intra prediction. In encoding and decoding, the first I frame is called an IDR frame in order to distinguish it from other I frames, which makes the encoding and decoding process easier to control.
An IDR frame causes an immediate refresh so that errors cannot propagate, and a new sequence is started from the IDR frame for encoding. Whereas an ordinary I frame does not provide random access by itself, this function is assumed by the IDR frame. An IDR frame causes the DPB (Decoded Picture Buffer) to be emptied, while an ordinary I frame does not. An IDR frame must be an I frame, but an I frame is not necessarily an IDR frame. There may be many I frames in a video sequence, and video frames following an I frame may use video frames located between I frames as motion references.
For IDR frames, no frame following an IDR frame may refer to the content of any frame preceding that IDR frame; in contrast, for a normal I frame, the following B and P frames may refer to frames preceding that I frame. In a randomly accessed video stream, the player can always start playing from an IDR frame, because no frame after it references an earlier frame. In a video without IDR frames, however, it is not possible to start playing from an arbitrary point, since the following frames always refer to the preceding frames.
The processing of an IDR frame is the same as that of an I frame: perform intra prediction and determine the intra prediction mode to be used; subtract the predicted value from the pixel value to obtain the residual; transform and quantize the residual; apply variable length coding and arithmetic coding; and reconstruct and filter the image so that it can serve as a reference frame for other frames.
It should be noted that, in the embodiments of the present invention, the IDR is used as an identifier for determining whether the current frame is a scene change frame, but the present invention is not limited thereto.
In the embodiment of the invention, it is first determined whether the frame type of the current frame is an IDR frame. If it is, the default operation of the encoder is entered and the quantization parameter q of the current frame is not adjusted; otherwise, the YUV PSNR value of the current frame from the first path of coding is examined next.
In step S602, it is determined whether the YUV PSNR value onePassPSNR in the first path coding information of the current frame is greater than 97 and less than or equal to 100. If 97 < onePassPSNR ≤ 100 for the current frame, the process jumps to step S611; otherwise, the process proceeds to step S603.

In step S603, it is determined whether the onePassPSNR of the current frame is greater than 90 and less than or equal to 97. If 90 < onePassPSNR ≤ 97, the process jumps to step S610; otherwise, the process proceeds to step S604.

In step S604, it is determined whether the onePassPSNR of the current frame is greater than 85 and less than or equal to 90. If 85 < onePassPSNR ≤ 90, the process jumps to step S609; otherwise, the process proceeds to step S605.

In step S605, it is determined whether the onePassPSNR of the current frame is greater than 70 and less than or equal to 85. If 70 < onePassPSNR ≤ 85, the process jumps to step S608; otherwise, the process proceeds to step S606.

In step S606, it is determined whether the onePassPSNR of the current frame is greater than 60 and less than or equal to 70. If 60 < onePassPSNR ≤ 70, the process jumps to step S607; otherwise, the process proceeds to step S612.
In step S607, for 60 < onePassPSNR ≤ 70, q' = q - qFactor1, and then the flow proceeds to step S612 to enter the other encoding stages.

In the embodiment of the invention, when onePassPSNR is in the interval (60, 70], q' = q - qFactor1.

Wherein q is the quantization parameter of the rate control stage of the two-path coding, and q' is the corrected quantization parameter of that rate control stage. It should be noted that, if the current video sequence adopts n-path coding, q is the quantization parameter of the rate control stage of the nth path of coding, and the corresponding onePassPSNR is replaced by (n-1)passPSNR, i.e., the YUV PSNR value of the (n-1)th path of coding.

In step S608, for 70 < onePassPSNR ≤ 85, q' = q - qFactor2, and then the flow proceeds to step S612 to enter the other encoding stages.

In the embodiment of the invention, when onePassPSNR is in the interval (70, 85], q' = q - qFactor2.

In step S609, for 85 < onePassPSNR ≤ 90, q' = q - qFactor3, and then the flow proceeds to step S612 to enter the other encoding stages.

In the embodiment of the invention, when onePassPSNR is in the interval (85, 90], q' = q - qFactor3.

In step S610, for 90 < onePassPSNR ≤ 97, q' = q - qFactor4, and then the flow proceeds to step S612 to enter the other encoding stages.

In the embodiment of the invention, when onePassPSNR is in the interval (90, 97], q' = q - qFactor4.

In step S611, for 97 < onePassPSNR ≤ 100, q' = q - qFactor5, and then the flow proceeds to step S612 to enter the other encoding stages.

In the embodiment of the invention, when onePassPSNR is in the interval (97, 100], q' = q - qFactor5.
It should be noted that, in practice, the onePassPSNR value is not an integer; in general encoding it is of double type with two digits after the decimal point. Therefore, although the interval division of the onePassPSNR value in the embodiment of the present invention uses integers such as 100, 97, 90, 85, 70, and 60 as end points, the division is only approximate.
In the embodiment of the invention, qFactor1 to qFactor5 are optimized empirical values obtained from statistical analysis; for example, the values of qFactor1 to qFactor5 may be 1.2, 1.8, 2.2, 3, and 4.5, respectively. However, the present invention is not limited thereto.
In step S612, the other encoding stages are entered.
In step S613, the process ends.
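The interval-based correction of steps S601 to S613 can be summarized in the following minimal Python sketch; the table of (lower bound, qFactor) pairs uses the example empirical values 1.2, 1.8, 2.2, 3, and 4.5 quoted above, and all names are illustrative rather than taken from an actual encoder.

# (lower bound of the PSNR band, qFactor) pairs from steps S602-S611.
PSNR_BANDS = [
    (97.0, 4.5),   # 97 < onePassPSNR <= 100 -> qFactor5
    (90.0, 3.0),   # 90 < onePassPSNR <= 97  -> qFactor4
    (85.0, 2.2),   # 85 < onePassPSNR <= 90  -> qFactor3
    (70.0, 1.8),   # 70 < onePassPSNR <= 85  -> qFactor2
    (60.0, 1.2),   # 60 < onePassPSNR <= 70  -> qFactor1
]

def adjust_q_by_psnr(q, one_pass_psnr, is_idr):
    # Step S601: IDR (scene switching) frames keep the encoder's default q.
    if is_idr:
        return q
    # Steps S602-S611: find the band containing onePassPSNR and subtract its qFactor.
    for lower_bound, q_factor in PSNR_BANDS:
        if one_pass_psnr > lower_bound:
            return q - q_factor
    # onePassPSNR <= 60 corresponds to a normal scene; q is left unchanged.
    return q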
Fig. 7 schematically shows a flowchart of a method for video encoding according to still another embodiment of the present invention.
As shown in fig. 7, the method for video encoding according to the embodiment of the present invention may further include the following steps. The method steps of the embodiment of the present invention may be executed by the terminal device, the server, or both the terminal device and the server, for example, the server 105 in fig. 1 may execute the method steps, but the present invention is not limited thereto.
In step S701, it is determined whether the YUV PSNR value onePassPSNR in the first path coding information of the current frame in the video sequence is greater than 97 and less than or equal to 100 and whether the current coding is the second path of coding; if 97 < onePassPSNR ≤ 100 for the current frame and it is the second path of coding, the process proceeds to step S702; if the current frame does not satisfy 97 < onePassPSNR ≤ 100, or it is not the second path of coding, the process goes to step S709 and the operation ends.

In the embodiment of the present invention, analysis of a number of video sequences shows that, when onePassPSNR lies in the interval (97, 100] and the current coding is not the first path of coding (two-path coding is taken as an example here), the quality fluctuation of such video frames can be further improved beyond the optimization described in fig. 6. Therefore, when such video frames are not scene switching frames (an IDR frame is taken as an example, but the present invention is not limited thereto), their quantization parameters can be further optimized by using the information of their reference frames (e.g., a forward reference frame and/or a backward reference frame).

In the embodiment of the present invention, the current frame is not limited to referencing only a few neighboring frames; in this type of encoder, a frame may reference, for example, up to 16 preceding and following frames, so the method shown in fig. 7 can achieve a certain effect when adjusting video frames of such scenes.

In the embodiment of the invention, when rate-controlled coding is performed in the second path of coding, the first path coding information is first obtained, and the quantization parameter is then regulated when the rate control stage of the second path of coding formally starts.

In the embodiment of the invention, it is first determined whether the onePassPSNR of the current frame falls within the interval (97, 100].
In step S702, it is determined whether the current frame is an inter-frame prediction frame; if the current frame is an inter-frame prediction frame, then step S703 is performed; if the current frame is not an inter-frame prediction frame, the process goes to step S708.
In the embodiment of the invention, if the onePassPSNR of the current frame falls within the interval (97, 100], it is then determined whether the current frame is inter-predicted; if the current frame is not inter-predicted, it is intra-predicted, and the q value of the current frame is kept unchanged.

It should be noted that keeping the q value of an intra-predicted current frame unchanged does not mean that the q value is still the initial preset quantization parameter of the video encoder; if the video sequence is processed by the methods described in fig. 6 and fig. 7, the quantization parameter q of the current frame has already been processed by the method of fig. 6, i.e., for 97 < onePassPSNR ≤ 100, q' = q - qFactor5.
In step S703, it is determined whether the current frame is a P frame; if the current frame is a P frame, go to step S704; if the current frame is not a P frame, go to step S706.
In the embodiment of the present invention, if the current frame is an inter-frame prediction frame, it is further determined whether the current frame is an inter-frame P frame or a B frame.
A P frame is a forward predictive coded frame. When encoding continuous moving images, consecutive images are divided into the three types P, B, and I. A P frame is predicted from the P frame or I frame preceding it, and only the information that differs from that preceding frame is coded, i.e., inter-frame compression that takes the characteristics of motion into account. The P-frame method compresses a frame according to its difference from the adjacent preceding I frame or P frame. Combining P-frame and I-frame compression achieves higher compression without obvious compression artifacts.

A P frame uses an I frame as its reference frame: the predicted value and motion vector of a point in the P frame are found, and the prediction difference and the motion vector are transmitted together. At the receiving end, the predicted value of that point is located in the I frame according to the motion vector and added to the difference to obtain the sample value of that point in the P frame, thereby reconstructing the complete P frame.
In step S704, determining whether the current frame has a forward reference frame; if the current frame has a forward reference frame, step S705 is performed; if the current frame does not have the forward reference frame, jumping to step S709 to end the operation.
In the embodiment of the invention, a reference frame is a frame that needs to be referred to during IPB coding. Understanding the relationship between reference frames and I, P, and B frames requires knowing the coding scheme of each frame type. An I frame is coded from blocks within the image itself and needs no reference frame. A P frame is coded with reference to a preceding I frame or P frame, so its reference frame is a forward reference. A B frame is coded with reference to preceding and/or following I or P frames: it may have both one preceding and one following reference frame, or only a forward reference frame, or only a backward reference frame (three options).
In step S705, if the current frame is a P frame, the quantization parameter q of the current frame may be adjusted by using the following formula:
ref0Qp’=ref0Qp+4,
q’=q-ref0Qp’*2,
in the embodiment of the invention, if the current frame is an inter-frame P frame, the ref0Qp value of refSlice0 is obtained; and if the current frame is an inter-frame B frame, the ref0Qp and ref1Qp values of refSlice0 and refSlice1 are obtained.
Wherein refSlice0 represents a forward reference frame of the current frame; ref0Qp represents a refSlice0 quantization parameter, i.e., qp value.
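A direct transcription of the step S705 formula into Python is given below as a sketch; ref0Qp is assumed to be already available from the forward reference frame refSlice0, and any clamping of the result to a valid quantization parameter range is not specified in the text above, so none is applied here.

def adjust_q_for_p_frame(q, ref0_qp):
    # Step S705: ref0Qp' = ref0Qp + 4, then q' = q - ref0Qp' * 2.
    ref0_qp_adjusted = ref0_qp + 4
    return q - ref0_qp_adjusted * 2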
In step S706, if the current frame is a B frame, it is determined whether the current frame has a forward reference frame and/or a backward reference frame; if it does, step S707 is executed; otherwise, the process jumps to step S709 and the operation ends.
In the embodiment of the present invention, if the current frame is an inter-frame predicted B frame and a forward reference frame and a backward reference frame exist in the current frame at the same time, ref0Qp and ref1Qp values of refSlice0 and refSlice1 are obtained.
Wherein, refSlice0 and refSlice1 are the forward reference frame and the backward reference frame of the current frame respectively; ref0Qp and ref1Qp represent quantization parameters, i.e., qp values, of refSlice0 and refSlice1, respectively.
In the embodiment of the invention, some video sequences are simpler, and B frames may not exist.
A B frame is a bidirectionally predictive interpolated coded frame, and the B-frame method is an inter-frame compression algorithm using bidirectional prediction. When a frame is compressed into a B frame, it is compressed according to the differences among the adjacent preceding frame, the current frame, and the following frame; that is, only the differences between the current frame and its preceding and following frames are recorded. B-frame compression can achieve a high compression ratio. A B frame uses the preceding I frame or P frame and the following P frame as reference frames: the predicted value and two motion vectors of a point in the B frame are found, and the prediction difference and the motion vectors are transmitted. At the receiving end, the predicted values are located in the two reference frames according to the motion vectors and combined with the difference to reconstruct the B frame.
In step S707, in the embodiment of the present invention, if the current frame is an inter-frame predicted B frame and the current frame has a forward reference frame and a backward reference frame, the quantization parameter q of the current frame may be modified by using the following formula:
ref0Qp’=ref0Qp+4
ref1Qp’=ref1Qp+4
combine_ref=(ref0Qp’+ref1Qp’)*2
q’=q-combine_ref,
wherein combine_ref represents the adjustment amount applied to the quantization parameter of the current frame when the current frame is a B frame and both forward and backward reference frames exist.
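The step S707 correction can likewise be sketched in Python as follows; both reference quantization parameters are assumed to be available, in line with the condition of step S706, and the names are illustrative.

def adjust_q_for_b_frame(q, ref0_qp, ref1_qp):
    # Step S707: shift both reference Qp values by 4, combine, and subtract from q.
    ref0_qp_adjusted = ref0_qp + 4     # ref0Qp' = ref0Qp + 4
    ref1_qp_adjusted = ref1_qp + 4     # ref1Qp' = ref1Qp + 4
    combine_ref = (ref0_qp_adjusted + ref1_qp_adjusted) * 2
    return q - combine_ref             # q' = q - combine_ref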
It should be noted that, in the above steps, a forward reference frame and/or a backward reference frame of a current frame are taken as an example for illustration, but in other embodiments, multiple forward reference frames and/or multiple backward reference frames may exist in the current frame at the same time, and at this time, a quantization parameter of the current frame may be optimized according to quantization parameters of multiple forward reference frames and/or multiple backward reference frames of the current frame, which is not limited in this disclosure.
In step S708, if the current frame is an intra-frame prediction frame, q is not changed.
In step S709, the process ends.
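Putting the branches of fig. 7 together, the following illustrative dispatcher reuses the two helper functions sketched above; the frame-type strings and the argument handling are assumptions made for the example, not part of the patent.

def refine_q_with_references(q, one_pass_psnr, frame_type, ref0_qp=None, ref1_qp=None):
    # Step S701: only frames with 97 < onePassPSNR <= 100 in the second pass are refined here.
    if not (97.0 < one_pass_psnr <= 100.0):
        return q
    # Steps S703-S705: inter-frame P frame with a forward reference frame.
    if frame_type == "P" and ref0_qp is not None:
        return adjust_q_for_p_frame(q, ref0_qp)
    # Steps S706-S707: inter-frame B frame with both forward and backward reference frames.
    if frame_type == "B" and ref0_qp is not None and ref1_qp is not None:
        return adjust_q_for_b_frame(q, ref0_qp, ref1_qp)
    # Steps S708/S709: intra-predicted frame or missing references; q stays unchanged.
    return q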
In the embodiment of the invention, the quality fluctuation of a video sequence can be measured by SSIM: the smaller the standard deviation of the SSIM and the larger the average value of the SSIM, the smaller the quality fluctuation of the corresponding video sequence is considered to be.
SSIM (structural similarity) is a full-reference image quality evaluation index that measures image similarity in terms of luminance, contrast, and structure. The SSIM value ranges over [0, 1]; the larger the value, the smaller the image distortion.
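As a simple illustration of this measurement, the following Python sketch aggregates per-frame SSIM values into the mean and standard deviation used above; computing the SSIM values themselves is assumed to be done by an external quality tool.

import statistics

def ssim_fluctuation(ssim_values):
    # Aggregate per-frame SSIM values: a smaller standard deviation together with a
    # larger mean is read as smaller quality fluctuation, as described above.
    return {
        "mean": statistics.mean(ssim_values),
        "stdev": statistics.pstdev(ssim_values),
    }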
In practical tests, taking test sequence 1 and test sequence 2 as examples, the video sequences were optimized using the method described above; the specific effects are shown in the following table:
(Table of SSIM test results for test sequence 1 and test sequence 2 not reproduced here; it appears in the original document as the image Figure BDA0001719878120000241.)
According to the test results in the table, after the video sequence is sliced and optimized by the method provided by the embodiment of the present invention, the SSIM standard deviation at the positions corresponding to the low-brightness video frames is smaller than that at the corresponding positions of the original, unsliced video sequence; meanwhile, the SSIM average value at the video frame positions corresponding to locally darker scenes is larger than that of the original, unsliced video sequence. Therefore, the method for video coding provided by the embodiment of the invention is feasible, and can ensure video quality while accelerating video encoding and decoding; it optimizes video-on-demand and live broadcast platforms and brings users a better visual experience.
Fig. 8 schematically shows a block diagram of an apparatus for video encoding according to an embodiment of the present invention.
The apparatus 800 for video coding according to the embodiment of the present invention may adopt n-way coding for a video sequence. Wherein n is a positive integer greater than or equal to 2.
As shown in fig. 8, an apparatus 800 for video encoding according to an embodiment of the present invention may include: a first encoding module 810, a quantization parameter obtaining module 820 and a second encoding module 830.
The first encoding module 810 may be configured to perform (n-1) th encoding on the video sequence to generate an (n-1) th analysis file, and obtain (n-1) th encoding information of video frames in the video sequence.
In an exemplary embodiment, the (n-1) th way encoding information includes (n-1) th way luminance component and chrominance component peak signal-to-noise ratios of the video frame.
The quantization parameter obtaining module 820 may be configured to obtain a quantization parameter of the nth coding rate control stage of the video frame.
The second encoding module 830 may be configured to perform an nth encoding on the video sequence based on the (n-1) th analysis file, and modify a quantization parameter of an nth encoding rate control stage of the video frame according to the (n-1) th encoding information of the video frame.
The specific implementation of each module in the apparatus for video coding according to the embodiment of the present invention may refer to the content in the method for video coding, and is not described herein again.
Fig. 9 schematically shows a block diagram of an apparatus for video encoding according to still another embodiment of the present invention.
As shown in fig. 9, an apparatus 900 for video encoding according to an embodiment of the present invention may include: a first encoding module 910, a quantization parameter obtaining module 920 and a second encoding module 930.
The first encoding module 910 may be configured to perform (n-1) th encoding on the video sequence to generate an (n-1) th analysis file, and obtain (n-1) th encoding information of video frames in the video sequence.
In an exemplary embodiment, the (n-1) th encoding information includes peak signal-to-noise ratios of (n-1) th luminance and chrominance components of the video frame.
The quantization parameter obtaining module 920 may be configured to obtain a quantization parameter of the nth coding rate control stage of the video frame.
The first encoding module 910 and the quantization parameter obtaining module 920 may refer to the first encoding module 810 and the quantization parameter obtaining module 820 in the embodiment shown in fig. 8.
The second encoding module 930 may be configured to perform n-th encoding on the video sequence based on the (n-1) -th analysis file, and modify the quantization parameter of the n-th encoding rate control stage of the video frame according to the (n-1) -th encoding information of the video frame.
In an exemplary embodiment, the second encoding module 930 may further include: an nth path encoding unit (not shown). Wherein the nth-way encoding unit may be configured to perform nth-way encoding on the video sequence based on the (n-1) th-way analysis file.
In the embodiment shown in fig. 9, the second encoding module 930 may further include a first quantization parameter modification unit 931. The first quantization parameter modification unit 931 may be configured to modify the quantization parameter of the current frame according to the (n-1) th peak snr of the luminance component and the chrominance component of the current frame if the (n-1) th peak snr of the luminance component and the chrominance component of the current frame in the video frame is greater than the first threshold and the current frame is not a scene switching frame.
With continued reference to fig. 9, the first quantization parameter modification unit 931 may include: the first quantization parameter modification subunit 9311. The first quantization parameter modification subunit 9311 may be configured to modify the quantization parameter of the current frame by using a corresponding adjustment factor if the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame in the video frame is greater than a first threshold, the current frame is not a scene switching frame, and the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame is in any one of the 1 st interval to the m1 st interval; wherein m1 is a positive integer of 1 or more.
With continued reference to fig. 9, the second encoding module 930 may further include: a second quantization parameter modification unit 932 and/or a third quantization parameter modification unit 933.
The second quantization parameter modification unit 932 may be configured to obtain a quantization parameter of a forward reference frame of a current frame if a peak signal-to-noise ratio of an (n-1) th luminance component and a peak signal-to-noise ratio of a chrominance component of the current frame in the video frame are greater than a second threshold and in an m2 th interval, and the current frame is a first inter-frame prediction frame; correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame; and/or
The third quantization parameter modification unit 933 may be configured to obtain a quantization parameter of a forward reference frame of a current frame and/or a quantization parameter of a backward reference frame of the current frame if a (n-1) th peak snr of a luminance component and a chrominance component of the current frame in the video frame is greater than the second threshold and is in an m2 th interval, and the current frame is a second inter-frame prediction frame; correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame of the current frame and/or the quantization parameter of the backward reference frame of the current frame; wherein m2 is a positive integer of 1 or more.
In an exemplary embodiment, the second threshold is greater than the first threshold.
For specific implementation of each module and/or unit and/or sub-unit in the apparatus for video encoding provided by the embodiment of the present invention, reference may be made to the contents in the method for video encoding, which are not described herein again.
It should be noted that although in the above detailed description several modules or units or sub-units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units or sub-units described above may be embodied in one module or unit or sub-unit according to an embodiment of the invention. Conversely, the features and functions of one module or unit or sub-unit described above may be further divided into a plurality of modules or units or sub-units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiment of the present invention.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes can be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (13)

1. A method for video coding, characterized in that a video sequence is coded using n-way coding; wherein the method comprises the following steps:
performing (n-1) th path coding on the video sequence to generate an (n-1) th path analysis file, and obtaining (n-1) th path coding information of a video frame in the video sequence, wherein the (n-1) th path coding information comprises (n-1) th path brightness component and chrominance component peak signal-to-noise ratio of the video frame;
obtaining quantization parameters of the nth path coding rate control stage of the video frame;
performing nth coding on the video sequence based on the (n-1) th path analysis file, and correcting a quantization parameter of an nth path coding rate control stage of the video frame according to the (n-1) th path coding information of the video frame;
wherein n is a positive integer greater than or equal to 2;
wherein, the correcting the quantization parameter of the nth path coding rate control stage of the video frame according to the (n-1) th path coding information of the video frame comprises:
and if the peak signal-to-noise ratios of the (n-1) th luminance component and the chrominance component of the current frame in the video frame are greater than a first threshold and the current frame is not a scene switching frame, correcting the quantization parameter of the current frame according to the peak signal-to-noise ratios of the (n-1) th luminance component and the chrominance component of the current frame.
2. The method according to claim 1, wherein said modifying the quantization parameter of the current frame according to the (n-1) th peak snr of the luma component and the chroma components of the current frame comprises:
if the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame is in any one of the 1 st interval to the m1 st interval, correcting the quantization parameter of the current frame by adopting a corresponding regulation factor;
wherein m1 is a positive integer of 1 or more.
3. The method of claim 2, wherein m1 is equal to 5; wherein, if the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame is in any one of the 1 st interval to the m1 st interval, the modifying the quantization parameter of the current frame by using the corresponding regulation factor includes:
if the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame is in a 1 st interval, correcting the quantization parameter of the current frame by adopting a first regulation factor; or
If the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame is in the 2 nd interval, correcting the quantization parameter of the current frame by adopting a second regulation factor; or
If the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame is in a 3 rd interval, correcting the quantization parameter of the current frame by adopting a third regulating and controlling factor; or
If the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame is in a 4 th interval, correcting the quantization parameter of the current frame by adopting a fourth regulation factor; or
And if the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame is in the 5 th interval, correcting the quantization parameter of the current frame by adopting a fifth regulation factor.
4. The method of claim 3, wherein if the video sequence is a segmented video sequence, the first through fifth adjustment factors are: 1.2, 1.8, 2.2, 3, 4.5.
5. The method according to any of claims 1 to 4, wherein the modifying the quantization parameter of the n-th encoding rate control stage of the video frame according to the (n-1) -th encoding information of the video frame further comprises:
if the peak signal-to-noise ratio of the (n-1) th path of luminance component and chrominance component of the current frame in the video frame is larger than a second threshold value and is in the m2 th interval, and the current frame is a first inter-frame prediction frame, acquiring the quantization parameter of the forward reference frame of the current frame;
correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame;
wherein m2 is a positive integer of 1 or more.
6. The method of claim 5, wherein the quantization parameter of the first inter-predicted frame is modified using the following equation:
q’=q-(ref0Qp+a)*b,
wherein ref0Qp is a quantization parameter of the forward reference frame; a and b are constants; q is a quantization parameter of the first inter-frame prediction frame; q' is the quantization parameter after the first inter prediction frame is modified.
7. The method according to any of claims 1 to 4, wherein the modifying the quantization parameter of the n-th encoding rate control stage of the video frame according to the (n-1) -th encoding information of the video frame further comprises:
if the peak signal-to-noise ratio of the (n-1) th path of luminance component and chrominance component of the current frame in the video frame is larger than a second threshold value and is in an m2 th interval, and the current frame is a second inter-frame prediction frame, acquiring the quantization parameter of a forward reference frame of the current frame and/or the quantization parameter of a backward reference frame of the current frame;
correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame of the current frame and/or the quantization parameter of the backward reference frame of the current frame;
wherein m2 is a positive integer of 1 or more.
8. The method of claim 7, wherein the quantization parameter of the second inter-predicted frame is modified using the following equation:
q’=q-((ref0Qp+c1)+(ref1Qp+c2))*d,
wherein ref0Qp is a quantization parameter of the forward reference frame; ref1Qp is a quantization parameter of the backward reference frame; c1, c2 and d are constants; q is a quantization parameter of the second inter-frame prediction frame; q' is the quantization parameter of the second inter-frame prediction frame after modification.
9. An apparatus for video coding, characterized in that a video sequence employs n-way coding; wherein the apparatus comprises:
the first coding module is configured to perform (n-1) th path coding on the video sequence to generate an (n-1) th path analysis file, and obtain (n-1) th path coding information of a video frame in the video sequence, wherein the (n-1) th path coding information comprises an (n-1) th path luminance component and a chrominance component peak signal-to-noise ratio of the video frame;
a quantization parameter obtaining module configured to obtain a quantization parameter of the nth path coding rate control stage of the video frame;
the second coding module is configured to perform nth coding on the video sequence based on the (n-1) th path analysis file, and modify the quantization parameter of the nth path coding rate control stage of the video frame according to the (n-1) th path coding information of the video frame;
wherein n is a positive integer greater than or equal to 2;
wherein the second encoding module comprises:
and the first quantization parameter correction unit is configured to correct the quantization parameter of the current frame according to the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame if the (n-1) th peak signal-to-noise ratio of the luminance component and the chrominance component of the current frame in the video frame is greater than a first threshold and the current frame is not a scene switching frame.
10. The apparatus of claim 9, wherein the first quantization parameter modification unit comprises:
a first quantization parameter modification subunit, configured to modify a quantization parameter of a current frame by using a corresponding adjustment factor if a peak signal-to-noise ratio of an (n-1) th luminance component and a chrominance component of the current frame in the video frame is greater than a first threshold, the current frame is not a scene switching frame, and the peak signal-to-noise ratio of the (n-1) th luminance component and the chrominance component of the current frame is in any one of a 1 st interval to an m1 st interval;
wherein m1 is a positive integer of 1 or more.
11. The apparatus of claim 9 or 10, wherein the second encoding module further comprises:
a second quantization parameter modification unit, configured to obtain a quantization parameter of a forward reference frame of a current frame if a (n-1) th peak snr of a luminance component and a chrominance component of the current frame in the video frame is greater than a second threshold and is in an m2 th interval, and the current frame is a first inter-frame prediction frame; correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame; and/or
A third quantization parameter modification unit, configured to obtain a quantization parameter of a forward reference frame of a current frame and/or a quantization parameter of a backward reference frame of the current frame if an (n-1) th peak snr of a luminance component and a chrominance component of the current frame in the video frame is greater than the second threshold and is in an m2 th interval, and the current frame is a second inter-frame prediction frame; correcting the quantization parameter of the current frame according to the quantization parameter of the forward reference frame of the current frame and/or the quantization parameter of the backward reference frame of the current frame;
wherein m2 is a positive integer of 1 or more.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method for video encoding according to any one of claims 1 to 8.
13. An electronic device, comprising:
one or more processors;
a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method for video encoding as claimed in any of claims 1 to 8.
CN201810726552.9A 2018-07-04 2018-07-04 Method, apparatus, computer readable medium and electronic device for video encoding Active CN108810545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810726552.9A CN108810545B (en) 2018-07-04 2018-07-04 Method, apparatus, computer readable medium and electronic device for video encoding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810726552.9A CN108810545B (en) 2018-07-04 2018-07-04 Method, apparatus, computer readable medium and electronic device for video encoding

Publications (2)

Publication Number Publication Date
CN108810545A CN108810545A (en) 2018-11-13
CN108810545B true CN108810545B (en) 2023-04-18

Family

ID=64074584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810726552.9A Active CN108810545B (en) 2018-07-04 2018-07-04 Method, apparatus, computer readable medium and electronic device for video encoding

Country Status (1)

Country Link
CN (1) CN108810545B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109640114A (en) * 2018-12-12 2019-04-16 深圳市网心科技有限公司 Video compiles transcoding compression method, device, equipment and computer readable storage medium
CN110536134B (en) * 2019-09-27 2022-11-04 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding apparatus, video decoding apparatus, storage medium, and electronic apparatus
CN111093095A (en) * 2019-12-16 2020-05-01 黔南民族师范学院 Video transcoding method and system based on spark platform
CN112468764B (en) * 2021-01-28 2021-05-04 浙江华创视讯科技有限公司 Method, system, server and storage medium for streaming media adaptive transmission
CN112953944A (en) * 2021-02-23 2021-06-11 北京华宇信息技术有限公司 Audio-video transcoding method based on MapReduce
CN114051139B (en) * 2021-11-09 2024-02-02 京东科技信息技术有限公司 Video coding method and device
CN113784123B (en) * 2021-11-11 2022-03-15 腾讯科技(深圳)有限公司 Video encoding method and apparatus, storage medium, and electronic device
CN116055738B (en) * 2022-05-30 2023-10-20 荣耀终端有限公司 Video compression method and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101547349A (en) * 2009-04-27 2009-09-30 宁波大学 Method for controlling code rate of secondary AVS encoding of video signal
CN102067610A (en) * 2008-06-16 2011-05-18 杜比实验室特许公司 Rate control model adaptation based on slice dependencies for video coding
CN102714725A (en) * 2010-01-06 2012-10-03 杜比实验室特许公司 High performance rate control for multi-layered video coding applications
CN104159109A (en) * 2014-05-28 2014-11-19 百视通网络电视技术发展有限责任公司 Bit rate control method and system based on VBR video encoding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI254879B (en) * 2004-12-17 2006-05-11 Quanta Comp Inc System and method for video encoding

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102067610A (en) * 2008-06-16 2011-05-18 杜比实验室特许公司 Rate control model adaptation based on slice dependencies for video coding
CN101547349A (en) * 2009-04-27 2009-09-30 宁波大学 Method for controlling code rate of secondary AVS encoding of video signal
CN102714725A (en) * 2010-01-06 2012-10-03 杜比实验室特许公司 High performance rate control for multi-layered video coding applications
CN104159109A (en) * 2014-05-28 2014-11-19 百视通网络电视技术发展有限责任公司 Bit rate control method and system based on VBR video encoding

Also Published As

Publication number Publication date
CN108810545A (en) 2018-11-13

Similar Documents

Publication Publication Date Title
CN108810545B (en) Method, apparatus, computer readable medium and electronic device for video encoding
US10728564B2 (en) Systems and methods of encoding multiple video streams for adaptive bitrate streaming
US9071841B2 (en) Video transcoding with dynamically modifiable spatial resolution
US8270473B2 (en) Motion based dynamic resolution multiple bit rate video encoding
WO2021244341A1 (en) Picture coding method and apparatus, electronic device and computer readable storage medium
US10390039B2 (en) Motion estimation for screen remoting scenarios
US9350990B2 (en) Systems and methods of encoding multiple video streams with adaptive quantization for adaptive bitrate streaming
US9143776B2 (en) No-reference video/image quality measurement with compressed domain features
CA2688249C (en) A buffer-based rate control exploiting frame complexity, buffer level and position of intra frames in video coding
US8396114B2 (en) Multiple bit rate video encoding using variable bit rate and dynamic resolution for adaptive video streaming
US20070199011A1 (en) System and method for high quality AVC encoding
US11743475B2 (en) Advanced video coding method, system, apparatus, and storage medium
US20150288965A1 (en) Adaptive quantization for video rate control
US20150312575A1 (en) Advanced video coding method, system, apparatus, and storage medium
US20180184089A1 (en) Target bit allocation for video coding
KR20090046812A (en) Video encoding
Zhao et al. An improved R-λ rate control model based on joint spatial-temporal domain information and HVS characteristics
US20160316220A1 (en) Video encoder management strategies
CN100486335C (en) Bit rate automatic regulating device
WO2016193949A1 (en) Advanced video coding method, system, apparatus and storage medium
Maksimović et al. The impact of successive B frames on TV signal using different compression techniques and video resolution
Overmeire et al. Constant quality video coding using video content analysis
CN116527928A (en) Image encoding method and device, electronic device, storage medium, and program product
Brown A bitrate control algorithm for the Berkeley MPEG-1 video encoder
CN114745590A (en) Video frame encoding method, video frame encoding device, electronic device, and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant