WO2018228130A1 - Video encoding method, apparatus, device and storage medium - Google Patents

Video encoding method, apparatus, device and storage medium

Info

Publication number
WO2018228130A1
WO2018228130A1 · PCT/CN2018/087539 · CN2018087539W
Authority
WO
WIPO (PCT)
Prior art keywords
image frame
encoded
frame
image
motion amplitude
Prior art date
Application number
PCT/CN2018/087539
Other languages
English (en)
French (fr)
Inventor
郭耀耀
毛煦楠
谷沉沉
高欣玮
张涛
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to EP18816657.3A (EP3641310A4)
Priority to KR1020197024476A (KR102225235B1)
Priority to JP2019543009A (JP6925587B2)
Publication of WO2018228130A1
Priority to US16/401,671 (US10893275B2)
Priority to US17/100,207 (US11297328B2)

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/114 Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/142 Detection of scene cut or scene change
    • H04N19/172 Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being a block, e.g. a macroblock
    • H04N19/59 Predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/61 Transform coding in combination with predictive coding
    • H04N7/00 Television systems
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone

Definitions

  • the present application relates to the field of multimedia technologies, and in particular, to a video encoding method, apparatus, device, and storage medium.
  • a sequence of image frames of a video includes a plurality of image frames. If transmitted directly, the amount of data is very large; therefore, to reduce the amount of data, the video must be encoded before transmission.
  • the H.264 standard is a commonly used video coding standard.
  • the H.264 standard includes I frames and P frames.
  • the I frame is a frame obtained by completely encoding the current image frame.
  • the P frame is obtained by encoding the difference between the current image frame and the previous image frame.
  • the encoded I frames and P frames constitute a GOP (Group of Pictures); one GOP starts with an I frame and ends with the P frame before the next I frame.
  • when the video scene switches, the difference between the current image frame and the previous image frame is relatively large. If the current image frame is encoded as a P frame, the encoding quality is degraded too much, so the current image frame can be encoded as an I frame.
  • when encoding, the residual between the current image frame and the previous image frame may be obtained; the residual represents the difference between the two frames. It is determined whether the residual is greater than a preset threshold, and when it is, it may be determined that the video scene has switched, and the current image frame is encoded as an I frame.
  • the inventors have found that the related art has at least the following drawback: the above method may, in some cases, result in too many consecutive P frames being encoded, degrading coding quality.
  • the embodiment of the present application provides a video encoding method, apparatus, device, and storage medium, which can solve the problem of degraded encoding quality.
  • the technical solution is as follows:
  • In a first aspect, a video encoding method for use in a video encoding device, the method comprising:
  • determining N image frames on a sequence of image frames of the video according to a sliding window, the image frames in the sliding window comprising N-1 encoded image frames and an image frame to be encoded at the end of the window;
  • acquiring a motion amplitude difference of each image frame in the sliding window, where the motion amplitude difference of an image frame is the difference between the motion amplitude of that image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame is the ratio between the inter prediction cost and the intra prediction cost of that frame;
  • updating a static variable according to the motion amplitude differences of the image frames in the sliding window, the static variable representing the determined number of consecutive still image frames; and
  • when the updated static variable is not less than a first preset threshold, encoding the image frame to be encoded as an I frame.
  • In a second aspect, a video encoding apparatus comprising:
  • a determining module configured to determine N image frames on a sequence of image frames of the video according to the sliding window, wherein the image frame in the sliding window comprises N-1 encoded image frames and an image frame to be encoded at the end of the window;
  • a first obtaining module configured to acquire a motion amplitude difference of each image frame in the sliding window, where the motion amplitude difference of an image frame is the difference between the motion amplitude of that image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame is the ratio between the inter prediction cost and the intra prediction cost of that frame;
  • a second acquiring module configured to update a static variable according to a difference in motion amplitude of each image frame in the sliding window, where the static variable is used to represent the determined number of consecutive still image frames;
  • an encoding module configured to encode the image frame to be encoded into an I frame when the updated static variable is not less than the first preset threshold.
  • In a third aspect, a video encoding apparatus comprising a processor and a memory, the memory storing at least one instruction that is loaded by the processor and executed to implement the video encoding method of the first aspect.
  • In a fourth aspect, a computer readable storage medium having stored therein at least one instruction that is loaded by a processor and executed to implement the video encoding method of the first aspect.
  • the method, apparatus, device, and storage medium provided by the embodiments of the present application determine N image frames according to the sliding window and update the static variable according to the motion amplitude difference of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, it is determined that the video within the sliding window has been in a still scene for a long time, and an I frame is encoded.
  • the present application thus provides a way to determine whether a video has been in a still scene for a long time, and encodes an I frame when it has, thereby avoiding the insertion of too many P frames and improving both coding quality and coding efficiency.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of another implementation environment provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a video encoding method according to an embodiment of the present application.
  • 4A is a schematic diagram of a sliding window and an image frame provided by an embodiment of the present application.
  • 4B is a schematic diagram of a mobile sliding window provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a video encoding method according to an embodiment of the present application.
  • FIG. 6 is a flowchart of a video encoding method according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
  • the embodiment of the present application provides a method for encoding an I frame when the video has been in a still scene for a long time, which can improve the encoding quality.
  • a scheme for encoding an I frame when a scene is switched in a video is provided, which can improve encoding quality while avoiding coding too many I frames.
  • the embodiment of the present application provides an implementation environment, where the first device 101 and the second device 102 are included in the implementation environment, and the first device 101 and the second device 102 are connected through a network.
  • when the first device 101 transmits the video to the second device 102, the video needs to be encoded first; the encoded video is sent to the second device 102, and the second device 102 decodes it to obtain the original video.
  • the embodiment of the present application can be applied to a scenario in which a video is played online.
  • in this scenario, the first device 101 is a video server for providing video, and the second device 102 is a terminal for playing video. After the video server obtains the video, it can transcode it, encoding I frames using the encoding process provided by the embodiment of the present application; the encoded video is then sent to the terminal, which plays it, so that the user can watch the video on the terminal.
  • the embodiment of the present application can also be applied to a video call scenario.
  • the first device 101 and the second device 102 are two terminals for performing a video call, and the first device 101 and the second device 102 pass through the server 103. connection.
  • the first device 101 obtains an image frame, encodes it, and sends it to the server 103 for forwarding to the second device 102, which decodes and plays the image frame.
  • in this way, the first device 101 can acquire a plurality of image frames, and the second device 102 can continuously play them to achieve the effect of playing video; the user can watch the video on the second device 102.
  • FIG. 3 is a flowchart of a video encoding method according to an embodiment of the present application.
  • the embodiment of the present application describes a process of encoding an I frame for an image frame in a still scene. The execution subject is a video encoding device, which can be a device with a video transmission function such as a terminal or a server. Referring to Figure 3, the method includes:
  • Step 301: the video encoding device determines N image frames on the sequence of image frames of the video according to the sliding window.
  • the length of the sliding window is equal to N, and N is a positive integer greater than 1.
  • N can be determined according to the frame rate of the video, for example, can be two-thirds of the frame rate.
  • the sliding window can be used to determine N image frames each time, and the N image frames include N-1 encoded image frames and an image frame to be encoded at the end of the sliding window.
  • the image frame to be encoded can be encoded according to the image frame in the sliding window.
  • an image frame to be encoded is taken as an example.
  • the image frame to be encoded may be the Nth image frame of the video, or may be any image frame after the Nth image frame.
  • the step size of each movement of the sliding window can be set to 1, that is, the sliding window is moved by 1 frame each time, and the Nth image frame and each image frame after it are encoded using the coding method provided by the embodiment of the present application.
  • as for the image frames before the Nth image frame other than the first, since their distance to the I frame is relatively short, encoding them as P frames does not cause excessive degradation of encoding quality, so they can be encoded as P frames by default.
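As a minimal sketch of the sliding-window selection described above (the function name and 0-based indexing are illustrative, not from the application):

```python
def window_frames(frames, n, end):
    """Return the N frames of the sliding window whose last slot is index `end`.

    The window holds N-1 already-encoded frames followed by the frame to be
    encoded at 0-based position `end`; the window step size is 1 frame.
    """
    assert end >= n - 1, "need N-1 encoded frames before the frame to encode"
    return frames[end - n + 1 : end + 1]

# N = 5: encoding the 7th frame (index 6) looks at frames 2..6
print(window_frames(list(range(10)), 5, 6))  # -> [2, 3, 4, 5, 6]
```

After each frame is encoded, the caller advances `end` by one, which is the step size of 1 described above.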
  • Step 302: the video encoding device acquires the motion amplitude difference of each image frame in the sliding window.
  • the motion amplitude of an image frame is the ratio between the inter prediction cost and the intra prediction cost of that frame, and represents how much the frame changes compared with the previous image frame: the larger the change, the larger the motion amplitude, and the smaller the change, the smaller the motion amplitude.
  • the motion amplitude difference of an image frame is the difference between its motion amplitude and the motion amplitude of the previous image frame, and indicates how much the variation fluctuates from one frame to the next: a large difference indicates a drastic fluctuation, while a small difference indicates a gentle one.
  • the motion amplitude difference thus reflects the difference between the pictures displayed by two image frames: a large difference in motion amplitude indicates a large picture difference, and a small difference indicates a small one.
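The two quantities above can be sketched directly from their definitions (numeric cost values here are made up for illustration):

```python
def motion_amplitude(p_cost, i_cost):
    """Motion amplitude of a frame: ratio of its inter prediction cost to
    its intra prediction cost (a larger ratio means a larger change)."""
    return p_cost / i_cost

def amplitude_difference(mu_curr, mu_prev):
    """Motion amplitude difference between a frame and its predecessor."""
    return mu_curr - mu_prev

mu_prev = motion_amplitude(110.0, 200.0)   # 0.55
mu_curr = motion_amplitude(120.0, 200.0)   # 0.60
print(round(amplitude_difference(mu_curr, mu_prev), 3))  # -> 0.05
```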
  • the intra prediction cost is denoted Icost, and the inter prediction cost is denoted Pcost.
  • the intra prediction cost Icost of an image frame may be obtained by downsampling the image frame, dividing the sampled image frame into a plurality of macroblocks of a specified size, calculating, for each macroblock, prediction blocks in multiple directions, and computing the SATD of the residual between the macroblock and each prediction block to obtain the optimal intra prediction cost Icost.
  • the sampling amplitude of the downsampling may be determined according to requirements; for example, the length of the sampled image frame may be one half of the length of the original image frame, and the width one half of the original width. The specified macroblock size can also be determined based on prediction requirements, for example 8*8.
  • SATD refers to the summation of the absolute values of the residual after it is transformed by a Hadamard transform.
  • the inter prediction cost Pcost of an image frame may be obtained by downsampling the image frame and dividing it into a plurality of macroblocks of a specified size; for each macroblock, the optimal reference block is found using integer-pixel diamond search, and the SATD of the residual between the macroblock and the reference block is calculated, thereby obtaining the optimal inter prediction cost Pcost.
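The SATD used by both cost computations can be sketched as follows; the 4x4 Hadamard size and all function names are illustrative choices, not prescribed by the application:

```python
def hadamard4():
    """Build the 4x4 Hadamard matrix as the Kronecker product H2 (x) H2."""
    h2 = [[1, 1], [1, -1]]
    return [[h2[i // 2][j // 2] * h2[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def mat_mul(a, b):
    """Plain matrix multiplication for small integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def satd(residual):
    """SATD: sum of absolute values of the Hadamard-transformed residual."""
    h = hadamard4()
    ht = [list(row) for row in zip(*h)]
    t = mat_mul(mat_mul(h, residual), ht)
    return sum(abs(v) for row in t for v in row)

# A single nonzero residual pixel spreads its energy over all coefficients
block = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(satd(block))  # -> 16
```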
  • the motion amplitude difference of each image frame may be calculated before encoding: when the current image frame is encoded, its motion amplitude difference is calculated, and because the preceding image frames already calculated their motion amplitude differences when they were encoded, there is no need to recalculate them. This direct acquisition can greatly reduce the amount of computation in the encoding process.
  • Step 303: the video encoding device updates the static variable according to the motion amplitude difference of each image frame in the sliding window.
  • a still image frame is an image frame in a still scene, and the static variable represents the determined number of consecutive still image frames; during encoding, the static variable gradually increases as more consecutive still image frames are determined.
  • whether the image frame to be encoded is a still image frame may be determined according to the motion amplitude difference of each image frame in the sliding window, and the static variable is updated according to this determination and its previously determined value, yielding the updated static variable.
  • the following process can be employed to determine a still image frame and update the static variable:
  • the first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is smaller than the second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is smaller than the third preset threshold. The video encoding device therefore determines whether both of these absolute values are below their respective thresholds.
  • if so, the image frame to be encoded is a still image frame, and the static variable is incremented by one to obtain the updated static variable. If not, the first preset condition is not satisfied, the image frame to be encoded is not a still image frame, the updated static variable is set to 0, and the counting of consecutive still image frames restarts.
  • the video encoding device can determine the static variable using the following formula:
  • f_n = f_{n-1} + 1, if |Δ_n| < μ_T and |Ω_n| < Ω_T; otherwise f_n = 0;
  • where f_n represents the static variable determined according to the image frame to be encoded, f_{n-1} represents the static variable determined according to the previous image frame of the image frame to be encoded, Δ_n = μ_n - μ_{n-1} represents the motion amplitude difference of the image frame to be encoded, μ_n represents the motion amplitude of the image frame to be encoded, μ_{n-1} represents the motion amplitude of the previous image frame of the image frame to be encoded, Ω_n represents the sum of the motion amplitude differences of all image frames in the sliding window, n is a positive integer greater than 1, μ_T represents the second preset threshold, and Ω_T represents the third preset threshold.
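The static-variable update described above can be sketched as a small function (parameter names are our own; the threshold values in the example are made up):

```python
def update_static_variable(f_prev, delta_n, window_delta_sum, mu_t, omega_t):
    """Update the still-frame counter for the frame to be encoded.

    f_prev: counter determined for the previous frame; delta_n: motion
    amplitude difference of the frame to be encoded; window_delta_sum: sum
    of the differences over the whole sliding window; mu_t / omega_t: the
    second and third preset thresholds.
    """
    if abs(delta_n) < mu_t and abs(window_delta_sum) < omega_t:
        return f_prev + 1   # frame judged still: extend the run
    return 0                # otherwise restart the count

print(update_static_variable(4, 0.01, 0.05, 0.1, 0.5))  # -> 5
print(update_static_variable(4, 0.50, 0.05, 0.1, 0.5))  # -> 0
```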
  • Step 304: the video encoding device determines whether the updated static variable is less than the first preset threshold. If yes, step 305 is performed; if no, step 306 is performed.
  • in the embodiment of the present application, the condition for encoding an I frame is that the number of consecutive still image frames reaches the first preset threshold. Under this condition, the video has been in a still scene for a long time and enough P frames have already been encoded for the still image frames; if P frames continue to be encoded, the encoding quality will degrade too much, so an I frame is encoded.
  • after the video encoding device obtains the updated static variable, it determines whether the updated static variable is smaller than the first preset threshold. If it is, the number of consecutive still image frames has not yet reached the upper limit, and P frames may continue to be encoded to improve coding efficiency. If it is not, the number of consecutive still image frames is too large, and an I frame needs to be encoded to preserve encoding quality.
  • the first preset threshold may be determined comprehensively from the requirements for encoding quality and coding efficiency; for example, the first preset threshold may be equal to N.
  • Step 305: the video encoding device encodes the image frame to be encoded into a P frame.
  • that is, the difference data between the image frame to be encoded and the previous image frame is obtained and encoded, and the encoded image frame is a P frame.
  • Step 306: the video encoding device encodes the image frame to be encoded into an I frame.
  • that is, encoding is performed directly on the data of the image frame to be encoded, and the encoded image frame is an I frame.
  • in another embodiment, the condition for encoding an I frame may include a second preset condition in addition to the condition that the static variable is not less than the first preset threshold. Accordingly, steps 304-306 may be replaced by the following: determine whether the updated static variable is smaller than the first preset threshold and whether the second preset condition is met; encode the image frame to be encoded as an I frame when the updated static variable is not less than the first preset threshold and the second preset condition is satisfied, and encode it as a P frame when the updated static variable is smaller than the first preset threshold or the second preset condition is not satisfied.
  • by setting the second preset condition, it can be ensured that when the motion amplitude of an image frame is small, indicating that the picture is relatively stable, encoding P frames does not cause excessive degradation of encoding quality, so there is no need to encode too many I frames; an I frame is encoded only when the PSNR drops by a large margin, which both ensures coding quality and improves coding efficiency as much as possible.
  • when the motion amplitude of an image frame is large, the picture of the image frame changes severely. In this case, to avoid excessive degradation of coding quality, the condition for encoding an I frame can be relaxed, and an I frame can be encoded even when the PSNR decreases by a small amount.
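The combined decision can be sketched as below. Modeling the second preset condition as a PSNR-drop test is an assumption based on the surrounding description, and every name and value here is illustrative:

```python
def choose_frame_type(static_var, first_threshold, psnr_drop, psnr_drop_limit):
    """Pick the frame type: an I frame requires BOTH a long-enough run of
    still frames (static_var >= first_threshold) AND the second preset
    condition, modeled here as the PSNR having dropped past a limit;
    otherwise a P frame is encoded."""
    if static_var >= first_threshold and psnr_drop > psnr_drop_limit:
        return "I"
    return "P"

print(choose_frame_type(30, 30, 3.0, 2.0))  # -> I
print(choose_frame_type(30, 30, 1.0, 2.0))  # -> P
```

Relaxing the condition for high-motion frames, as the text describes, amounts to lowering `psnr_drop_limit` when the motion amplitude is large.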
  • the thresholds used by different image frames may or may not be equal. After the nth image frame is encoded, the thresholds T_{1,n}, T_{2,n}, T_{3,n} and T_{4,n} are updated to obtain T_{1,n+1}, T_{2,n+1}, T_{3,n+1} and T_{4,n+1}, and the updated thresholds are used when the (n+1)th image frame is determined.
  • the update is as follows:
  • T_{1,n+1} = (1 - α_T) × T_{1,n} + α_T × (η_I - η_{n-1}), when 0 ≤ μ_n < μ_1;
  • T_{2,n+1} = (1 - α_T) × T_{2,n} + α_T × (η_I - η_{n-1}), when μ_1 ≤ μ_n < μ_2;
  • T_{3,n+1} = (1 - α_T) × T_{3,n} + α_T × (η_I - η_{n-1}), when μ_2 ≤ μ_n < μ_3;
  • T_{4,n+1} = (1 - α_T) × T_{4,n} + α_T × (η_I - η_{n-1}), when μ_3 ≤ μ_n < μ_4;
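A sketch of this update step follows. Since the symbols in the published translation are partly unreadable, the interpretation of `alpha` as a smoothing coefficient and of `eta_i` / `eta_prev` as PSNR-like quantities is an assumption; the band edges in the example are made up:

```python
def update_threshold(t_prev, alpha, eta_i, eta_prev):
    """One smoothing step of the threshold update:
    T_{k,n+1} = (1 - alpha) * T_{k,n} + alpha * (eta_i - eta_prev)."""
    return (1 - alpha) * t_prev + alpha * (eta_i - eta_prev)

def band_index(mu_n, band_edges):
    """Index k of the motion-amplitude band [mu_{k-1}, mu_k) holding mu_n,
    which selects the threshold T_k to update."""
    for k, upper in enumerate(band_edges, start=1):
        if mu_n < upper:
            return k
    return len(band_edges)

print(round(update_threshold(2.0, 0.1, 40.0, 38.5), 3))  # -> 1.95
print(band_index(0.7, [0.25, 0.5, 0.75, 1.0]))  # -> 3
```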
  • after encoding of the image frame to be encoded is completed, the next image frame to be encoded may be acquired, and the sliding window is moved backward by the distance of one image frame, so that the next image frame to be encoded is at the end of the sliding window. Steps 301-306 can then be repeated to encode the next image frame to be encoded, and so on.
  • for example, after the Nth image frame is encoded, the sliding window is moved backward by one image frame so that the (N+1)th image frame is located at the end of the sliding window, and the (N+1)th image frame is encoded at this time.
  • it should be noted that the video encoding device may further set a preset maximum length of the GOP in a configuration file, where the preset maximum length specifies the maximum distance between two consecutive I frames. When encoding the image frame to be encoded, the video encoding device not only uses the condition in step 304 to determine whether to encode an I frame, but also obtains the distance between the image frame to be encoded and the closest I frame before it, and determines whether that distance reaches the preset maximum length. When the distance reaches the preset maximum length, the image frame to be encoded is encoded as an I frame even if the condition in step 304 is not currently satisfied.
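This fallback check is simple enough to sketch directly (names are illustrative):

```python
def must_encode_i_frame(distance_to_last_i, gop_max_length):
    """Force an I frame once the distance to the nearest preceding I frame
    reaches the preset maximum GOP length, regardless of the still-scene
    test of step 304."""
    return distance_to_last_i >= gop_max_length

print(must_encode_i_frame(250, 250))  # -> True
print(must_encode_i_frame(100, 250))  # -> False
```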
  • in the method provided by the embodiment of the present application, N image frames are determined according to the sliding window, and the static variable is updated according to the motion amplitude difference of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, it is determined that the video has been in a still scene for a long time, and an I frame is encoded.
  • the present application thus provides a way to determine whether a video has been in a still scene for a long time, and encodes an I frame when it has, avoiding the insertion of too many P frames and improving coding quality; moreover, after the I frame is encoded, subsequent P frames occupy fewer bits and the PSNR increases, thereby improving coding efficiency.
  • The present application uses the ratio of the inter prediction cost to the intra prediction cost to represent the current motion amplitude, uses the PSNR of already encoded image frames to represent distortion, and pre-analyzes the image frames in the sliding window; combined with the change in motion amplitude, a piecewise function is used to implement an adaptive I-frame insertion algorithm, which improves coding efficiency.
  • In addition, in a real-time video call scenario, the preset maximum length of the GOP in the configuration file can be set to a larger value, so that I frames are not frequently forced by the preset maximum length; instead, whether to encode an I frame is mostly decided by the condition of the image frames themselves, which greatly improves coding efficiency.
  • Moreover, the algorithm adopted in this embodiment has been assembly-optimized for processor types such as armv7 and arm64, which can improve processing speed.
  • FIG. 5 is a flowchart of a video encoding method according to an embodiment of the present application.
  • the embodiment of the present application takes a process of encoding three image frames as an example, and the execution subject is a video encoding device.
  • the method includes:
  • Set a sliding window whose length equals N, encode the first image frame of the video as an I frame, and encode the second through (N-1)th image frames of the video as P frames.
  • These N-1 encoded image frames occupy the first N-1 positions of the sliding window. When the Nth image frame is acquired, it is the current image frame to be encoded and sits at the tail of the sliding window.
  • N is assumed to be 3.
  • If the first preset condition is determined to be satisfied according to the motion amplitude differences of the N image frames in the sliding window, the Nth image frame is a still image frame and the static variable is 1; since 1 is less than N,
  • the Nth image frame is encoded as a P frame.
  • the sliding window includes a second image frame to an N+1th image frame.
  • the N+1th image frame is a still image frame, and the static variable is updated to 2, because the static variable 2 is smaller than N, thus encoding the N+1th image frame as a P frame.
  • the sliding window includes a third image frame to an N+2th image frame.
  • the N+2th image frame is a still image frame, and the static variable is updated to 3, because the static variable 3 is equal to N, thus encoding the N+2th image frame as an I frame.
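The three-frame walkthrough above can be condensed into a small simulation (a sketch with hypothetical helper names; whether a frame satisfies the first preset condition is abstracted into a per-frame boolean, and the static variable is assumed to reset after an I frame is encoded):

```python
def encode_sequence(is_still_flags, n_window):
    """Simulate the I/P decision of steps 501-506.

    is_still_flags[i] is True when frame i satisfies the first preset
    condition (a still image frame). The first frame is always an I frame,
    frames before the window fills are P frames, and from the Nth frame on
    the static-variable counter decides: once it reaches N (the first
    preset threshold), an I frame is encoded and the counter restarts."""
    frame_types = []
    static_var = 0
    for i, still in enumerate(is_still_flags):
        if i == 0:
            frame_types.append('I')      # first frame of the video
            continue
        if i < n_window - 1:
            frame_types.append('P')      # frames before the window fills
            continue
        static_var = static_var + 1 if still else 0
        if static_var >= n_window:       # first preset threshold = N
            frame_types.append('I')
            static_var = 0               # counting restarts after the I frame
        else:
            frame_types.append('P')
    return frame_types

# With N = 3 and every frame still, the (N+2)th frame becomes an I frame:
print(encode_sequence([True] * 5, 3))    # ['I', 'P', 'P', 'P', 'I']
```

A frame that breaks the streak simply pushes the next I frame further out, since the counter restarts from zero.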
  • FIG. 6 is a flowchart of a video encoding method according to an embodiment of the present disclosure.
  • the embodiment of the present application describes a process of encoding an I frame during scene switching, and the execution subject is a video encoding device.
  • the method includes:
  • V_Scene represents a set threshold, which may be predetermined by the video encoding device; D represents the distance; GOP_min represents the preset minimum length of the group of pictures (GOP); GOP_max represents the preset maximum length of the GOP; and F_bias represents the fourth preset threshold.
  • When the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold, not less than the fifth preset threshold, and the distance is not less than the sixth preset threshold, the image frame to be encoded is encoded as an I frame.
  • Otherwise, the image frame to be encoded is encoded as a P frame.
  • The related art encodes an I frame when a scene switch occurs in the video. However, if scene switches happen frequently and too many I frames are encoded, quality drops when P frames are encoded after those I frames.
  • Therefore, to avoid encoding too many I frames, the motion amplitude of an image frame to be encoded as an I frame should be greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance between the encoded I frame and the previous
  • I frame cannot be less than the sixth preset threshold.
  • The fifth preset threshold is used to determine the minimum motion amplitude of an image frame to be encoded as an I frame;
  • the sixth preset threshold is used to determine the minimum distance between two consecutive I frames. The fifth and sixth preset thresholds may be determined by weighing coding quality against coding efficiency.
  • the fifth preset threshold may be 0.8
  • the sixth preset threshold may be one-half of the frame rate.
  • In practice, the video encoding device may first determine whether the distance is smaller than the sixth preset threshold; if so, it directly encodes the image frame to be encoded as a P frame; if not, it further determines whether the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold and whether it is smaller than the fifth preset threshold, and encodes the image frame as an I frame or a P frame according to the result.
  • During encoding, the video encoding device can apply the determinations of the foregoing steps 301-304 together with those of steps 601-603 to decide whether the image frame to be encoded should be encoded as an I frame or a P frame.
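Taken together, the overall per-frame decision can be organized as below (a sketch; the three boolean inputs stand for the outcomes of steps 301-304, steps 601-603, and the preset-maximum-GOP check, and the names are illustrative, not from the source):

```python
def decide_frame_type(still_scene_triggered, scene_switch_triggered, gop_max_reached):
    """Encode an I frame when any trigger fires: the long-still-scene
    condition (steps 301-304), the scene-switch condition (steps 601-603),
    or the preset maximum GOP length; otherwise encode a P frame."""
    if still_scene_triggered or scene_switch_triggered or gop_max_reached:
        return 'I'
    return 'P'
```

Each trigger is evaluated independently per frame, so either embodiment can force an I frame on its own.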
  • The method provided by this embodiment encodes the image frame to be encoded as an I frame only when its motion amplitude is greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance between the image frame to be encoded and the previous I frame
  • is not less than the sixth preset threshold, which avoids encoding too many I frames and improves coding efficiency.
  • FIG. 7 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
  • the device includes:
  • the first obtaining module 702 is configured to perform step 302 above;
  • the second obtaining module 703 is configured to perform step 303 above;
  • the encoding module 704 is configured to perform the above steps 305 or 306.
  • The second acquisition module 703 is configured to update the static variable.
  • The encoding module 704 is configured to encode an I frame when the updated static variable is not less than the specified number and the second preset condition is met.
  • The second preset condition includes at least one of the following conditions:
  • 0 < α_n ≤ σ_1 and μ_{I,n} - μ_{n-1} > T_{1,n};
  • σ_1 < α_n ≤ σ_2 and μ_{I,n} - μ_{n-1} > T_{2,n};
  • σ_2 < α_n ≤ σ_3 and μ_{I,n} - μ_{n-1} > T_{3,n};
  • σ_3 < α_n ≤ σ_4 and μ_{I,n} - μ_{n-1} > T_{4,n};
  • and the thresholds are updated after each frame as:
  • T_{1,n+1} = (1 - ω_T) × T_{1,n} + ω_T × (μ_I - μ_{n-1}), for 0 < α_n ≤ σ_1;
  • T_{2,n+1} = (1 - ω_T) × T_{2,n} + ω_T × (μ_I - μ_{n-1}), for σ_1 < α_n ≤ σ_2;
  • T_{3,n+1} = (1 - ω_T) × T_{3,n} + ω_T × (μ_I - μ_{n-1}), for σ_2 < α_n ≤ σ_3;
  • T_{4,n+1} = (1 - ω_T) × T_{4,n} + ω_T × (μ_I - μ_{n-1}), for σ_3 < α_n ≤ σ_4;
  • ω_T is a positive threshold.
  • the encoding module 704 is further configured to encode the image frame to be encoded into a P frame when the updated static variable is less than the first preset threshold.
  • the device further includes:
  • a switching module, configured to acquire the next image frame to be encoded and move the sliding window backward by the distance of one image frame;
  • the second obtaining module 703 is further configured to continue updating the static variable according to the motion amplitude difference of each image frame in the sliding window, until the currently updated static variable is not less than the first preset threshold, at which point the encoding module 704 encodes
  • the current image frame to be encoded as an I frame.
  • the device further includes:
  • a third obtaining module configured to perform step 601 above;
  • a calculation module configured to perform step 602 above
  • the encoding module 704 is configured to perform the above steps 604 or 605.
  • When the video encoding apparatus provided by the foregoing embodiment performs encoding, the division into the functional modules described above is only an example; in actual applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the video encoding device may be divided into different functional modules to complete all or part of the functions described above. In addition, the video encoding apparatus and the video encoding method embodiments above belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not repeated here.
  • FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • the terminal can be used to implement the functions performed by the video encoding device in the video encoding method shown in the above embodiments. Specifically:
  • The terminal 800 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a transmission module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. Those skilled in the art will understand that the terminal structure shown in FIG. 8 does not constitute a limitation on the terminal, which may include more or fewer components than illustrated, combine certain components, or use a different component arrangement, wherein:
  • The RF circuit 110 can be used to receive and transmit signals while sending and receiving information or during a call. In particular, after receiving downlink information from a base station, it hands the information to one or more processors 180 for processing; in addition, it sends uplink data to the base station.
  • The RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like. In addition, the RF circuit 110 can also communicate with networks and other terminals via wireless communication.
  • The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
  • The memory 120 can be used to store software programs and modules, such as those corresponding to the terminal shown in the above exemplary embodiments; the processor 180 executes various functional applications and data processing, such as video-based interaction, by running the software programs and modules stored in the memory 120.
  • The memory 120 may mainly include a program storage area and a data storage area, where the program storage area can store the operating system and the applications required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store data created according to the use of the terminal 800 (such as audio data or a phone book).
  • The memory 120 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.
  • the input unit 130 can be configured to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function controls.
  • input unit 130 can include touch-sensitive surface 131 as well as other input terminals 132.
  • The touch-sensitive surface 131, also referred to as a touch display screen or touchpad, can collect touch operations by the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 131 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection apparatus according to a preset program.
  • Optionally, the touch-sensitive surface 131 can include two parts: a touch detection apparatus and a touch controller.
  • The touch detection apparatus detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, and sends them
  • to the processor 180; it can also receive commands from the processor 180 and execute them.
  • the touch-sensitive surface 131 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 130 may also include other input terminals 132.
  • other input terminals 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • Display unit 140 can be used to display information entered by the user or information provided to the user and various graphical user interfaces of terminal 800, which can be constructed from graphics, text, icons, video, and any combination thereof.
  • the display unit 140 may include a display panel 141.
  • the display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • The touch-sensitive surface 131 may cover the display panel 141; when the touch-sensitive surface 131 detects a touch operation on or near it, the operation is transmitted to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event.
  • Although in FIG. 8 the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components to implement input and output functions, in some embodiments the touch-sensitive surface 131 may be integrated with the display panel 141 to implement both input and output functions.
  • Terminal 800 can also include at least one type of sensor 150, such as a light sensor, motion sensor, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the terminal 800 is moved to the ear.
  • As one type of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, the magnitude and direction of gravity;
  • it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and vibration-recognition functions (such as a pedometer or tapping). The terminal 800 may also be configured with a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and other sensors, which are not described here.
  • the audio circuit 160, the speaker 161, and the microphone 162 can provide an audio interface between the user and the terminal 800.
  • On the one hand, the audio circuit 160 can convert received audio data into an electrical signal and transmit it to the speaker 161, which converts it into a sound signal for output; on the other hand, the microphone 162 converts a collected sound signal into an electrical signal, which the audio circuit 160 receives and converts into audio data. After being processed by the audio data output processor 180, the audio data is transmitted, for example, to another terminal via the RF circuit 110, or output to the memory 120 for further processing.
  • the audio circuit 160 may also include an earbud jack to provide communication of the peripheral earphones with the terminal 800.
  • the terminal 800 can help the user to send and receive emails, browse web pages, access streaming media, etc. through the transmission module 170, which provides users with wireless or wired broadband Internet access.
  • Although FIG. 8 shows the transmission module 170, it can be understood that it is not an essential part of the terminal 800 and may be omitted as needed without changing the essence of the invention.
  • The processor 180 is the control center of the terminal 800. It connects the various parts of the entire handset using various interfaces and lines, and performs the various functions of the terminal 800 and processes data by running or executing the software programs and/or modules stored in the memory 120 and invoking the data stored in the memory 120, thereby monitoring the mobile phone as a whole.
  • the processor 180 may include one or more processing cores; preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 180.
  • the terminal 800 also includes a power source 190 (such as a battery) for powering various components.
  • the power source can be logically coupled to the processor 180 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • Power supply 190 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
  • the terminal 800 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • The display unit of the terminal 800 is a touch screen display, and the terminal 800 further includes a memory and at least one instruction, where the at least one instruction is stored in the memory and configured to be loaded and executed by one or more processors to implement the video encoding method in the above embodiments.
  • FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
  • The server 900 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPUs) 922 (for example, one or more processors), memory 932, and one or more storage media 930 (for example, one or more mass storage devices) storing an application 942 or data 944.
  • the memory 932 and the storage medium 930 may be short-term storage or persistent storage.
  • the program stored on storage medium 930 may include one or more modules (not shown), each of which may include a series of instructions in the server.
  • the central processing unit 922 can be configured to communicate with the storage medium 930, and load and execute a series of instructions in the storage medium 930 on the server 900 to implement the video encoding method in the above embodiments.
  • Server 900 may also include one or more power sources 926, one or more wired or wireless network interfaces 950, one or more input and output interfaces 959, one or more keyboards 956, and/or one or more operating systems 941.
  • operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
  • the server 900 can be configured to perform the steps performed by the video encoding device in the video encoding method provided by the foregoing embodiments.
  • The embodiment of the present application further provides a computer-readable storage medium storing at least one instruction that is loaded and executed by a processor to implement the operations performed in the video encoding method of the above embodiments.
  • a person skilled in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or may be instructed by a program to execute related hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application discloses a video encoding method, apparatus, device, and storage medium, belonging to the field of multimedia technologies. The method includes: determining N image frames in the image frame sequence of a video according to a sliding window; acquiring the motion amplitude difference of each image frame in the sliding window; updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive still image frames determined so far; and encoding the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold. The present application provides a way to determine whether a video has been in a still scene for a long time, and encodes an I frame when this is the case, which avoids inserting too many P frames and improves both coding quality and coding efficiency.

Description

Video encoding method, apparatus, device, and storage medium
This application claims priority to Chinese Patent Application No. 2017104613677, entitled "Video encoding method, apparatus and device", filed with the State Intellectual Property Office of China on June 15, 2017, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of multimedia technologies, and in particular to a video encoding method, apparatus, device, and storage medium.
Background
With the popularization of the Internet and the rapid development of multimedia technologies, video is used more and more widely, and video needs to be transmitted in many situations, for example when multiple users make a video call or when a user watches a video online. The image frame sequence of a video includes multiple image frames. In the process of transmitting a video, directly transmitting these image frames would produce a very large amount of data; therefore, to reduce the amount of data, the video needs to be encoded before transmission.
The H.264 standard is a commonly used video coding standard. The H.264 standard includes I frames and P frames: an I frame is obtained by completely encoding the current image frame, while a P frame is obtained by encoding the difference data between the current image frame and the previous image frame. An I frame completely preserves the data of the image frame and has high coding quality, whereas a P frame only needs to be encoded from the difference data, giving high coding efficiency but lower coding quality. The encoded I frames and P frames can form a GOP (Group of Pictures); a GOP starts with an I frame and ends with the P frame before the next I frame.
Considering that when a scene switch occurs in a video, the difference between the current image frame and the previous image frame is relatively large, encoding the current image frame as a P frame would degrade coding quality too much, so the current image frame can be encoded as an I frame. To this end, during encoding, the residual between the current image frame and the previous image frame can be obtained; this residual can represent the size of the difference between the two frames. It is then determined whether the residual is greater than a preset threshold; when it is, a scene switch is deemed to have occurred in the video, and the current image frame is encoded as an I frame.
In the process of implementing this application, the inventors found that the related art has at least the following defect: in some cases, the above method encodes too many consecutive P frames, degrading coding quality.
Summary
The embodiments of this application provide a video encoding method, apparatus, device, and storage medium, which can solve the problem of degraded coding quality. The technical solutions are as follows:
In a first aspect, a video encoding method is provided, applied to a video encoding device, the method including:
determining N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window including N-1 encoded image frames and one image frame to be encoded located at the tail of the window;
acquiring a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter prediction cost to the intra prediction cost of the image frame;
updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive still image frames determined so far; and
encoding the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold.
In a second aspect, a video encoding apparatus is provided, the apparatus including:
a determining module, configured to determine N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window including N-1 encoded image frames and one image frame to be encoded located at the tail of the window;
a first obtaining module, configured to acquire a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter prediction cost to the intra prediction cost of the image frame;
a second obtaining module, configured to update a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive still image frames determined so far; and
an encoding module, configured to encode the image frame to be encoded as an I frame when the updated static variable is not less than the first preset threshold.
In a third aspect, a video encoding device is provided, the video encoding device including a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the video encoding method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing at least one instruction that is loaded and executed by a processor to implement the operations performed in the video encoding method according to the first aspect.
The beneficial effects of the technical solutions provided in the embodiments of this application are:
In the method, apparatus, device, and storage medium provided by the embodiments of this application, N image frames are determined according to a sliding window, and a static variable is updated according to the motion amplitude difference of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, it is determined that the video within the sliding window has been in a still scene for a long time, and an I frame is encoded. This application provides a way to determine whether a video has been in a still scene for a long time and encodes an I frame when it has, avoiding the insertion of too many P frames, which improves both coding quality and coding efficiency.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of this application;
FIG. 2 is a schematic diagram of another implementation environment provided by an embodiment of this application;
FIG. 3 is a flowchart of a video encoding method provided by an embodiment of this application;
FIG. 4A is a schematic diagram of a sliding window and image frames provided by an embodiment of this application;
FIG. 4B is a schematic diagram of moving the sliding window provided by an embodiment of this application;
FIG. 5 is a flowchart of a video encoding method provided by an embodiment of this application;
FIG. 6 is a flowchart of a video encoding method provided by an embodiment of this application;
FIG. 7 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of this application;
FIG. 8 is a schematic structural diagram of a terminal provided by an embodiment of this application;
FIG. 9 is a schematic structural diagram of a server provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The related art provides no scheme for inserting an I frame in a still scene, yet when a video stays in a still scene for a long time, too many P frames are encoded, degrading the coding quality of the video too much; inserting an I frame at the right moment in this situation greatly helps overall performance. Therefore, an embodiment of this application provides a method for encoding an I frame when the video has been in a still scene for a long time, which can improve coding quality. In addition, a scheme for encoding an I frame when a scene switch occurs in the video is also provided, which can improve coding quality while avoiding encoding too many I frames. For the above methods of encoding an I frame, see the following embodiments.
An embodiment of this application provides an implementation environment that includes a first device 101 and a second device 102 connected through a network. When the first device 101 transmits a video to the second device 102, it first encodes the video and sends the encoded video to the second device 102, which decodes it to obtain the original video.
The embodiments of this application can be applied to online video playback. Referring to FIG. 1, the first device 101 is a video server for providing videos and the second device 102 is a terminal for playing videos. After obtaining an encoded video, the video server can transcode it, encoding I frames during transcoding using the encoding method provided by the embodiments of this application, and send the resulting encoded video to the terminal, which decodes and plays it, so that the user can watch the video on the terminal.
The embodiments of this application can also be applied to video calls. Referring to FIG. 2, the first device 101 and the second device 102 are two terminals in a video call, connected through a server 103. During the video call, each time the first device 101 acquires an image frame, it encodes the frame and sends it to the server 103, which forwards it to the second device 102; the second device 102 decodes and plays the frame. The first device 101 can acquire multiple image frames, and the second device 102 can play them consecutively to achieve video playback, so that the user can watch the video on the second device 102.
Of course, the embodiments of this application can also be applied to other video transmission scenarios, which are not described here.
FIG. 3 is a flowchart of a video encoding method provided by an embodiment of this application. This embodiment describes the process of encoding an I frame for one image frame in a still scene. The execution entity is a video encoding device, which may be a terminal, a server, or another device with a video transmission function. Referring to FIG. 3, the method includes:
301. The video encoding device determines N image frames in the image frame sequence of a video according to a sliding window.
The length of the sliding window equals N, where N is a positive integer greater than 1. N can be determined from the frame rate of the video, for example two thirds of the frame rate.
In the process of encoding the image frame sequence of the video, the sliding window can be used each time to determine N image frames, including N-1 encoded image frames and one image frame to be encoded at the tail of the sliding window; the image frame to be encoded can then be encoded according to the image frames in the sliding window.
For example, referring to FIG. 4A, when encoding of the video begins, the first image frame of the image frame sequence is encoded as an I frame, the second through (N-1)th image frames are all encoded as P frames, the Nth image frame is acquired, and encoding of the Nth image frame in the sliding window begins.
It should be noted that this embodiment uses one image frame to be encoded as an example; it may be the Nth image frame of the video or any image frame after it. For example, the step size of each move of the sliding window can be set to 1, that is, the sliding window moves one frame at a time, so the encoding method provided by this embodiment can be applied to the Nth image frame and every image frame after it. As for the image frames before the Nth one (other than the first), since they are close to the I frame, encoding them as P frames does not degrade coding quality too much, so they can be encoded as P frames by default.
302. The video encoding device acquires the motion amplitude difference of each image frame in the sliding window.
The motion amplitude of an image frame is the ratio of the inter prediction cost to the intra prediction cost of the frame and can represent how much the frame changes relative to the previous frame: the larger the change between the frame and the previous frame, the larger the motion amplitude, and the smaller the change, the smaller the motion amplitude.
The motion amplitude difference of an image frame is the difference between the motion amplitude of the frame and that of the previous frame, and represents the fluctuation between the change amplitude of the frame and that of the previous frame: a larger motion amplitude difference indicates a more drastic fluctuation, and a smaller one a gentler fluctuation. This difference reflects how different the pictures displayed by the two image frames are: a large motion amplitude difference means a large picture difference, and a small one means a small picture difference.
Denoting the intra prediction cost by I_cost and the inter prediction cost by P_cost, the motion amplitude of the nth image frame is α_n = P_cost / I_cost.
The motion amplitude difference of the nth image frame is η_n = α_n - α_{n-1}, where n is the index of the image frame in the video's image frame sequence. The intra prediction cost I_cost of an image frame can be obtained by downsampling the frame, dividing the sampled frame into multiple macroblocks of a specified size, and, for each macroblock, computing prediction blocks in multiple directions of the macroblock and computing the SATD of the residual between the macroblock and the prediction block, thereby obtaining the optimal intra prediction cost I_cost. The downsampling ratio can be determined as needed; for example, the sampled frame may be half the length and half the width of the original frame. The specified size can also be determined according to prediction needs, for example 8*8. SATD means applying a Hadamard transform to the residual and then summing the absolute values.
The inter prediction cost P_cost of an image frame can be obtained by downsampling the frame and dividing the sampled frame into multiple macroblocks of the specified size. For each macroblock, integer-pixel diamond prediction is used to obtain the best reference block of the macroblock, and the SATD of the residual between the macroblock and the reference block gives the optimal motion cost; the minimum of this cost and the macroblock's I_cost is taken as the macroblock's P_cost, and the sum of the P_cost of the macroblocks is taken as the frame's P_cost.
It should be noted that the motion amplitude difference of each image frame can be computed before encoding: when the current image frame is encoded, its motion amplitude difference is computed, and since the motion amplitude differences of the preceding frames were already computed when those frames were encoded, they need not be recomputed and can be fetched directly. Fetching them directly can greatly reduce the amount of computation during encoding.
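As an illustration of the SATD described above, a minimal pure-Python version for a 4x4 residual block might look as follows (a sketch; real encoders work on the downsampled macroblock sizes mentioned in the text and use optimized butterflies):

```python
def hadamard_1d(v):
    """Unnormalized 4-point Hadamard transform via a butterfly."""
    a, b, c, d = v
    s0, s1 = a + b, a - b
    s2, s3 = c + d, c - d
    return [s0 + s2, s1 + s3, s0 - s2, s1 - s3]

def satd_4x4(block_cur, block_pred):
    """SATD of a 4x4 block pair: Hadamard-transform the residual along
    rows and then columns, and sum the absolute transform coefficients."""
    resid = [[c - p for c, p in zip(row_c, row_p)]
             for row_c, row_p in zip(block_cur, block_pred)]
    rows = [hadamard_1d(r) for r in resid]                # transform rows
    cols = [hadamard_1d([rows[i][j] for i in range(4)])   # then columns
            for j in range(4)]
    return sum(abs(x) for col in cols for x in col)

flat = [[10] * 4 for _ in range(4)]
print(satd_4x4(flat, flat))    # identical blocks -> 0
```

In the cost computation described above, the minimum of a macroblock's inter SATD cost and its I_cost is taken as its P_cost, and the per-macroblock costs are summed to get the frame's cost.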
303. The video encoding device updates the static variable according to the motion amplitude difference of each image frame in the sliding window.
A still image frame is an image frame in a still scene, and the static variable represents the number of consecutive still image frames determined so far. During encoding, the static variable accumulates as more consecutive still image frames are determined; if any frame in between is not a still image frame, the static variable is reset to 0 and accumulation restarts.
In this embodiment, whether the image frame to be encoded is a still image frame can be judged according to the motion amplitude difference of each image frame in the sliding window, and the static variable is updated according to the judgment result and the previously determined static variable. In one implementation, the following process can be used to determine still image frames and update the static variable:
When judging whether the image frame to be encoded is a still image frame, it is judged whether a first preset condition is currently satisfied. The first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is smaller than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is smaller than a third preset threshold. Therefore, the video encoding device judges whether the absolute value of the motion amplitude difference of the image frame to be encoded is smaller than the second preset threshold and whether the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is smaller than the third preset threshold. If so, the first preset condition is satisfied, the image frame to be encoded is a still image frame, and the determined static variable is increased by 1 to obtain the updated static variable. If not, the first preset condition is not satisfied, the image frame to be encoded is not a still image frame, the updated static variable is set to 0, and counting of consecutive still image frames restarts.
For example, the video encoding device can determine the static variable using the following formula:
f_n = f_{n-1} + 1, if |η_n| < η_T and |ξ_n| < ξ_T; f_n = 0, otherwise;
where f_n is the static variable determined for the image frame to be encoded, f_{n-1} is the static variable determined for the previous image frame, η_n is the motion amplitude difference of the image frame to be encoded, η_n = α_n - α_{n-1}, α_n is the motion amplitude of the image frame to be encoded, α_{n-1} is the motion amplitude of the previous image frame, ξ_n is the sum of the motion amplitude differences of all image frames in the sliding window, n is a positive integer greater than 1, η_T is the second preset threshold, and ξ_T is the third preset threshold. The values of η_T and ξ_T can be determined by weighing coding efficiency against coding quality and fine-tuned for the actual image frame sequence, for example η_T = 0.2 and ξ_T = 1.
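The piecewise formula above translates directly into code (a sketch; the function name and argument layout are illustrative, and the defaults are the example thresholds from the text):

```python
def update_static_variable(f_prev, window_eta, eta_T=0.2, xi_T=1.0):
    """f_n = f_{n-1} + 1 when |eta_n| < eta_T and |xi_n| < xi_T, else 0.

    window_eta holds the motion amplitude difference of every frame in
    the sliding window; the last entry is eta_n of the frame to be
    encoded, and xi_n is the sum over the whole window."""
    eta_n = window_eta[-1]
    xi_n = sum(window_eta)
    if abs(eta_n) < eta_T and abs(xi_n) < xi_T:
        return f_prev + 1     # one more consecutive still image frame
    return 0                  # streak broken: restart the count

print(update_static_variable(2, [0.05, -0.1, 0.02]))   # -> 3
print(update_static_variable(2, [0.05, -0.1, 0.5]))    # -> 0
```

Because the window sum ξ_n is also bounded, a slow drift that never trips the per-frame test still breaks the streak once its accumulated effect exceeds ξ_T.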
304. The video encoding device judges whether the updated static variable is smaller than the first preset threshold; if so, step 305 is performed, and if not, step 306 is performed.
In this embodiment, to avoid encoding too many P frames in a still scene, the condition for encoding an I frame is set as follows: the number of consecutive still image frames reaches the first preset threshold. Under this condition, the video has been in a still scene for a long time and enough P frames have already been encoded for a sufficient number of still image frames; continuing to encode P frames would degrade coding quality too much, so an I frame is to be encoded.
Then, when the video encoding device obtains the updated static variable, it judges whether the updated static variable is smaller than the first preset threshold. If it is, the number of consecutive still image frames has not exceeded the upper limit, and P frames can continue to be encoded for the sake of coding efficiency. If it is not, the number of consecutive still image frames is too large, and an I frame needs to be encoded for the sake of coding quality. The first preset threshold can be determined by weighing the requirements for coding quality and coding efficiency; for example, it can be equal to N.
305. The video encoding device encodes the image frame to be encoded as a P frame.
To encode a P frame, the difference data between the image frame to be encoded and the previous image frame is obtained and encoded; the resulting encoded frame is the P frame.
306. The video encoding device encodes the image frame to be encoded as an I frame.
To encode an I frame, the data in the image frame to be encoded is encoded directly; the resulting encoded frame is the I frame.
In another embodiment, besides the condition that the static variable is not less than the first preset threshold, the condition for encoding an I frame may further include a second preset condition. Accordingly, steps 304-306 above can be replaced by the following: judge whether the updated static variable is smaller than the first preset threshold and whether the second preset condition is satisfied; when the updated static variable is not less than the first preset threshold and the second preset condition is satisfied, encode the image frame to be encoded as an I frame; and when the updated static variable is smaller than the first preset threshold or the second preset condition is not satisfied, encode the image frame to be encoded as a P frame.
The second preset condition includes at least one of:
0 < α_n ≤ σ_1 and μ_{I,n} - μ_{n-1} > T_{1,n};
σ_1 < α_n ≤ σ_2 and μ_{I,n} - μ_{n-1} > T_{2,n};
σ_2 < α_n ≤ σ_3 and μ_{I,n} - μ_{n-1} > T_{3,n};
σ_3 < α_n ≤ σ_4 and μ_{I,n} - μ_{n-1} > T_{4,n};
where n is the index of the image frame to be encoded; α_n is its motion amplitude; μ_{I,n} is the PSNR (Peak Signal to Noise Ratio) of the luma component of the I frame nearest to and preceding the image frame to be encoded; μ_{n-1} is the PSNR of the luma component of the previous image frame, and the PSNR of the luma component can be used to evaluate the coding quality of the corresponding frame; σ_1, σ_2, σ_3, σ_4 and T_{1,n}, T_{2,n}, T_{3,n}, T_{4,n} are positive thresholds. σ_1 through σ_4 are thresholds representing changes in image complexity and can be set as needed, with σ_1 < σ_2, σ_2 < σ_3, σ_3 < σ_4, for example σ_1 = 0.2, σ_2 = 0.3, σ_3 = 0.4, σ_4 = 0.5, and T_{1,n} > T_{2,n}, T_{2,n} > T_{3,n}, T_{3,n} > T_{4,n}.
With the above second preset condition, when the motion amplitude of an image frame is small, the picture is relatively stable and encoding P frames does not degrade coding quality too much, so there is no need to encode too many I frames; an I frame can be encoded only after the PSNR has dropped by a large margin, which ensures coding quality while improving coding efficiency as much as possible. When the motion amplitude of an image frame is large, the picture changes drastically; to avoid degrading coding quality too much, the condition for encoding an I frame can be relaxed so that an I frame is encoded when the PSNR drops by a somewhat smaller margin.
In this embodiment, for the four thresholds T_1, T_2, T_3, and T_4, the thresholds used for different image frames may be equal or unequal. In one possible implementation, after the nth image frame has been encoded, T_{1,n}, T_{2,n}, T_{3,n}, and T_{4,n} can be updated to obtain T_{1,n+1}, T_{2,n+1}, T_{3,n+1}, and T_{4,n+1}, and the updated thresholds are used to judge the (n+1)th image frame. The update is as follows:
T_{1,n+1} = (1 - ω_T) × T_{1,n} + ω_T × (μ_I - μ_{n-1}), for 0 < α_n ≤ σ_1;
T_{2,n+1} = (1 - ω_T) × T_{2,n} + ω_T × (μ_I - μ_{n-1}), for σ_1 < α_n ≤ σ_2;
T_{3,n+1} = (1 - ω_T) × T_{3,n} + ω_T × (μ_I - μ_{n-1}), for σ_2 < α_n ≤ σ_3;
T_{4,n+1} = (1 - ω_T) × T_{4,n} + ω_T × (μ_I - μ_{n-1}), for σ_3 < α_n ≤ σ_4;
where the initial values of T_1, T_2, T_3, and T_4 can be determined according to the required PSNR drop when the first I frame is encoded after the previous I frame, for example T_{1,1} = 5, T_{2,1} = 4.5, T_{3,1} = 3, T_{4,1} = 2.5. ω_T is a positive threshold representing the weight of the current PSNR difference in the update, and can be set, for example, to ω_T = 0.2.
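The band lookup, the second preset condition, and the threshold update can be combined as in the following sketch (the constants are the example values given in the text; function and variable names are illustrative):

```python
SIGMA = [0.2, 0.3, 0.4, 0.5]          # sigma_1..sigma_4 (example values)

def band_of(alpha_n, sigma=SIGMA):
    """Index of the motion-amplitude band 0 < a <= s1 < ... <= s4, or None."""
    for i, s in enumerate(sigma):
        if alpha_n <= s:
            return i
    return None

def second_condition_and_update(alpha_n, mu_I, mu_prev, T, omega_T=0.2):
    """Return whether the second preset condition holds for frame n, plus
    the updated thresholds T_{.,n+1}. Only the threshold of the band that
    frame n falls into is updated, per the piecewise formulas above."""
    i = band_of(alpha_n)
    if i is None:                      # alpha_n above all bands: condition not met
        return False, T
    satisfied = (mu_I - mu_prev) > T[i]
    T_next = list(T)
    T_next[i] = (1 - omega_T) * T[i] + omega_T * (mu_I - mu_prev)
    return satisfied, T_next

T = [5.0, 4.5, 3.0, 2.5]               # T_{1,1}..T_{4,1} example initial values
ok, T = second_condition_and_update(0.1, mu_I=40.0, mu_prev=34.0, T=T)
print(ok, round(T[0], 2))              # True 5.2
```

Since the adapted threshold is an exponentially weighted average of past PSNR drops, repeated small drops gradually lower the bar for the next I frame in that complexity band.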
In this embodiment, after the image frame to be encoded has been encoded, the next image frame to be encoded can be acquired and the sliding window moved backward by the distance of one image frame, so that the next image frame to be encoded is at the tail of the sliding window; steps 301-306 can then be repeated to encode the next image frame to be encoded, and so on.
Based on the example shown in FIG. 4A, after the Nth image frame is encoded (assume as a P frame), the sliding window is moved backward by the distance of one image frame so that the (N+1)th image frame is at the tail of the sliding window, as shown in FIG. 4B, and encoding of the (N+1)th image frame begins.
It should be noted that the video encoding device can also set a preset maximum length of the GOP in a configuration file, the preset maximum length specifying the maximum length between two consecutive I frames. Then, when encoding the image frame to be encoded, the video encoding device not only uses the condition in step 304 above to determine whether to encode an I frame, but also obtains the distance between the image frame to be encoded and the nearest preceding I frame and judges whether that distance reaches the preset maximum length; when it does, the image frame to be encoded is encoded as an I frame even if the condition in step 304 is not currently satisfied.
In the method provided by this embodiment, N image frames are determined according to the sliding window and the static variable is updated according to the motion amplitude difference of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, the video is determined to have been in a still scene for a long time, and an I frame is encoded. This application provides a way to determine whether a video has been in a still scene for a long time and encodes an I frame when it has, avoiding the insertion of too many P frames, which improves coding quality; moreover, when a P frame is encoded after the I frame, the number of bits occupied by the P frame is reduced and the PSNR increases, which improves coding efficiency.
This application uses the ratio of the inter prediction cost to the intra prediction cost to represent the current motion amplitude, uses the PSNR of already encoded image frames to represent distortion, uses the sliding window to pre-analyze the image frames in it, and, combined with the change in motion amplitude, uses a piecewise function to implement an adaptive I-frame insertion algorithm, improving coding efficiency.
In addition, in a real-time video call scenario, the preset maximum length of the GOP in the configuration file can be set to a larger value, ensuring that I frames are not frequently forced by the preset maximum length; instead, whether to encode an I frame is mostly decided by the condition of the image frames, which greatly improves coding efficiency. Moreover, the algorithm adopted in this embodiment has been assembly-optimized for processors such as armv7 and arm64, which can improve processing speed.
FIG. 5 is a flowchart of a video encoding method provided by an embodiment of this application. This embodiment takes the process of encoding three image frames as an example; the execution entity is a video encoding device. Referring to FIG. 5, the method includes:
501. Set a sliding window whose length equals N, encode the first image frame of the video as an I frame, and encode the second through (N-1)th image frames as P frames; the resulting N-1 encoded image frames occupy the first N-1 positions of the sliding window. When the Nth image frame is acquired, it is the current image frame to be encoded and sits at the tail of the sliding window.
In this embodiment, N is assumed to be 3.
502. If the first preset condition is determined to be satisfied according to the motion amplitude differences of the N image frames in the sliding window, the Nth image frame is a still image frame and the static variable is 1; since 1 is less than N, the Nth image frame is encoded as a P frame.
503. Acquire the (N+1)th image frame and move the sliding window backward by the distance of one image frame; the sliding window now contains the second through (N+1)th image frames.
504. If the first preset condition is determined to be satisfied according to the motion amplitude differences of the N image frames in the sliding window, the (N+1)th image frame is a still image frame and the static variable is updated to 2; since 2 is less than N, the (N+1)th image frame is encoded as a P frame.
505. Acquire the (N+2)th image frame and move the sliding window backward by the distance of one image frame; the sliding window now contains the third through (N+2)th image frames.
506. If the first preset condition is determined to be satisfied according to the motion amplitude differences of the N image frames in the sliding window, the (N+2)th image frame is a still image frame and the static variable is updated to 3; since 3 equals N, the (N+2)th image frame is encoded as an I frame.
FIG. 6 is a flowchart of a video encoding method provided by an embodiment of this application. This embodiment describes the process of encoding an I frame upon a scene switch; the execution entity is a video encoding device. Referring to FIG. 6, the method includes:
601. Acquire the distance between the image frame to be encoded and the I frame nearest to and preceding it.
602. Compute the fourth preset threshold from this distance.
The fourth preset threshold is computed using the following formulas (shown as images PCTCN2018087539-appb-000003 to PCTCN2018087539-appb-000005 in the original):
where V_Scene represents a set threshold, which may be predetermined by the video encoding device; D represents the distance; GOP_min represents the preset minimum length of the group of pictures (GOP); GOP_max represents the preset maximum length of the GOP; and F_bias represents the fourth preset threshold.
603. Judge whether the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold, whether it is smaller than the fifth preset threshold, and whether the distance is smaller than the sixth preset threshold.
604. When the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance is not less than the sixth preset threshold, encode the image frame to be encoded as an I frame.
605. When the motion amplitude of the image frame to be encoded is not greater than the difference between 1 and the fourth preset threshold, or the motion amplitude is smaller than the fifth preset threshold, or the distance is smaller than the sixth preset threshold, encode the image frame to be encoded as a P frame.
The related art encodes an I frame when a scene switch occurs in the video. However, if scene switches occur frequently and too many I frames are encoded, the quality drops when P frames are encoded after those I frames. Therefore, to avoid encoding too many I frames, the motion amplitude of an image frame to be encoded as an I frame should be greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance between the encoded I frame and the previous I frame cannot be less than the sixth preset threshold. The fifth preset threshold is used to determine the minimum motion amplitude of an image frame to be encoded as an I frame, and the sixth preset threshold is used to determine the minimum distance between two consecutive I frames; both can be determined by weighing coding quality against coding efficiency, for example the fifth preset threshold may be 0.8 and the sixth preset threshold may be half the frame rate.
In practice, when making the judgment, the video encoding device may first judge whether the distance is smaller than the sixth preset threshold; if so, it directly encodes the image frame to be encoded as a P frame; if not, it further judges whether the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold and whether it is smaller than the fifth preset threshold, and decides according to the result whether to encode the image frame as an I frame or a P frame.
It should be noted that the embodiment shown in FIG. 3 and the embodiment shown in FIG. 6 can be combined: during encoding, the video encoding device can apply the judgments of steps 301-304 together with those of steps 601-603 to decide whether the image frame to be encoded should be encoded as an I frame or a P frame.
In the method provided by this embodiment, the image frame to be encoded is encoded as an I frame only when its motion amplitude is greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance between the image frame to be encoded and the previous I frame is not less than the sixth preset threshold, which avoids encoding too many I frames and improves coding efficiency.
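The scene-switch decision of steps 603-605 can be sketched as follows; since the F_bias formulas appear only as images in the original, F_bias is taken here as an input, and the example threshold values (0.8, and half of an assumed 30 fps frame rate) follow the text:

```python
def scene_switch_frame_type(alpha_n, dist_to_prev_I, f_bias,
                            alpha_min=0.8, min_I_gap=15):
    """I/P decision of steps 603-605. f_bias (the fourth preset
    threshold) is computed from the distance and GOP_min/GOP_max by
    formulas not reproduced here, so it is an input. alpha_min is the
    fifth preset threshold and min_I_gap the sixth (minimum distance
    between two consecutive I frames)."""
    if dist_to_prev_I < min_I_gap:       # too close to the previous I frame
        return 'P'
    if alpha_n > 1 - f_bias and alpha_n >= alpha_min:
        return 'I'                       # strong enough scene change
    return 'P'

print(scene_switch_frame_type(0.9, dist_to_prev_I=20, f_bias=0.3))   # I
print(scene_switch_frame_type(0.9, dist_to_prev_I=5, f_bias=0.3))    # P
```

Checking the distance first mirrors the practical ordering described above: a frame too close to the previous I frame is rejected before any motion-amplitude comparison.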
图7是本申请实施例提供的一种视频编码装置的结构示意图。参见图7,该装置包括:
确定模块701,用于执行上述步骤301;
第一获取模块702,用于执行上述步骤302;
第二获取模块703,用于执行上述步骤303;
编码模块704,用于执行上述步骤305或306。
在一种可能实现方式中,第二获取模块703用于更新静止变量。
在另一种可能实现方式中,编码模块704用于当更新后的静止变量不小于所述指定数目且满足第二预设条件时编入I帧。
第二预设条件包括以下条件中的至少一个:
0 < α_n ≤ σ_1 and μ_I.n − μ_{n−1} > T_1.n;
σ_1 < α_n ≤ σ_2 and μ_I.n − μ_{n−1} > T_2.n;
σ_2 < α_n ≤ σ_3 and μ_I.n − μ_{n−1} > T_3.n;
σ_3 < α_n ≤ σ_4 and μ_I.n − μ_{n−1} > T_4.n.
In another possible implementation,
T_1.n+1 = (1 − ω_T) × T_1.n + ω_T × (μ_I.n − μ_{n−1}), if 0 < α_n ≤ σ_1;
T_2.n+1 = (1 − ω_T) × T_2.n + ω_T × (μ_I.n − μ_{n−1}), if σ_1 < α_n ≤ σ_2;
T_3.n+1 = (1 − ω_T) × T_3.n + ω_T × (μ_I.n − μ_{n−1}), if σ_2 < α_n ≤ σ_3;
T_4.n+1 = (1 − ω_T) × T_4.n + ω_T × (μ_I.n − μ_{n−1}), if σ_3 < α_n ≤ σ_4;

where ω_T is a positive threshold.
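Read this way, each T_k is an exponential moving average of the observed PSNR gap μ_I.n − μ_{n−1}, updated only for the band k into which the current motion amplitude α_n falls. A sketch under that assumption (the band edges `sigma` and the weight `w_t` below are hypothetical values, not from the source):

```python
def update_thresholds(T, alpha_n, mu_i, mu_prev,
                      sigma=(0.25, 0.5, 0.75, 1.0), w_t=0.1):
    """T is a list [T1, T2, T3, T4]; returns the list with the one threshold
    whose motion-amplitude band contains alpha_n moved toward (mu_i - mu_prev)."""
    T = list(T)
    for k, upper in enumerate(sigma):
        lower = 0.0 if k == 0 else sigma[k - 1]
        if lower < alpha_n <= upper:
            # exponential moving average with weight w_t
            T[k] = (1 - w_t) * T[k] + w_t * (mu_i - mu_prev)
            break
    return T
```

Only one of the four thresholds is touched per frame, so the thresholds for the other motion-amplitude bands keep their history.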
In another possible implementation, the encoding module 704 is further configured to encode the image frame to be encoded as a P frame when the updated static variable is less than the first preset threshold.

In another possible implementation, the apparatus further includes:

a switching module, configured to obtain the next image frame to be encoded and move the sliding window backward by the distance of one image frame; and

the second obtaining module 703 is further configured to continue updating the static variable according to the motion amplitude difference of each image frame in the sliding window until the currently updated static variable is not less than the first preset threshold, at which point the encoding module 704 encodes the current image frame to be encoded as an I frame.

In another possible implementation, the apparatus further includes:

a third obtaining module, configured to perform step 601 above;

a calculation module, configured to perform step 602 above; and

the encoding module 704, configured to perform step 604 or 605 above.

All the foregoing optional technical solutions may be combined in any manner to form optional embodiments of the present disclosure, and details are not repeated here.

It should be noted that when the video encoding apparatus provided in the foregoing embodiments performs encoding, the division into the foregoing functional modules is merely an example. In practice, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the video encoding device may be divided into different functional modules to complete all or some of the functions described above. In addition, the video encoding apparatus provided in the foregoing embodiments belongs to the same concept as the video encoding method embodiments; for its specific implementation, refer to the method embodiments, and details are not repeated here.
FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of this application. The terminal may be used to implement the functions performed by the video encoding device in the video encoding methods shown in the foregoing embodiments. Specifically:

The terminal 800 may include a radio frequency (RF) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a transmission module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. A person skilled in the art will understand that the terminal structure shown in FIG. 8 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown, combine certain components, or use a different component arrangement. Specifically:

The RF circuit 110 may be used to receive and send signals during information transmission and reception or during a call. In particular, after receiving downlink information from a base station, the RF circuit delivers it to one or more processors 180 for processing, and it sends uplink data to the base station. The RF circuit 110 typically includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 110 may also communicate with networks and other terminals through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to the Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, and the Short Messaging Service (SMS).

The memory 120 may be used to store software programs and modules, such as the software programs and modules corresponding to the terminal shown in the foregoing exemplary embodiments. By running the software programs and modules stored in the memory 120, the processor 180 performs various functional applications and data processing, for example, implementing video-based interaction. The memory 120 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, applications required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data (such as audio data and a phone book) created according to the use of the terminal 800. In addition, the memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 120 may further include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.
The input unit 130 may be used to receive input digit or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Specifically, the input unit 130 may include a touch-sensitive surface 131 and another input terminal 132. The touch-sensitive surface 131, also called a touch display screen or touch panel, can collect touch operations performed by the user on or near it (such as operations performed on or near the touch-sensitive surface 131 by the user with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding linked apparatus according to a preset program. Optionally, the touch-sensitive surface 131 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into touch-point coordinates, sends the coordinates to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch-sensitive surface 131 may be implemented in multiple types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 131, the input unit 130 may further include the other input terminal 132, which may include but is not limited to one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick.

The display unit 140 may be used to display information entered by the user or provided to the user, as well as the various graphical user interfaces of the terminal 800, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 140 may include a display panel 141; optionally, the display panel 141 may be configured in a form such as a liquid crystal display (LCD) or an organic light-emitting diode (OLED) display. Further, the touch-sensitive surface 131 may cover the display panel 141. After detecting a touch operation on or near it, the touch-sensitive surface 131 transmits the operation to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although in FIG. 8 the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components for the input and output functions, in some embodiments the touch-sensitive surface 131 and the display panel 141 may be integrated to implement the input and output functions.

The terminal 800 may further include at least one sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the terminal 800 is moved close to the ear. As one type of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes), can detect the magnitude and direction of gravity when the terminal is static, and may be used in applications that recognize phone posture (such as switching between landscape and portrait orientation, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer and tapping). Other sensors that may also be configured in the terminal 800, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here.

The audio circuit 160, a speaker 161, and a microphone 162 may provide an audio interface between the user and the terminal 800. The audio circuit 160 may transmit an electrical signal converted from the received audio data to the speaker 161, which converts it into a sound signal for output; conversely, the microphone 162 converts collected sound signals into electrical signals, which the audio circuit 160 receives and converts into audio data. After the audio data is output to the processor 180 for processing, it is sent, for example, to another terminal via the RF circuit 110, or output to the memory 120 for further processing. The audio circuit 160 may also include an earphone jack to allow communication between a peripheral headset and the terminal 800.
Through the transmission module 170, the terminal 800 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless or wired broadband Internet access. Although FIG. 8 shows the transmission module 170, it is understood that the module is not an essential component of the terminal 800 and may be omitted as required without changing the essence of the invention.

The processor 180 is the control center of the terminal 800. It links all parts of the entire phone through various interfaces and lines, and performs the various functions of the terminal 800 and processes data by running or executing the software programs and/or modules stored in the memory 120 and invoking the data stored in the memory 120, thereby monitoring the phone as a whole. Optionally, the processor 180 may include one or more processing cores. Preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 180.

The terminal 800 further includes the power supply 190 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the processor 180 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 190 may further include any component such as one or more direct-current or alternating-current power supplies, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.

Although not shown, the terminal 800 may further include a camera, a Bluetooth module, and the like, which are not described here. Specifically, in this embodiment, the display unit of the terminal 800 is a touchscreen display. The terminal 800 further includes a memory and at least one instruction, where the at least one instruction is stored in the memory and is configured to be loaded and executed by one or more processors to implement the video encoding method in the foregoing embodiments.
FIG. 9 is a schematic structural diagram of a server according to an embodiment of this application. The server 900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 922 (for example, one or more processors), a memory 932, and one or more storage media 930 (for example, one or more mass storage devices) storing application programs 942 or data 944. The memory 932 and the storage medium 930 may provide transient or persistent storage. The program stored in the storage medium 930 may include one or more modules (not shown in the figure), and each module may include a series of instructions for the server. Further, the central processing unit 922 may be configured to communicate with the storage medium 930 and to load and execute, on the server 900, the series of instructions in the storage medium 930 to implement the video encoding method in the foregoing embodiments.

The server 900 may further include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 959, one or more keyboards 956, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.

The server 900 may be used to perform the steps performed by the video encoding device in the video encoding methods provided in the foregoing embodiments.

An embodiment of this application further provides a computer-readable storage medium storing at least one instruction, where the instruction is loaded and executed by a processor to implement the operations performed in the video encoding methods of the foregoing embodiments.

A person of ordinary skill in the art will understand that all or some of the steps of the foregoing embodiments may be implemented by hardware or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

The foregoing descriptions are merely preferred embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (18)

  1. A video encoding method, applied to a video encoding device, the method comprising:

    determining N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N−1 encoded image frames and one image frame to be encoded located at the tail of the window;

    obtaining a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of the image frame to its intra-frame prediction cost;

    updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive static image frames determined so far; and

    encoding the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold.
  2. The method according to claim 1, wherein the updating a static variable according to the motion amplitude difference of each image frame in the sliding window comprises:

    when the motion amplitude of the image frame to be encoded and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determining that the image frame to be encoded is a static image frame, and adding 1 to the determined static variable to obtain the updated static variable; and

    when the first preset condition is not satisfied, determining that the image frame to be encoded is not a static image frame, and setting the updated static variable to 0;

    wherein the first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
  3. The method according to claim 1, wherein the encoding the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold comprises:

    encoding the image frame to be encoded as an I frame when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied;

    the second preset condition comprising at least one of the following:

    0 < α_n ≤ σ_1 and μ_I.n − μ_{n−1} > T_1.n;

    σ_1 < α_n ≤ σ_2 and μ_I.n − μ_{n−1} > T_2.n;

    σ_2 < α_n ≤ σ_3 and μ_I.n − μ_{n−1} > T_3.n;

    σ_3 < α_n ≤ σ_4 and μ_I.n − μ_{n−1} > T_4.n;

    wherein n denotes the sequence number of the image frame to be encoded; α_n denotes the motion amplitude of the image frame to be encoded; μ_I.n denotes the peak signal-to-noise ratio (PSNR) of the luma component of the I frame that precedes the image frame to be encoded and is closest to it; μ_{n−1} denotes the PSNR of the luma component of the image frame preceding the image frame to be encoded; and σ_1, σ_2, σ_3, σ_4, T_1.n, T_2.n, T_3.n, and T_4.n are positive thresholds, with σ_1 < σ_2, σ_2 < σ_3, σ_3 < σ_4, T_1.n > T_2.n, T_2.n > T_3.n, and T_3.n > T_4.n.
  4. The method according to claim 1, wherein after the updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the method further comprises:

    encoding the image frame to be encoded as a P frame when the updated static variable is less than the first preset threshold.

  5. The method according to any one of claims 1 to 4, wherein the method further comprises:

    after the image frame to be encoded has been encoded, obtaining the next image frame to be encoded, and moving the sliding window backward by the distance of one image frame, so that the next image frame to be encoded is located at the tail of the sliding window; and

    continuing to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until the currently updated static variable is not less than the first preset threshold, and then encoding the current image frame to be encoded as an I frame.
  6. The method according to claim 1, wherein the method further comprises:

    obtaining the distance between the image frame to be encoded and the I frame that precedes the image frame to be encoded and is closest to it;

    calculating a fourth preset threshold according to the distance using the following formulas:

    [Three piecewise formulas computing F_bias from V_Scene, D, GOP_min, and GOP_max, rendered as images in the original: Figure PCTCN2018087539-appb-100001 to Figure PCTCN2018087539-appb-100003]

    wherein V_Scene denotes a set threshold; D denotes the distance; GOP_min denotes the preset minimum length of a group of pictures (GOP); GOP_max denotes the preset maximum length of a GOP; and F_bias denotes the fourth preset threshold; and

    encoding the image frame to be encoded as an I frame when the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold, the motion amplitude of the image frame to be encoded is not less than a fifth preset threshold, and the distance is not less than a sixth preset threshold.
  7. A video encoding apparatus, the apparatus comprising:

    a determining module, configured to determine N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N−1 encoded image frames and one image frame to be encoded located at the tail of the window;

    a first obtaining module, configured to obtain a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of the image frame to its intra-frame prediction cost;

    a second obtaining module, configured to update a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive static image frames determined so far; and

    an encoding module, configured to encode the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold.

  8. The apparatus according to claim 7, wherein the second obtaining module is configured to:

    when the motion amplitude of the image frame to be encoded and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determine that the image frame to be encoded is a static image frame, and add 1 to the determined static variable to obtain the updated static variable; and

    when the first preset condition is not satisfied, determine that the image frame to be encoded is not a static image frame, and set the updated static variable to 0;

    wherein the first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
  9. The apparatus according to claim 7, wherein the encoding module is configured to encode the image frame to be encoded as an I frame when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied;

    the second preset condition comprising at least one of the following:

    0 < α_n ≤ σ_1 and μ_I.n − μ_{n−1} > T_1.n;

    σ_1 < α_n ≤ σ_2 and μ_I.n − μ_{n−1} > T_2.n;

    σ_2 < α_n ≤ σ_3 and μ_I.n − μ_{n−1} > T_3.n;

    σ_3 < α_n ≤ σ_4 and μ_I.n − μ_{n−1} > T_4.n;

    wherein n denotes the sequence number of the image frame to be encoded; α_n denotes the motion amplitude of the image frame to be encoded; μ_I.n denotes the peak signal-to-noise ratio (PSNR) of the luma component of the I frame that precedes the image frame to be encoded and is closest to it; μ_{n−1} denotes the PSNR of the luma component of the image frame preceding the image frame to be encoded; and σ_1, σ_2, σ_3, σ_4, T_1.n, T_2.n, T_3.n, and T_4.n are positive thresholds, with σ_1 < σ_2, σ_2 < σ_3, σ_3 < σ_4, T_1.n > T_2.n, T_2.n > T_3.n, and T_3.n > T_4.n.

  10. The apparatus according to claim 7, wherein the encoding module is further configured to encode the image frame to be encoded as a P frame when the updated static variable is less than the first preset threshold.

  11. The apparatus according to any one of claims 7 to 10, wherein the apparatus further comprises:

    a switching module, configured to, after the image frame to be encoded has been encoded, obtain the next image frame to be encoded and move the sliding window backward by the distance of one image frame, so that the next image frame to be encoded is located at the tail of the sliding window;

    the second obtaining module being further configured to continue updating the static variable according to the motion amplitude difference of each image frame in the sliding window until the currently updated static variable is not less than the first preset threshold, at which point the encoding module encodes the current image frame to be encoded as an I frame.
  12. A video encoding device, comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to implement the following video encoding method:

    determining N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N−1 encoded image frames and one image frame to be encoded located at the tail of the window;

    obtaining a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of the image frame to its intra-frame prediction cost;

    updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable representing the number of consecutive static image frames determined so far; and

    encoding the image frame to be encoded as an I frame when the updated static variable is not less than a first preset threshold.

  13. The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method:

    when the motion amplitude of the image frame to be encoded and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determining that the image frame to be encoded is a static image frame, and adding 1 to the determined static variable to obtain the updated static variable; and

    when the first preset condition is not satisfied, determining that the image frame to be encoded is not a static image frame, and setting the updated static variable to 0;

    wherein the first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
  14. The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method:

    encoding the image frame to be encoded as an I frame when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied;

    the second preset condition comprising at least one of the following:

    0 < α_n ≤ σ_1 and μ_I.n − μ_{n−1} > T_1.n;

    σ_1 < α_n ≤ σ_2 and μ_I.n − μ_{n−1} > T_2.n;

    σ_2 < α_n ≤ σ_3 and μ_I.n − μ_{n−1} > T_3.n;

    σ_3 < α_n ≤ σ_4 and μ_I.n − μ_{n−1} > T_4.n;

    wherein n denotes the sequence number of the image frame to be encoded; α_n denotes the motion amplitude of the image frame to be encoded; μ_I.n denotes the peak signal-to-noise ratio (PSNR) of the luma component of the I frame that precedes the image frame to be encoded and is closest to it; μ_{n−1} denotes the PSNR of the luma component of the image frame preceding the image frame to be encoded; and σ_1, σ_2, σ_3, σ_4, T_1.n, T_2.n, T_3.n, and T_4.n are positive thresholds, with σ_1 < σ_2, σ_2 < σ_3, σ_3 < σ_4, T_1.n > T_2.n, T_2.n > T_3.n, and T_3.n > T_4.n.

  15. The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method:

    encoding the image frame to be encoded as a P frame when the updated static variable is less than the first preset threshold.

  16. The video encoding device according to any one of claims 12 to 15, wherein the instruction is further loaded and executed by the processor to implement the following method:

    after the image frame to be encoded has been encoded, obtaining the next image frame to be encoded, and moving the sliding window backward by the distance of one image frame, so that the next image frame to be encoded is located at the tail of the sliding window; and

    continuing to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until the currently updated static variable is not less than the first preset threshold, and then encoding the current image frame to be encoded as an I frame.
  17. The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method:

    obtaining the distance between the image frame to be encoded and the I frame that precedes the image frame to be encoded and is closest to it;

    calculating a fourth preset threshold according to the distance using the following formulas:

    [Three piecewise formulas computing F_bias from V_Scene, D, GOP_min, and GOP_max, rendered as images in the original: Figure PCTCN2018087539-appb-100004 to Figure PCTCN2018087539-appb-100006]

    wherein V_Scene denotes a set threshold; D denotes the distance; GOP_min denotes the preset minimum length of a group of pictures (GOP); GOP_max denotes the preset maximum length of a GOP; and F_bias denotes the fourth preset threshold; and

    encoding the image frame to be encoded as an I frame when the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold, the motion amplitude of the image frame to be encoded is not less than a fifth preset threshold, and the distance is not less than a sixth preset threshold.

  18. A computer-readable storage medium, storing at least one instruction, the instruction being loaded and executed by a processor to perform the operations performed in the video encoding method according to any one of claims 1 to 6.
PCT/CN2018/087539 2017-06-15 2018-05-18 Video encoding method, apparatus, device, and storage medium WO2018228130A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP18816657.3A EP3641310A4 (en) 2017-06-15 2018-05-18 VIDEO CODING METHOD, APPARATUS AND DEVICE, INFORMATION DEVICE AND MEDIUM
KR1020197024476A KR102225235B1 (ko) 2017-06-15 2018-05-18 비디오 인코딩 방법, 장치, 및 디바이스, 및 저장 매체
JP2019543009A JP6925587B2 (ja) 2017-06-15 2018-05-18 ビデオ符号化方法、装置、機器、及び記憶媒体
US16/401,671 US10893275B2 (en) 2017-06-15 2019-05-02 Video coding method, device, device and storage medium
US17/100,207 US11297328B2 (en) 2017-06-15 2020-11-20 Video coding method, device, device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710461367.7 2017-06-15
CN201710461367.7A CN109151469B (zh) 2017-06-15 Video encoding method, apparatus, and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/401,671 Continuation US10893275B2 (en) 2017-06-15 2019-05-02 Video coding method, device, device and storage medium

Publications (1)

Publication Number Publication Date
WO2018228130A1 true WO2018228130A1 (zh) 2018-12-20

Family

ID=64659774

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/087539 WO2018228130A1 (zh) 2017-06-15 2018-05-18 Video encoding method, apparatus, device, and storage medium

Country Status (6)

Country Link
US (2) US10893275B2 (zh)
EP (1) EP3641310A4 (zh)
JP (1) JP6925587B2 (zh)
KR (1) KR102225235B1 (zh)
CN (1) CN109151469B (zh)
WO (1) WO2018228130A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112272297A (zh) * 2020-10-28 2021-01-26 上海科江电子信息技术有限公司 Image-quality still-frame detection method embedded at the decoder side

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN112929701B (zh) * 2021-02-04 2023-03-17 浙江大华技术股份有限公司 Video encoding method, apparatus, device, and medium
CN113747159B (zh) * 2021-09-06 2023-10-13 深圳软牛科技有限公司 Method, apparatus, and related components for generating a variable-frame-rate video media file
CN115695889A (zh) * 2022-09-30 2023-02-03 聚好看科技股份有限公司 Display device and floating-window display method
CN116962685B (zh) * 2023-09-21 2024-01-30 杭州爱芯元智科技有限公司 Video encoding method, apparatus, electronic device, and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
US20110131622A1 (en) * 2006-02-27 2011-06-02 Cisco Technology, Inc. Method and apparatus for immediate display of multicast iptv over a bandwidth constrained network
CN105761263A (zh) * 2016-02-19 2016-07-13 浙江大学 Video key-frame extraction method based on shot boundary detection and clustering
CN105898313A (zh) * 2014-12-15 2016-08-24 江南大学 Novel video-synopsis-based scalable coding technique for surveillance video
CN106231301A (zh) * 2016-07-22 2016-12-14 上海交通大学 HEVC complexity control method based on coding-unit hierarchy and rate-distortion cost

Family Cites Families (16)

Publication number Priority date Publication date Assignee Title
JPH09327023 (ja) 1996-06-06 1997-12-16 Nippon Telegr & Teleph Corp <Ntt> Intra-frame/inter-frame coding switching method and image coding apparatus
US6025888A (en) * 1997-11-03 2000-02-15 Lucent Technologies Inc. Method and apparatus for improved error recovery in video transmission over wireless channels
JP4649318B2 (ja) * 2004-12-13 2011-03-09 キヤノン株式会社 Image encoding apparatus, image encoding method, program, and storage medium
US8654848B2 (en) * 2005-10-17 2014-02-18 Qualcomm Incorporated Method and apparatus for shot detection in video streaming
JP4688170B2 (ja) * 2007-01-26 2011-05-25 株式会社Kddi研究所 Moving picture encoding apparatus
US8938005B2 (en) * 2007-11-05 2015-01-20 Canon Kabushiki Kaisha Image encoding apparatus, method of controlling the same, and computer program
US9628811B2 (en) * 2007-12-17 2017-04-18 Qualcomm Incorporated Adaptive group of pictures (AGOP) structure determination
US8385404B2 (en) * 2008-09-11 2013-02-26 Google Inc. System and method for video encoding using constructed reference frame
CN101742293B (zh) * 2008-11-14 2012-11-28 北京中星微电子有限公司 Adaptive frame/field image coding method based on video motion characteristics
JP5215951B2 (ja) * 2009-07-01 2013-06-19 キヤノン株式会社 Encoding apparatus, control method therefor, and computer program
CN102572381A (zh) * 2010-12-29 2012-07-11 中国移动通信集团公司 Video surveillance scene discrimination method, surveillance image coding method thereof, and apparatus
WO2012096164A1 (ja) * 2011-01-12 2012-07-19 パナソニック株式会社 Image encoding method, image decoding method, image encoding device, and image decoding device
CN103796019B (zh) * 2012-11-05 2017-03-29 北京勤能通达科技有限公司 Balanced bit-rate coding method
KR20140110221A (ko) * 2013-03-06 2014-09-17 삼성전자주식회사 Video encoder, scene-change detection method, and video encoder control method
JP6365253B2 (ja) 2014-11-12 2018-08-01 富士通株式会社 Video data processing device, video data processing program, and video data processing method
CN106254873B (zh) * 2016-08-31 2020-04-03 广州市网星信息技术有限公司 Video encoding method and video encoding device


Non-Patent Citations (1)

Title
See also references of EP3641310A4 *

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN112272297A (zh) * 2020-10-28 2021-01-26 上海科江电子信息技术有限公司 Image-quality still-frame detection method embedded at the decoder side
CN112272297B (zh) * 2020-10-28 2023-01-31 上海科江电子信息技术有限公司 Image-quality still-frame detection method embedded at the decoder side

Also Published As

Publication number Publication date
US20210076044A1 (en) 2021-03-11
US11297328B2 (en) 2022-04-05
US10893275B2 (en) 2021-01-12
KR20190109476A (ko) 2019-09-25
JP2020509668A (ja) 2020-03-26
CN109151469A (zh) 2019-01-04
US20190260998A1 (en) 2019-08-22
CN109151469B (zh) 2020-06-30
KR102225235B1 (ko) 2021-03-09
EP3641310A1 (en) 2020-04-22
JP6925587B2 (ja) 2021-08-25
EP3641310A4 (en) 2020-05-06


Legal Events

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18816657; Country of ref document: EP; Kind code of ref document: A1)

ENP Entry into the national phase (Ref document number: 2019543009; Country of ref document: JP; Kind code of ref document: A)

ENP Entry into the national phase (Ref document number: 20197024476; Country of ref document: KR; Kind code of ref document: A)

NENP Non-entry into the national phase (Ref country code: DE)

ENP Entry into the national phase (Ref document number: 2018816657; Country of ref document: EP; Effective date: 20200115)