WO2018228130A1 - Video encoding method, apparatus, device, and storage medium - Google Patents
- Publication number: WO2018228130A1 (PCT/CN2018/087539)
- Authority: WIPO (PCT)
- Prior art keywords: image frame, encoded, frame, image, motion amplitude
Classifications
- H04N19/573 — Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
- H04N19/107 — Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
- H04N19/114 — Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
- H04N19/139 — Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
- H04N19/142 — Detection of scene cut or scene change
- H04N19/172 — Adaptive coding in which the coding unit is a picture, frame or field
- H04N19/176 — Adaptive coding in which the coding unit is a block, e.g. a macroblock
- H04N19/59 — Predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
- H04N19/61 — Transform coding in combination with predictive coding
- H04N7/14 — Systems for two-way working
- H04N7/141 — Systems for two-way working between two video terminals, e.g. videophone
Definitions
- the present application relates to the field of multimedia technologies, and in particular, to a video encoding method, apparatus, device, and storage medium.
- a sequence of image frames of a video includes a plurality of image frames.
- the amount of data transmitted is very large. Therefore, in order to reduce the amount of data, encoding is required before transmission.
- the H.264 standard is a commonly used video coding standard.
- the H.264 standard includes I frames and P frames.
- the I frame is a frame obtained by completely encoding the current image frame.
- the P frame is obtained by encoding the difference between the current image frame and the previous image frame.
- the encoded I frames and P frames may constitute a GOP (Group of Pictures), and one GOP starts with one I frame and ends with the P frame before the next I frame.
- when the difference between the current image frame and the previous image frame is relatively large, encoding the current image frame as a P frame degrades the encoding quality too much, so the current image frame can be encoded as an I frame.
- a residual between the current image frame and the previous image frame may be obtained when encoding is performed; the residual represents the difference between the two frames. It is determined whether the residual is greater than a preset threshold, and when it is, it may be determined that the video scene has switched, and the current image frame is encoded as an I frame.
- the inventors have found that the related art has at least the following drawbacks: the above method may result in continuous coding of too many P frames in some cases resulting in degradation of coding quality.
- the embodiment of the present application provides a video encoding method, apparatus, device, and storage medium, which can solve the problem of degraded encoding quality.
- the technical solution is as follows:
- a video encoding method for use in a video encoding device, the method comprising:
- determining N image frames on a sequence of image frames of the video according to a sliding window, the image frames in the sliding window comprising N-1 encoded image frames and an image frame to be encoded at the end of the window;
- acquiring a difference in motion amplitude of each image frame in the sliding window, where the difference in motion amplitude is the difference between the motion amplitude of the corresponding image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame is the ratio between its inter prediction cost and its intra prediction cost;
- updating a static variable according to the difference in motion amplitude of each image frame in the sliding window, the static variable representing the determined number of consecutive still image frames; and
- when the updated static variable is not less than a first preset threshold, encoding the image frame to be encoded as an I frame.
- a video encoding apparatus comprising:
- a determining module configured to determine N image frames on a sequence of image frames of the video according to the sliding window, wherein the image frame in the sliding window comprises N-1 encoded image frames and an image frame to be encoded at the end of the window;
- a first obtaining module configured to acquire a difference in motion amplitude of each image frame in the sliding window, where the difference in motion amplitude of an image frame is the difference between the motion amplitude of the corresponding image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame is the ratio between its inter prediction cost and its intra prediction cost;
- a second acquiring module configured to update a static variable according to a difference in motion amplitude of each image frame in the sliding window, where the static variable is used to represent the determined number of consecutive still image frames;
- an encoding module configured to encode the image frame to be encoded into an I frame when the updated static variable is not less than the first preset threshold.
- a video encoding apparatus comprising a processor and a memory, the memory storing at least one instruction that is loaded by the processor and executed to implement the video encoding method of the first aspect.
- in a fourth aspect, a computer readable storage medium having stored therein at least one instruction that is loaded by a processor and executed to implement the video encoding method of the first aspect.
- the method, apparatus, device, and storage medium provided by the embodiments of the present application determine N image frames according to the sliding window and update the static variable according to the difference in motion amplitude of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, it is determined that the video located in the sliding window has been in a still scene for a long time, and an I frame is encoded.
- the present application provides a method for determining whether a video is in a still scene for a long time, and encodes an I frame when it determines that the video has been in a still scene for a long time, thereby avoiding the insertion of too many P frames and improving the coding quality and coding efficiency.
- FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of another implementation environment provided by an embodiment of the present application.
- FIG. 3 is a flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 4A is a schematic diagram of a sliding window and an image frame provided by an embodiment of the present application.
- FIG. 4B is a schematic diagram of moving a sliding window provided by an embodiment of the present application.
- FIG. 5 is a flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 6 is a flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 7 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present disclosure.
- FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
- FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
- the embodiment of the present application provides a method for encoding an I frame when the video is in a still scene for a long time, which can improve the encoding quality.
- a scheme for encoding an I frame when a scene is switched in a video is provided, which can improve encoding quality while avoiding coding too many I frames.
- the embodiment of the present application provides an implementation environment, where the first device 101 and the second device 102 are included in the implementation environment, and the first device 101 and the second device 102 are connected through a network.
- the first device 101 transmits the video to the second device 102
- the video needs to be encoded first; the encoded video is sent to the second device 102, and the second device 102 decodes it to obtain the original video.
- the embodiment of the present application can be applied to a scenario in which a video is played online.
- the first device 101 is a video server for providing video
- the second device 102 is a terminal for playing video
- the video server obtains the encoded video.
- the video can be transcoded, and I frames are encoded by the encoding process provided by the embodiment of the present application, so that the encoded video is obtained and sent to the terminal; the terminal plays the video, and the user can watch the video on the terminal.
- the embodiment of the present application can also be applied to a video call scenario.
- the first device 101 and the second device 102 are two terminals for performing a video call, and the first device 101 and the second device 102 are connected through the server 103.
- the first device 101 obtains an image frame, encodes the image frame, and sends the image frame to the server 103 for forwarding to the second device 102.
- the second device 102 decodes the image frame and plays it.
- the first device 101 can acquire a plurality of image frames, and the second device 102 can continuously play a plurality of image frames to achieve the effect of playing the video, and the user can watch the video on the second device 102.
- FIG. 3 is a flowchart of a video encoding method according to an embodiment of the present application.
- the embodiment of the present application describes a process of encoding an I frame for an image frame in a still scene, where the execution subject is a video encoding device.
- the video encoding device can be a device with a video transmission function such as a terminal or a server. Referring to Figure 3, the method includes:
- the video encoding device determines N image frames on the sequence of image frames of the video according to the sliding window.
- the length of the sliding window is equal to N, and N is a positive integer greater than 1.
- N can be determined according to the frame rate of the video; for example, N can be two-thirds of the frame rate.
- the sliding window can be used to determine N image frames each time, and the N image frames include N-1 encoded image frames and an image frame to be encoded at the end of the sliding window.
- the image frame to be encoded can be encoded according to the image frame in the sliding window.
- the encoding of one image frame to be encoded is taken as an example.
- the image frame to be encoded may be the Nth image frame of the video, or may be any image frame after the Nth image frame.
- the step size of each movement of the sliding window can be set to 1, that is, the sliding window is moved by 1 frame each time; the Nth image frame and each image frame after it are encoded using the coding method provided by the embodiment of the present application.
- for the image frames before the Nth image frame other than the first image frame, since the distance between these frames and the I frame is relatively short, encoding them as P frames does not cause excessive degradation of the encoding quality, so they are encoded as P frames by default.
- the video encoding device acquires a difference in motion amplitude of each image frame in the sliding window.
- the motion amplitude of an image frame is the ratio between its inter prediction cost and its intra prediction cost, and represents how much the image frame changes compared with the previous image frame: the larger the change between the image frame and the previous image frame, the larger the motion amplitude; the smaller the change, the smaller the motion amplitude.
- the difference in motion amplitude of an image frame is the difference between the motion amplitude of the image frame and the motion amplitude of the previous image frame, and indicates the fluctuation between the variation amplitude of the image frame and that of the previous image frame: the larger the difference in motion amplitude, the more severe the fluctuation; the smaller the difference in motion amplitude, the gentler the fluctuation.
- the difference in motion amplitude can reflect the difference between the pictures displayed by two image frames: a large difference in motion amplitude indicates a large picture difference, and a small difference in motion amplitude indicates a small picture difference.
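As an illustrative sketch (function and variable names are hypothetical, not taken from the patent), the motion amplitude and its difference described above can be computed as:

```python
def motion_amplitude(inter_cost: float, intra_cost: float) -> float:
    # Motion amplitude is the ratio of the inter prediction cost to the
    # intra prediction cost; a larger ratio means a larger change from
    # the previous image frame.
    return inter_cost / intra_cost

def amplitude_difference(mu_n: float, mu_prev: float) -> float:
    # Difference in motion amplitude between the current image frame and
    # the previous image frame; its magnitude reflects how much the
    # displayed picture fluctuates.
    return mu_n - mu_prev

# Example: a nearly still frame has an inter cost much lower than its
# intra cost, so its motion amplitude is small (values are illustrative).
mu1 = motion_amplitude(120.0, 1000.0)
mu2 = motion_amplitude(130.0, 1000.0)
delta = amplitude_difference(mu2, mu1)
```

A near-zero difference indicates gentle fluctuation between consecutive frames, which is the property the still-scene test relies on.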
- the intra prediction cost is represented by I cost
- the inter prediction cost is represented by P cost
- the intra prediction cost Icost of an image frame may be obtained by downsampling the image frame, dividing the sampled image frame into a plurality of macroblocks of a specified size, calculating, for each macroblock, the prediction blocks in the intra prediction directions, and computing the SATD of the residual between the macroblock and each prediction block, thereby obtaining the optimal intra prediction cost Icost.
- the sampling amplitude at the time of downsampling may be determined according to requirements. For example, the length of the sampled image frame may be one-half of the length of the original image frame, and the width may be one-half of the width of the original image frame.
- the specified size can also be determined according to requirements, for example, 8*8.
- SATD refers to the absolute value summation after the residual is transformed by Hadamard.
- the inter prediction cost Pcost of an image frame may be obtained by downsampling the image frame and dividing the sampled image frame into a plurality of macroblocks of a specified size.
- for each macroblock, the optimal reference block is found using integer-pixel diamond search, and the SATD of the residual between the macroblock and the reference block is calculated, thereby obtaining the optimal inter prediction cost Pcost and the motion amplitude of the macroblock.
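The SATD computation used by both cost calculations can be sketched in pure Python. This is a minimal illustration (names hypothetical); a real encoder would use an optimized fixed-size Hadamard transform:

```python
def hadamard(n):
    # Sylvester construction of an n x n Hadamard matrix (n a power of two).
    h = [[1]]
    while len(h) < n:
        h = ([row + row for row in h] +
             [row + [-v for v in row] for row in h])
    return h

def satd(block, pred):
    # SATD: transform the residual between a macroblock and its prediction
    # block with a Hadamard matrix, then sum the absolute values of the
    # transformed coefficients.
    n = len(block)
    h = hadamard(n)
    resid = [[block[i][j] - pred[i][j] for j in range(n)] for i in range(n)]
    hr = [[sum(h[i][k] * resid[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]                      # H * R
    t = [[sum(hr[i][k] * h[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]                       # (H * R) * H^T
    return sum(abs(v) for row in t for v in row)

# For a 4x4 macroblock whose residual is all ones, SATD = 1 * 4 * 4 = 16.
block = [[1] * 4 for _ in range(4)]
pred = [[0] * 4 for _ in range(4)]
cost = satd(block, pred)
```

An identical block and prediction give a SATD of zero, so smaller values indicate a better prediction.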
- the motion amplitude difference of each image frame may be calculated before encoding; when the current image frame is encoded, its motion amplitude difference is calculated, and since the image frames before it already had their motion amplitude differences calculated when they were encoded, there is no need to recalculate the motion amplitude differences of those image frames.
- This direct acquisition method can greatly reduce the amount of computation in the encoding process.
- the video encoding device updates the static variable according to the difference in motion amplitude of each image frame in the sliding window.
- the still image frame refers to an image frame in a still scene
- the static variable is used to represent the determined number of consecutive still image frames
- the static variable gradually increases as the number of determined consecutive still image frames increases during the encoding process.
- whether the image frame to be encoded is a still image frame may be determined according to the difference in motion amplitude of each image frame in the sliding window, and the static variable is then updated according to the determination result and the previously determined static variable to obtain the updated static variable.
- the following process can be employed to determine a still image frame and update the static variable:
- the first preset condition is that the absolute value of the motion amplitude difference of the image frame to be encoded is smaller than the second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is smaller than the third preset threshold. Therefore, the video encoding device determines whether the absolute value of the motion amplitude difference of the image frame to be encoded is smaller than the second preset threshold, and whether the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is smaller than the third preset threshold.
- if yes, it is determined that the image frame to be encoded is a still image frame, and the determined static variable is incremented by one to obtain the updated static variable. If no, that is, when the first preset condition is not satisfied, it is determined that the image frame to be encoded is not a still image frame, the updated static variable is set to 0, and the counting of consecutive still image frames restarts.
- the video encoding device can determine the static variable using the following formula:
- f_n = f_{n-1} + 1, when |Δ_n| < Δ_T and |Σ_n| < Σ_T; otherwise f_n = 0;
- f_n represents the static variable determined according to the image frame to be encoded;
- f_{n-1} represents the static variable determined according to the previous image frame of the image frame to be encoded;
- Δ_n represents the difference in motion amplitude of the image frame to be encoded, Δ_n = μ_n - μ_{n-1};
- μ_n represents the motion amplitude of the image frame to be encoded;
- μ_{n-1} represents the motion amplitude of the previous image frame of the image frame to be encoded;
- Σ_n represents the sum of the motion amplitude differences of all image frames in the sliding window;
- n is a positive integer greater than 1;
- Δ_T represents the second preset threshold;
- Σ_T represents the third preset threshold.
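The piecewise update of the static variable can be sketched as follows; the threshold values in the example are illustrative, not taken from the patent:

```python
def update_static_variable(f_prev, delta_n, sigma_n, delta_t, sigma_t):
    # f_n = f_{n-1} + 1 when |delta_n| < delta_t and |sigma_n| < sigma_t
    # (the frame is judged still); otherwise the count restarts at 0.
    if abs(delta_n) < delta_t and abs(sigma_n) < sigma_t:
        return f_prev + 1
    return 0

# Illustrative thresholds: a gently fluctuating frame extends the run of
# still frames; a large fluctuation resets the counter.
f_still = update_static_variable(4, 0.005, 0.02, delta_t=0.01, sigma_t=0.05)
f_reset = update_static_variable(5, 0.2, 0.3, delta_t=0.01, sigma_t=0.05)
```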
- the video encoding device determines whether the updated static variable is less than a first preset threshold. If yes, step 305 is performed, and if no, step 306 is performed.
- the condition for encoding an I frame is set as follows: the number of consecutive still image frames reaches the first preset threshold. Under such a condition, the video has been in a still scene for a long time and enough P frames have already been encoded from the still image frames; if P frames continue to be encoded, the encoding quality will degrade too much, so an I frame is encoded.
- after the video encoding device obtains the updated static variable, it determines whether the updated static variable is smaller than the first preset threshold. If so, the number of consecutive still image frames has not exceeded the upper limit, and P frames may continue to be encoded for the purpose of improving coding efficiency. If not, the number of consecutive still image frames is too large, and an I frame needs to be encoded for the purpose of improving the encoding quality.
- the first preset threshold may be comprehensively determined by the requirement for the encoding quality and the coding efficiency, for example, the first preset threshold may be equal to N.
- the video encoding device encodes the image frame to be encoded into a P frame.
- the difference data between the image frame to be encoded and the previous image frame is obtained, and the difference data is encoded, and the encoded image frame is obtained as a P frame.
- the video encoding device encodes the image frame to be encoded into an I frame.
- the image frame to be encoded is encoded directly according to its own data, and the encoded image frame is an I frame.
- the condition for encoding an I frame may include a second preset condition in addition to the condition that the static variable is not less than the first preset threshold. Accordingly, steps 304-306 may be replaced by the following: determining whether the updated static variable is smaller than the first preset threshold and whether the second preset condition is met; when the updated static variable is not less than the first preset threshold and the second preset condition is satisfied, the image frame to be encoded is encoded as an I frame; when the updated static variable is smaller than the first preset threshold or the second preset condition is not satisfied, the image frame to be encoded is encoded as a P frame.
- with the second preset condition, when the motion amplitude of the image frame is small, indicating that the picture is relatively stable, encoding a P frame does not cause excessive degradation of the encoding quality, so there is no need to encode too many I frames; an I frame is encoded only when the PSNR drops by a large margin, which not only ensures the coding quality but also improves the coding efficiency as much as possible.
- when the motion amplitude of the image frame is large, the picture of the image frame changes severely. In this case, to avoid excessive degradation of the coding quality, the condition for encoding an I frame can be relaxed, and an I frame can be encoded even when the PSNR decreases by a small amount.
- the thresholds used by different image frames may be equal or not equal.
- the thresholds T_{1,n}, T_{2,n}, T_{3,n}, and T_{4,n} used for the nth image frame are updated to obtain T_{1,n+1}, T_{2,n+1}, T_{3,n+1}, and T_{4,n+1}, and the updated thresholds are used when the (n+1)th image frame is determined.
- the update is as follows:
- T_{1,n+1} = (1 - ω_T) · T_{1,n} + ω_T · (P_I - P_{n-1}), when 0 ≤ μ_n < μ_1;
- T_{2,n+1} = (1 - ω_T) · T_{2,n} + ω_T · (P_I - P_{n-1}), when μ_1 ≤ μ_n < μ_2;
- T_{3,n+1} = (1 - ω_T) · T_{3,n} + ω_T · (P_I - P_{n-1}), when μ_2 ≤ μ_n < μ_3;
- T_{4,n+1} = (1 - ω_T) · T_{4,n} + ω_T · (P_I - P_{n-1}), when μ_3 ≤ μ_n < μ_4;
- where ω_T is the update weight, P_I is the PSNR of the most recently encoded I frame, P_{n-1} is the PSNR of the (n-1)th image frame, and μ_1 to μ_4 are the bounds of the motion amplitude intervals.
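Assuming the update is an exponential-moving-average blend driven by the PSNR drop relative to the last I frame, a hedged sketch might look like this; all names, the weight, and the interval bounds are assumptions, not values from the patent:

```python
def update_threshold(t_n, psnr_i, psnr_prev, weight=0.5):
    # Blend the previous threshold with the PSNR drop relative to the most
    # recently encoded I frame (exponential-moving-average style update).
    return (1 - weight) * t_n + weight * (psnr_i - psnr_prev)

def select_threshold(mu_n, bounds, thresholds):
    # Pick the threshold for the motion amplitude interval containing mu_n;
    # larger motion amplitude selects a more relaxed threshold.
    for bound, t in zip(bounds, thresholds):
        if mu_n < bound:
            return t
    return thresholds[-1]

# Illustrative values only.
t_next = update_threshold(2.0, psnr_i=40.0, psnr_prev=38.0, weight=0.5)
t_used = select_threshold(0.3, bounds=[0.25, 0.5, 0.75, 1.0],
                          thresholds=[3.0, 2.5, 2.0, 1.5])
```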
- next, the next image frame to be encoded may be acquired, and the sliding window is moved backward by the distance of one image frame, so that the next image frame to be encoded is at the end of the sliding window.
- steps 301-306 can be repeatedly performed to continue encoding the next image frame to be encoded, and so on.
- the sliding window is moved backward according to the distance of one image frame, so that the N+1th image frame is located at the end of the sliding window.
- the N+1th image frame is encoded at this time.
- the video encoding device may further set a preset maximum length of the GOP in a configuration file, where the preset maximum length specifies the maximum length between two adjacent I frames. Then, when encoding the image frame to be encoded,
- the video encoding device not only uses the condition in step 304 above to determine whether to encode an I frame, but also obtains the distance between the image frame to be encoded and the closest preceding I frame and determines whether the distance reaches the preset maximum length. When the distance reaches the preset maximum length, the image frame to be encoded is encoded as an I frame even if the condition in step 304 is not currently satisfied.
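Combining the static-variable condition of step 304 with the preset maximum GOP length can be sketched as follows (names and example values are hypothetical):

```python
def should_encode_i_frame(static_var, first_threshold,
                          distance_to_last_i, gop_max_length):
    # Force an I frame when the GOP reaches its preset maximum length;
    # otherwise encode an I frame only when enough consecutive still
    # image frames have accumulated (the condition of step 304).
    if distance_to_last_i >= gop_max_length:
        return True
    return static_var >= first_threshold

# GOP limit reached: I frame is forced even though the counter is low.
force = should_encode_i_frame(10, 20, distance_to_last_i=250, gop_max_length=250)
# Neither condition met: keep encoding P frames.
keep_p = should_encode_i_frame(10, 20, distance_to_last_i=100, gop_max_length=250)
# Counter reached the first preset threshold: encode an I frame.
by_still = should_encode_i_frame(20, 20, distance_to_last_i=100, gop_max_length=250)
```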
- in the method provided by the embodiment of the present application, N image frames are determined according to the sliding window, and the static variable is updated according to the difference in motion amplitude of each image frame in the sliding window; when the updated static variable is not less than the first preset threshold, it is determined that the video has been in a still scene for a long time, and an I frame is encoded.
- the present application provides a method for determining whether a video is in a still scene for a long time and encodes an I frame when it determines that the video has been in a still scene for a long time, which avoids inserting too many P frames and improves the coding quality.
- after the I frame is encoded, subsequently encoded P frames occupy fewer bits and the PSNR increases, thereby improving the coding efficiency.
- the present application uses the ratio of the inter prediction cost to the intra prediction cost to represent the current motion amplitude, uses the PSNR of the encoded image frame to represent the distortion, uses the sliding window to pre-analyze the image frames in the sliding window, and, combined with the change of the motion amplitude, uses a piecewise function to implement an adaptive I frame algorithm, which improves the coding efficiency.
- the preset maximum GOP length in the configuration file can therefore be set to a larger value, so that I frames are seldom forced by the preset maximum length; instead, whether to encode an I frame is mostly determined by the condition of the image frame itself, which greatly improves coding efficiency.
- the algorithm adopted in this embodiment of the present application is assembly-optimized for processor architectures such as armv7 and arm64, which improves processing speed.
- FIG. 5 is a flowchart of a video encoding method according to an embodiment of the present application.
- the embodiment of the present application takes a process of encoding three image frames as an example, and the execution subject is a video encoding device.
- the method includes:
- create a sliding window whose length is equal to N, encode the first image frame of the video into an I frame, and encode the second to the (N-1)th image frames of the video into P frames.
- these N-1 image frames occupy the first N-1 positions of the sliding window. When the Nth image frame is acquired, it becomes the current image frame to be encoded and sits at the tail of the sliding window.
- N is assumed to be 3.
- suppose the Nth image frame is a still image frame; the static variable is then 1. Because the static variable 1 is less than N,
- the Nth image frame is encoded as a P frame.
- the sliding window includes a second image frame to an N+1th image frame.
- suppose the (N+1)th image frame is a still image frame; the static variable is updated to 2. Because the static variable 2 is less than N, the (N+1)th image frame is encoded as a P frame.
- the sliding window includes a third image frame to an N+2th image frame.
- suppose the (N+2)th image frame is a still image frame; the static variable is updated to 3. Because the static variable 3 is equal to N, the (N+2)th image frame is encoded as an I frame.
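The three-frame walkthrough above can be sketched as a minimal Python loop; the helper names are illustrative, and the still/non-still decision for each frame is assumed to be given:

```python
def update_static_variable(static_count: int, is_still: bool) -> int:
    # consecutive still frames accumulate; any non-still frame resets
    return static_count + 1 if is_still else 0

def choose_frame_type(static_count: int, n: int) -> str:
    # encode an I frame once N consecutive still frames have been seen
    return "I" if static_count >= n else "P"

N = 3
count = 0
types = []
for _ in range(3):  # the Nth, (N+1)th and (N+2)th frames, all still
    count = update_static_variable(count, is_still=True)
    types.append(choose_frame_type(count, N))
print(types)  # counts 1, 2, 3 yield P, P, I, matching the walkthrough
```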
- FIG. 6 is a flowchart of a video encoding method according to an embodiment of the present disclosure.
- the embodiment of the present application describes a process of encoding an I frame during scene switching, and the execution subject is a video encoding device.
- the method includes:
- V_Scene represents a set threshold, which may be predetermined by the video encoding device; D represents the distance; GOP_min represents the preset minimum length of the group of pictures (GOP); GOP_max represents the preset maximum length of the GOP; and F_bias represents the fourth preset threshold.
- when the condition is satisfied, the image frame to be encoded is encoded as an I frame.
- otherwise, the image frame to be encoded is encoded as a P frame.
- the related art encodes an I frame whenever the video scene is switched.
- in this embodiment, the motion amplitude of an image frame to be encoded as an I frame should be greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and the distance between the encoded I frame and the previous
- I frame cannot be less than the sixth preset threshold.
- the fifth preset threshold is used to determine the minimum motion amplitude of an image frame to be encoded as an I frame,
- and the sixth preset threshold is used to determine the minimum distance between two adjacent I frames; both thresholds may be determined by weighing encoding quality against encoding efficiency.
- the fifth preset threshold may be 0.8
- the sixth preset threshold may be one-half of the frame rate.
- in an implementation, the video encoding device may first determine whether the distance is smaller than the sixth preset threshold; if so, it directly encodes the image frame to be encoded into a P frame, and if not, it determines whether the motion amplitude of the image frame to be encoded is greater than the difference between the first preset threshold and the fourth preset threshold, and encodes the image frame to be encoded as an I frame or a P frame according to the result of that determination.
- the video encoding device can use the foregoing steps 301-304 to perform the determination, and can also use the above steps 601-603, to decide whether the image frame to be encoded should be encoded as an I frame or a P frame.
- the method provided by this embodiment of the present application ensures that the motion amplitude of the image frame to be encoded is greater than the difference between 1 and the fourth preset threshold and not less than the fifth preset threshold, and that the distance between the image frame to be encoded and the previous I frame is not less than the sixth preset threshold, before
- the image frame to be encoded is encoded into an I frame, which avoids encoding too many I frames and improves coding efficiency.
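Under the example thresholds given above (0.8 as the fifth preset threshold, half the frame rate as the sixth), the scene-switch decision of steps 601-603 might be sketched as follows. The parameter names and the exact form of the motion-amplitude test are assumptions for illustration, not the patent's normative definition:

```python
def scene_cut_frame_type(alpha_n: float, distance: int,
                         f_bias: float = 0.2,
                         min_amplitude: float = 0.8,
                         min_distance: int = 15) -> str:
    """Decide I vs P at a suspected scene switch.

    alpha_n       -- motion amplitude (inter/intra prediction cost ratio)
    distance      -- frames since the closest preceding I frame
    f_bias        -- the 'fourth preset threshold' (value assumed)
    min_amplitude -- the 'fifth preset threshold' (0.8 in the text)
    min_distance  -- the 'sixth preset threshold' (half the frame
                     rate, e.g. 15 for 30 fps)
    """
    if distance < min_distance:
        return "P"  # too close to the previous I frame
    if alpha_n > 1.0 - f_bias and alpha_n >= min_amplitude:
        return "I"  # motion amplitude large enough: treat as scene switch
    return "P"
```

Checking the distance first, as the text suggests, lets the cheap test short-circuit the motion-amplitude comparison.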
- FIG. 7 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
- the device includes:
- the first obtaining module 702 is configured to perform step 302 above;
- the second obtaining module 703 is configured to perform step 303 above;
- the encoding module 704 is configured to perform the above steps 305 or 306.
- the second acquisition module 703 is configured to update the static variable.
- the encoding module 704 is configured to encode the image frame to be encoded into an I frame when the updated static variable is not less than the first preset threshold and the second preset condition is met.
- the second preset condition includes at least one of the following conditions:
- T_{1.n+1} = (1 - Δ_T) · T_{1.n} + Δ_T · (μ_I - μ_{n-1}), when 0 < α_n ≤ σ_1;
- T_{2.n+1} = (1 - Δ_T) · T_{2.n} + Δ_T · (μ_I - μ_{n-1}), when σ_1 < α_n ≤ σ_2;
- T_{3.n+1} = (1 - Δ_T) · T_{3.n} + Δ_T · (μ_I - μ_{n-1}), when σ_2 < α_n ≤ σ_3;
- T_{4.n+1} = (1 - Δ_T) · T_{4.n} + Δ_T · (μ_I - μ_{n-1}), when σ_3 < α_n ≤ σ_4;
- where Δ_T is a positive threshold
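Read as an exponential moving average, the threshold update above smooths each T_{k.n} toward the current PSNR gap μ_I - μ_{n-1}. A one-line Python sketch, with the default Δ_T value assumed for illustration:

```python
def update_threshold(t_prev: float, mu_i: float, mu_prev: float,
                     delta_t: float = 0.5) -> float:
    # T_{k, n+1} = (1 - Δ_T) * T_{k, n} + Δ_T * (μ_I - μ_{n-1})
    return (1.0 - delta_t) * t_prev + delta_t * (mu_i - mu_prev)
```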
- the encoding module 704 is further configured to encode the image frame to be encoded into a P frame when the updated static variable is less than the first preset threshold.
- the device further includes:
- a switching module, configured to acquire the next image frame to be encoded after the current frame has been encoded, and to move the sliding window backward by the distance of one image frame
- the second obtaining module 703 is further configured to continue to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until the currently updated static variable is not less than the first preset threshold, at which point
- the encoding module 704 encodes the current image frame to be encoded into an I frame.
- the device further includes:
- a third obtaining module configured to perform step 601 above;
- a calculation module configured to perform step 602 above
- the encoding module 704 is configured to perform the above steps 604 or 605.
- when the video encoding apparatus provided by the foregoing embodiment performs encoding, the division into the functional modules described above is merely illustrative. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the video encoding device may be divided into different functional modules to complete all or part of the functions described above. In addition, the video encoding apparatus provided by the foregoing embodiment and the video encoding method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, and details are not described herein again.
- FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application.
- the terminal can be used to implement the functions performed by the video encoding device in the video encoding method shown in the above embodiments. Specifically:
- the terminal 800 may include a radio frequency (RF) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a transmission module 170, a processor 180 including one or more processing cores, a power supply 190, and the like. It will be understood by those skilled in the art that the terminal structure shown in FIG. 8 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine certain components, or use different component arrangements. Specifically:
- the RF circuit 110 can be used for receiving and transmitting signals in the course of sending and receiving information or during a call. In particular, after receiving downlink information from a base station, the RF circuit hands it to one or more processors 180 for processing; in addition, it sends uplink data to the base station.
- the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a low-noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 110 can also communicate with networks and other terminals through wireless communication.
- the wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
- the memory 120 can be used to store software programs and modules, such as the software programs and modules corresponding to the terminals shown in the above exemplary embodiments, and the processor 180 executes various functional applications by running software programs and modules stored in the memory 120. And data processing, such as implementing video-based interactions.
- the memory 120 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the terminal 800 (such as audio data and a phone book), and the like.
- the memory 120 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.
- the input unit 130 can be configured to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function controls.
- input unit 130 can include touch-sensitive surface 131 as well as other input terminals 132.
- the touch-sensitive surface 131, also referred to as a touch display screen or a touchpad, can collect touch operations performed by the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 131 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connected device according to a preset program.
- the touch-sensitive surface 131 can include two portions of a touch detection device and a touch controller.
- the touch detection device detects the touch orientation of the user and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to
- the processor 180; it can also receive commands from the processor 180 and execute them.
- in addition, the touch-sensitive surface 131 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types.
- the input unit 130 may also include other input terminals 132.
- other input terminals 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
- Display unit 140 can be used to display information entered by the user or information provided to the user and various graphical user interfaces of terminal 800, which can be constructed from graphics, text, icons, video, and any combination thereof.
- the display unit 140 may include a display panel 141.
- the display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
- further, the touch-sensitive surface 131 may cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, the operation is transmitted to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event.
- although in FIG. 8 the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components to provide input and output functions, in some embodiments the touch-sensitive surface 131 can be integrated with the display panel 141 to implement the input and output functions.
- Terminal 800 can also include at least one type of sensor 150, such as a light sensor, motion sensor, and other sensors.
- the light sensor may include an ambient light sensor and a proximity sensor; the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the terminal 800 is moved to the ear.
- as one kind of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes), and can detect the magnitude and direction of gravity when stationary;
- it can be used for applications that recognize the posture of the mobile phone (such as switching between landscape and portrait modes, related games, and magnetometer attitude calibration), vibration-recognition functions (such as a pedometer or tapping), and the like. The terminal 800 can also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described herein again.
- the audio circuit 160, the speaker 161, and the microphone 162 can provide an audio interface between the user and the terminal 800.
- on one hand, the audio circuit 160 can transmit the electrical signal converted from received audio data to the speaker 161, which converts it into a sound signal for output; on the other hand, the microphone 162 converts a collected sound signal into an electrical signal, which the audio circuit 160 receives and converts into audio data. After being processed by the audio data output processor 180, the audio data is transmitted, for example, via the RF circuit 110 to another terminal, or output to the memory 120 for further processing.
- the audio circuit 160 may also include an earbud jack to provide communication of the peripheral earphones with the terminal 800.
- the terminal 800 can help the user to send and receive emails, browse web pages, access streaming media, etc. through the transmission module 170, which provides users with wireless or wired broadband Internet access.
- although FIG. 8 shows the transmission module 170, it can be understood that it is not an essential part of the terminal 800 and may be omitted as needed without changing the essence of the invention.
- the processor 180 is the control center of the terminal 800; it connects various parts of the entire terminal using various interfaces and lines, and performs the various functions of the terminal 800 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the terminal as a whole.
- the processor 180 may include one or more processing cores; preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
- the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 180.
- the terminal 800 also includes a power source 190 (such as a battery) for powering various components.
- the power source can be logically coupled to the processor 180 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system.
- Power supply 190 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
- the terminal 800 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
- the display unit of the terminal 800 is a touch screen display, and the terminal 800 further includes a memory and at least one instruction, wherein at least one instruction is stored in the memory and configured to be loaded and executed by one or more processors To implement the video encoding method in the above embodiment.
- FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
- the server 900 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 922 (for example, one or more processors), a memory 932, and one or more storage media 930 (for example, one or more mass storage devices) that store an application 942 or data 944.
- the memory 932 and the storage medium 930 may be short-term storage or persistent storage.
- the program stored on storage medium 930 may include one or more modules (not shown), each of which may include a series of instructions in the server.
- the central processing unit 922 can be configured to communicate with the storage medium 930, and load and execute a series of instructions in the storage medium 930 on the server 900 to implement the video encoding method in the above embodiments.
- Server 900 may also include one or more power sources 926, one or more wired or wireless network interfaces 950, one or more input and output interfaces 959, one or more keyboards 956, and/or one or more operating systems 941.
- the one or more operating systems 941 include, for example, Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
- the server 900 can be configured to perform the steps performed by the video encoding device in the video encoding method provided by the foregoing embodiments.
- an embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores at least one instruction that is loaded and executed by a processor to implement the operations performed in the video encoding method of the above embodiments.
- a person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium.
- the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Claims (18)
- A video encoding method, applied to a video encoding device, the method comprising: determining N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N-1 encoded image frames and one to-be-encoded image frame at the tail of the window; obtaining a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of that image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of that image frame to its intra-frame prediction cost; updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable indicating the number of consecutive still image frames that have been determined; and, when the updated static variable is not less than a first preset threshold, encoding the to-be-encoded image frame into an I frame.
- The method according to claim 1, wherein updating the static variable according to the motion amplitude difference of each image frame in the sliding window comprises: when the motion amplitude of the to-be-encoded image frame and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determining that the to-be-encoded image frame is a still image frame, and adding 1 to the previously determined static variable to obtain the updated static variable; when the first preset condition is not satisfied, determining that the to-be-encoded image frame is not a still image frame, and setting the updated static variable to 0; wherein the first preset condition is that the absolute value of the motion amplitude difference of the to-be-encoded image frame is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
- The method according to claim 1, wherein encoding the to-be-encoded image frame into an I frame when the updated static variable is not less than the first preset threshold comprises: when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied, encoding the to-be-encoded image frame into an I frame; the second preset condition comprising at least one of the following conditions: 0 < α_n ≤ σ_1 and μ_{I.n} - μ_{n-1} > T_{1.n}; σ_1 < α_n ≤ σ_2 and μ_{I.n} - μ_{n-1} > T_{2.n}; σ_2 < α_n ≤ σ_3 and μ_{I.n} - μ_{n-1} > T_{3.n}; σ_3 < α_n ≤ σ_4 and μ_{I.n} - μ_{n-1} > T_{4.n}; wherein n is the sequence number of the to-be-encoded image frame; α_n is the motion amplitude of the to-be-encoded image frame; μ_{I.n} is the peak signal-to-noise ratio (PSNR) of the luminance component of the I frame that precedes and is closest to the to-be-encoded image frame; μ_{n-1} is the PSNR of the luminance component of the image frame preceding the to-be-encoded image frame; and σ_1, σ_2, σ_3, σ_4 and T_{1.n}, T_{2.n}, T_{3.n}, T_{4.n} are positive thresholds, with σ_1 < σ_2 < σ_3 < σ_4 and T_{1.n} > T_{2.n} > T_{3.n} > T_{4.n}.
- The method according to claim 1, wherein, after updating the static variable according to the motion amplitude difference of each image frame in the sliding window, the method further comprises: when the updated static variable is less than the first preset threshold, encoding the to-be-encoded image frame into a P frame.
- The method according to any one of claims 1 to 4, further comprising: after the to-be-encoded image frame has been encoded, obtaining the next to-be-encoded image frame, and moving the sliding window backward by the distance of one image frame so that the next to-be-encoded image frame is at the tail of the sliding window; and continuing to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until a currently updated static variable is not less than the first preset threshold, at which point the current to-be-encoded image frame is encoded into an I frame.
- A video encoding apparatus, the apparatus comprising: a determining module, configured to determine N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N-1 encoded image frames and one to-be-encoded image frame at the tail of the window; a first obtaining module, configured to obtain a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of that image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of that image frame to its intra-frame prediction cost; a second obtaining module, configured to update a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable indicating the number of consecutive still image frames that have been determined; and an encoding module, configured to encode the to-be-encoded image frame into an I frame when the updated static variable is not less than the first preset threshold.
- The apparatus according to claim 7, wherein the second obtaining module is configured to: when the motion amplitude of the to-be-encoded image frame and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determine that the to-be-encoded image frame is a still image frame, and add 1 to the previously determined static variable to obtain the updated static variable; when the first preset condition is not satisfied, determine that the to-be-encoded image frame is not a still image frame, and set the updated static variable to 0; wherein the first preset condition is that the absolute value of the motion amplitude difference of the to-be-encoded image frame is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
- The apparatus according to claim 7, wherein the encoding module is configured to encode the to-be-encoded image frame into an I frame when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied; the second preset condition comprising at least one of the following conditions: 0 < α_n ≤ σ_1 and μ_{I.n} - μ_{n-1} > T_{1.n}; σ_1 < α_n ≤ σ_2 and μ_{I.n} - μ_{n-1} > T_{2.n}; σ_2 < α_n ≤ σ_3 and μ_{I.n} - μ_{n-1} > T_{3.n}; σ_3 < α_n ≤ σ_4 and μ_{I.n} - μ_{n-1} > T_{4.n}; wherein n is the sequence number of the to-be-encoded image frame; α_n is the motion amplitude of the to-be-encoded image frame; μ_{I.n} is the peak signal-to-noise ratio (PSNR) of the luminance component of the I frame that precedes and is closest to the to-be-encoded image frame; μ_{n-1} is the PSNR of the luminance component of the image frame preceding the to-be-encoded image frame; and σ_1, σ_2, σ_3, σ_4 and T_{1.n}, T_{2.n}, T_{3.n}, T_{4.n} are positive thresholds, with σ_1 < σ_2 < σ_3 < σ_4 and T_{1.n} > T_{2.n} > T_{3.n} > T_{4.n}.
- The apparatus according to claim 7, wherein the encoding module is further configured to encode the to-be-encoded image frame into a P frame when the updated static variable is less than the first preset threshold.
- The apparatus according to any one of claims 7 to 10, further comprising: a switching module, configured to: after the to-be-encoded image frame has been encoded, obtain the next to-be-encoded image frame, and move the sliding window backward by the distance of one image frame so that the next to-be-encoded image frame is at the tail of the sliding window; the second obtaining module being further configured to continue to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until a currently updated static variable is not less than the first preset threshold, at which point the encoding module encodes the current to-be-encoded image frame into an I frame.
- A video encoding device, comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the following video encoding method: determining N image frames in an image frame sequence of a video according to a sliding window, the image frames in the sliding window comprising N-1 encoded image frames and one to-be-encoded image frame at the tail of the window; obtaining a motion amplitude difference of each image frame in the sliding window, the motion amplitude difference of an image frame being the difference between the motion amplitude of that image frame and the motion amplitude of the previous image frame, and the motion amplitude of an image frame being the ratio of the inter-frame prediction cost of that image frame to its intra-frame prediction cost; updating a static variable according to the motion amplitude difference of each image frame in the sliding window, the static variable indicating the number of consecutive still image frames that have been determined; and, when the updated static variable is not less than a first preset threshold, encoding the to-be-encoded image frame into an I frame.
- The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method: when the motion amplitude of the to-be-encoded image frame and the motion amplitudes of all image frames in the sliding window satisfy a first preset condition, determining that the to-be-encoded image frame is a still image frame, and adding 1 to the previously determined static variable to obtain the updated static variable; when the first preset condition is not satisfied, determining that the to-be-encoded image frame is not a still image frame, and setting the updated static variable to 0; wherein the first preset condition is that the absolute value of the motion amplitude difference of the to-be-encoded image frame is less than a second preset threshold and the absolute value of the sum of the motion amplitude differences of all image frames in the sliding window is less than a third preset threshold.
- The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method: when the updated static variable is not less than the first preset threshold and a second preset condition is satisfied, encoding the to-be-encoded image frame into an I frame; the second preset condition comprising at least one of the following conditions: 0 < α_n ≤ σ_1 and μ_{I.n} - μ_{n-1} > T_{1.n}; σ_1 < α_n ≤ σ_2 and μ_{I.n} - μ_{n-1} > T_{2.n}; σ_2 < α_n ≤ σ_3 and μ_{I.n} - μ_{n-1} > T_{3.n}; σ_3 < α_n ≤ σ_4 and μ_{I.n} - μ_{n-1} > T_{4.n}; wherein n is the sequence number of the to-be-encoded image frame; α_n is the motion amplitude of the to-be-encoded image frame; μ_{I.n} is the peak signal-to-noise ratio (PSNR) of the luminance component of the I frame that precedes and is closest to the to-be-encoded image frame; μ_{n-1} is the PSNR of the luminance component of the image frame preceding the to-be-encoded image frame; and σ_1, σ_2, σ_3, σ_4 and T_{1.n}, T_{2.n}, T_{3.n}, T_{4.n} are positive thresholds, with σ_1 < σ_2 < σ_3 < σ_4 and T_{1.n} > T_{2.n} > T_{3.n} > T_{4.n}.
- The video encoding device according to claim 12, wherein the instruction is further loaded and executed by the processor to implement the following method: when the updated static variable is less than the first preset threshold, encoding the to-be-encoded image frame into a P frame.
- The video encoding device according to any one of claims 12 to 15, wherein the instruction is further loaded and executed by the processor to implement the following method: after the to-be-encoded image frame has been encoded, obtaining the next to-be-encoded image frame, and moving the sliding window backward by the distance of one image frame so that the next to-be-encoded image frame is at the tail of the sliding window; and continuing to update the static variable according to the motion amplitude difference of each image frame in the sliding window, until a currently updated static variable is not less than the first preset threshold, at which point the current to-be-encoded image frame is encoded into an I frame.
- A computer-readable storage medium, storing at least one instruction that is loaded and executed by a processor to implement the operations performed in the video encoding method according to any one of claims 1 to 6.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18816657.3A EP3641310A4 (en) | 2017-06-15 | 2018-05-18 | VIDEO CODING METHOD, APPARATUS AND DEVICE, INFORMATION DEVICE AND MEDIUM |
KR1020197024476A KR102225235B1 (ko) | 2017-06-15 | 2018-05-18 | 비디오 인코딩 방법, 장치, 및 디바이스, 및 저장 매체 |
JP2019543009A JP6925587B2 (ja) | 2017-06-15 | 2018-05-18 | ビデオ符号化方法、装置、機器、及び記憶媒体 |
US16/401,671 US10893275B2 (en) | 2017-06-15 | 2019-05-02 | Video coding method, device, device and storage medium |
US17/100,207 US11297328B2 (en) | 2017-06-15 | 2020-11-20 | Video coding method, device, device and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710461367.7 | 2017-06-15 | ||
CN201710461367.7A CN109151469B (zh) | 2017-06-15 | 2017-06-15 | 视频编码方法、装置及设备 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/401,671 Continuation US10893275B2 (en) | 2017-06-15 | 2019-05-02 | Video coding method, device, device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018228130A1 true WO2018228130A1 (zh) | 2018-12-20 |
Family
ID=64659774
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/087539 WO2018228130A1 (zh) | 2017-06-15 | 2018-05-18 | 视频编码方法、装置、设备及存储介质 |
Country Status (6)
Country | Link |
---|---|
US (2) | US10893275B2 (zh) |
EP (1) | EP3641310A4 (zh) |
JP (1) | JP6925587B2 (zh) |
KR (1) | KR102225235B1 (zh) |
CN (1) | CN109151469B (zh) |
WO (1) | WO2018228130A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112272297A (zh) * | 2020-10-28 | 2021-01-26 | 上海科江电子信息技术有限公司 | 嵌入在解码器端的图像质量静止帧检测方法 |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112929701B (zh) * | 2021-02-04 | 2023-03-17 | 浙江大华技术股份有限公司 | 一种视频编码方法、装置、设备及介质 |
CN113747159B (zh) * | 2021-09-06 | 2023-10-13 | 深圳软牛科技有限公司 | 一种生成可变帧率视频媒体文件的方法、装置及相关组件 |
CN115695889A (zh) * | 2022-09-30 | 2023-02-03 | 聚好看科技股份有限公司 | 显示设备及悬浮窗显示方法 |
CN116962685B (zh) * | 2023-09-21 | 2024-01-30 | 杭州爱芯元智科技有限公司 | 视频编码方法、装置、电子设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110131622A1 (en) * | 2006-02-27 | 2011-06-02 | Cisco Technology, Inc. | Method and apparatus for immediate display of multicast iptv over a bandwidth constrained network |
CN105761263A (zh) * | 2016-02-19 | 2016-07-13 | 浙江大学 | 一种基于镜头边界检测和聚类的视频关键帧提取方法 |
CN105898313A (zh) * | 2014-12-15 | 2016-08-24 | 江南大学 | 一种新的基于视频大纲的监控视频可伸缩编码技术 |
CN106231301A (zh) * | 2016-07-22 | 2016-12-14 | 上海交通大学 | 基于编码单元层次和率失真代价的hevc复杂度控制方法 |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09327023A (ja) | 1996-06-06 | 1997-12-16 | Nippon Telegr & Teleph Corp <Ntt> | フレーム内/フレーム間符号化切替方法および画像符号化装置 |
US6025888A (en) * | 1997-11-03 | 2000-02-15 | Lucent Technologies Inc. | Method and apparatus for improved error recovery in video transmission over wireless channels |
JP4649318B2 (ja) * | 2004-12-13 | 2011-03-09 | キヤノン株式会社 | 画像符号化装置、画像符号化方法、プログラム及び記憶媒体 |
US8654848B2 (en) * | 2005-10-17 | 2014-02-18 | Qualcomm Incorporated | Method and apparatus for shot detection in video streaming |
JP4688170B2 (ja) * | 2007-01-26 | 2011-05-25 | 株式会社Kddi研究所 | 動画像符号化装置 |
US8938005B2 (en) * | 2007-11-05 | 2015-01-20 | Canon Kabushiki Kaisha | Image encoding apparatus, method of controlling the same, and computer program |
US9628811B2 (en) * | 2007-12-17 | 2017-04-18 | Qualcomm Incorporated | Adaptive group of pictures (AGOP) structure determination |
US8385404B2 (en) * | 2008-09-11 | 2013-02-26 | Google Inc. | System and method for video encoding using constructed reference frame |
CN101742293B (zh) * | 2008-11-14 | 2012-11-28 | 北京中星微电子有限公司 | 一种基于视频运动特征的图像自适应帧场编码方法 |
JP5215951B2 (ja) * | 2009-07-01 | 2013-06-19 | キヤノン株式会社 | 符号化装置及びその制御方法、コンピュータプログラム |
CN102572381A (zh) * | 2010-12-29 | 2012-07-11 | 中国移动通信集团公司 | 视频监控场景判别方法及其监控图像编码方法、及装置 |
WO2012096164A1 (ja) * | 2011-01-12 | 2012-07-19 | パナソニック株式会社 | 画像符号化方法、画像復号方法、画像符号化装置および画像復号装置 |
CN103796019B (zh) * | 2012-11-05 | 2017-03-29 | 北京勤能通达科技有限公司 | 一种均衡码率编码方法 |
KR20140110221A (ko) * | 2013-03-06 | 2014-09-17 | 삼성전자주식회사 | 비디오 인코더, 장면 전환 검출 방법 및 비디오 인코더의 제어 방법 |
JP6365253B2 (ja) | 2014-11-12 | 2018-08-01 | 富士通株式会社 | 映像データ処理装置、映像データ処理プログラムおよび映像データ処理方法 |
CN106254873B (zh) * | 2016-08-31 | 2020-04-03 | 广州市网星信息技术有限公司 | 一种视频编码方法及视频编码装置 |
-
2017
- 2017-06-15 CN CN201710461367.7A patent/CN109151469B/zh active Active
-
2018
- 2018-05-18 JP JP2019543009A patent/JP6925587B2/ja active Active
- 2018-05-18 KR KR1020197024476A patent/KR102225235B1/ko active IP Right Grant
- 2018-05-18 WO PCT/CN2018/087539 patent/WO2018228130A1/zh unknown
- 2018-05-18 EP EP18816657.3A patent/EP3641310A4/en active Pending
-
2019
- 2019-05-02 US US16/401,671 patent/US10893275B2/en active Active
-
2020
- 2020-11-20 US US17/100,207 patent/US11297328B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110131622A1 (en) * | 2006-02-27 | 2011-06-02 | Cisco Technology, Inc. | Method and apparatus for immediate display of multicast iptv over a bandwidth constrained network |
CN105898313A (zh) * | 2014-12-15 | 2016-08-24 | 江南大学 | 一种新的基于视频大纲的监控视频可伸缩编码技术 |
CN105761263A (zh) * | 2016-02-19 | 2016-07-13 | 浙江大学 | 一种基于镜头边界检测和聚类的视频关键帧提取方法 |
CN106231301A (zh) * | 2016-07-22 | 2016-12-14 | 上海交通大学 | 基于编码单元层次和率失真代价的hevc复杂度控制方法 |
Non-Patent Citations (1)
Title |
---|
See also references of EP3641310A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112272297A (zh) * | 2020-10-28 | 2021-01-26 | 上海科江电子信息技术有限公司 | 嵌入在解码器端的图像质量静止帧检测方法 |
CN112272297B (zh) * | 2020-10-28 | 2023-01-31 | 上海科江电子信息技术有限公司 | 嵌入在解码器端的图像质量静止帧检测方法 |
Also Published As
Publication number | Publication date |
---|---|
US20210076044A1 (en) | 2021-03-11 |
US11297328B2 (en) | 2022-04-05 |
US10893275B2 (en) | 2021-01-12 |
KR20190109476A (ko) | 2019-09-25 |
JP2020509668A (ja) | 2020-03-26 |
CN109151469A (zh) | 2019-01-04 |
US20190260998A1 (en) | 2019-08-22 |
CN109151469B (zh) | 2020-06-30 |
KR102225235B1 (ko) | 2021-03-09 |
EP3641310A1 (en) | 2020-04-22 |
JP6925587B2 (ja) | 2021-08-25 |
EP3641310A4 (en) | 2020-05-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7229261B2 (ja) | ビデオ符号化のビットレート制御方法、装置、機器、記憶媒体及びプログラム | |
WO2018228130A1 (zh) | 视频编码方法、装置、设备及存储介质 | |
US10986332B2 (en) | Prediction mode selection method, video encoding device, and storage medium | |
CN107454416B (zh) | 视频流发送方法和装置 | |
US11202066B2 (en) | Video data encoding and decoding method, device, and system, and storage medium | |
CN111010576B (zh) | 一种数据处理方法及相关设备 | |
CN113572836B (zh) | 一种数据传输方法、装置、服务器及存储介质 | |
CN108337533B (zh) | 视频压缩方法和装置 | |
WO2019169997A1 (zh) | 视频运动估计方法、装置、终端及存储介质 | |
US10284850B2 (en) | Method and system to control bit rate in video encoding | |
CN105992001A (zh) | 一种对图片进行量化处理的方法及装置 | |
US10827198B2 (en) | Motion estimation method, apparatus, and storage medium | |
CN111263216A (zh) | 一种视频传输方法、装置、存储介质及终端 | |
CN113630621A (zh) | 一种视频处理的方法、相关装置及存储介质 | |
CN109003313B (zh) | 一种传输网页图片的方法、装置和系统 | |
CN110213593B (zh) | 一种运动矢量的确定方法、编码压缩方法和相关装置 | |
CN116156232A (zh) | 一种视频内容的播放方法以及相关装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18816657 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2019543009 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20197024476 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2018816657 Country of ref document: EP Effective date: 20200115 |