WO2021197157A1 - Video stream processing method and apparatus, electronic device, and computer-readable medium - Google Patents
Video stream processing method and apparatus, electronic device, and computer-readable medium
- Publication number
- WO2021197157A1 (PCT/CN2021/082611)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- frame
- video frame
- processing
- buffer
- Prior art date
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/436—using parallelised computational arrangements
- H04N19/593—using predictive coding involving spatial prediction techniques
- H04N19/127—prioritisation of hardware or computational resources
- H04N19/152—data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
- H04N19/156—availability of hardware or computational resources, e.g. encoding based on power-saving criteria
- H04N19/182—adaptive coding where the coding unit is a pixel
- H04N19/85—using pre-processing or post-processing specially adapted for video compression
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/234—Processing of video elementary streams (server side)
- H04N21/23406—involving management of server-side video buffer
- H04N21/44—Processing of video elementary streams (client side)
- H04N21/44004—involving video buffer management, e.g. video decoder buffer or video display buffer
- H04N21/8193—Monomedia components involving executable data, e.g. dedicated tools such as video decoder software or IPMP tool
Definitions
- The embodiments of this application relate to the field of Internet technology, and in particular to a video stream processing method, apparatus, electronic device, and computer-readable medium.
- FFmpeg (Fast Forward MPEG): an open-source, free, cross-platform video and audio streaming solution.
- FFmpeg is an open-source computer program that can record and convert digital audio and video and turn them into streams.
- Released under the LGPL or GPL license, it provides a complete solution for recording, converting, and streaming audio and video.
- FFmpeg contains a very advanced audio/video codec library, libavcodec. Since FFmpeg supports many encoding and decoding formats, many developers implement functions such as video encoding and decoding, image scaling, and image synthesis on top of FFmpeg. FFmpeg is therefore widely used in all kinds of video playback software. For example, in video streaming applications, many services use FFmpeg as the basic framework for video encoding, decoding, and video processing. For another example, in a live broadcast scenario, after a video stream is read from a network capture device, it is decoded into original video frames using FFmpeg, processed frame by frame, and then re-encoded into a new video stream.
- The purpose of this application is to propose a video stream processing method, apparatus, electronic device, and computer-readable medium, so as to reduce the software complexity of video stream processing in the prior art and improve the processing speed of the video stream.
- A method for processing a video stream includes: storing the video frames in an original video stream in a first buffer by invoking a video stream processing interface of a video stream processing tool; processing the video frames in the first buffer through a video frame processing model to obtain processed video frames; and generating, based on the processed video frames, a standard video stream corresponding to the original video stream.
- A video stream processing apparatus includes: a first storage module for storing the video frames in an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module for processing the video frames in the first buffer through a video frame processing model to obtain processed video frames; and a generating module for generating a standard video stream corresponding to the original video stream based on the processed video frames.
- An electronic device includes: one or more processors; and a computer-readable medium configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the video stream processing method described in the first aspect of the foregoing embodiments.
- A computer-readable medium has a computer program stored thereon; when the program is executed by a processor, the video stream processing method described in the first aspect of the foregoing embodiments is implemented.
- In the embodiments of this application, the video frames in the original video stream are stored in the first buffer by calling the video stream processing interface of the video stream processing tool; the video frames in the first buffer are processed through the video frame processing model to obtain processed video frames; and based on the processed video frames, the standard video stream corresponding to the original video stream is generated.
- Processing the video stream through the video stream processing interface effectively reduces the software complexity of video stream processing, thereby increasing the processing speed of the video stream.
- In addition, the video frames in the original video stream are stored in the first buffer directly, without the need to save each video frame as a picture and read the picture back. This ensures the quality of the video frames while also improving the processing speed of the entire video stream.
- FIG. 1A is a schematic diagram of a processing procedure of a video stream provided by the prior art
- FIG. 1B is a flowchart of steps of a method for processing a video stream in Embodiment 1 of this application;
- FIG. 1C is a schematic diagram of a processing procedure of a video stream provided according to Embodiment 1 of the present application;
- FIG. 2A is a flowchart of the steps of a method for processing a video stream in Embodiment 2 of this application;
- FIG. 2B is a schematic diagram of a processing procedure of a video stream provided according to Embodiment 2 of the present application;
- FIG. 3 is a schematic structural diagram of a video stream processing device in Embodiment 3 of this application.
- FIG. 4 is a schematic structural diagram of a video stream processing device in Embodiment 4 of this application.
- FIG. 5 is a schematic structural diagram of a video stream processing device in Embodiment 5 of this application.
- FIG. 6 is a schematic structural diagram of the electronic device in Embodiment 6 of this application;
- FIG. 7 is a schematic diagram of the hardware structure of the electronic device in Embodiment 7 of this application.
- FIG. 1B shows a flowchart of the steps of a video stream processing method in Embodiment 1 of the present application.
- the video stream processing method provided in this embodiment includes the following steps:
- Step S101: by calling the video stream processing interface of the video stream processing tool, the video frames in the original video stream are stored in the first buffer.
- The video stream processing tool may be the FFmpeg tool (Fast Forward MPEG, an open-source and free cross-platform video and audio streaming solution).
- the video stream processing interface may be a video filter interface of the FFmpeg tool.
- Specifically, the software module that implements the video stream processing method provided by the embodiments of this application is embedded in the FFmpeg tool as a video filter of the FFmpeg tool; that is, the video filter interface of the FFmpeg tool is used to implement the video stream processing.
- the video filter can be understood as a filter that performs various transformations on a video frame, such as operations such as scaling, rotation, color transformation, and filtering.
- the original video stream may be an original video stream collected by a network collection device, for example, an original video stream collected by a camera of a mobile phone terminal, an original video stream collected by a camera of a tablet computer, or an original video stream collected by a surveillance camera.
- the first buffer may be a buffer queue. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
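As a rough illustration of the first buffer as a buffer queue, a bounded FIFO queue in Python can stand in for it; the 30-frame capacity and the callback name `on_decoded_frame` are illustrative assumptions, not FFmpeg APIs:

```python
from queue import Queue

# First buffer modelled as a bounded FIFO queue; 30 frames is an assumed capacity.
frame_buffer = Queue(maxsize=30)

def on_decoded_frame(frame):
    """Stand-in for the filter callback that receives each decoded frame."""
    frame_buffer.put(frame)  # blocks when the buffer is full (backpressure)

for i in range(3):
    on_decoded_frame({"index": i})
print(frame_buffer.qsize())  # 3
```

A bounded queue gives natural backpressure: when the model falls behind, `put` blocks and decoding pauses instead of memory growing without limit.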
- In an embodiment, before storing the video frames in the original video stream in the first buffer, the method further includes: preprocessing the video frames in the original video stream to obtain video frames suitable for processing by the video frame processing model. Accordingly, storing the video frames in the original video stream in the first buffer includes: storing the video frames suitable for processing by the video frame processing model in the first buffer. In this way, by preprocessing the video frames in the original video stream, video frames suitable for processing by the video frame processing model can be obtained. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- Specifically, the pixel values of the pixels in the video frames in the original video stream are scaled to obtain video frames suitable for processing by the video frame processing model.
- Generally, when the video frame processing model is trained, the pixel values of the pixels of the input video frame samples are normally distributed, so that the video frame processing model is easier to train and converge. After the training of the video frame processing model converges, it is likewise expected in the actual scene that the pixel values of the pixels of the input video frame to be processed are normally distributed. Therefore, the video frame can be preprocessed to obtain a video frame suitable for processing by the video frame processing model. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
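A minimal sketch of this kind of preprocessing, assuming the model expects 8-bit pixel values scaled into the range [-1, 1] (the exact target range is model-specific and not fixed by this application):

```python
def preprocess(frame):
    """Scale 8-bit pixel values (0..255) into [-1.0, 1.0] for model input."""
    return [[p / 127.5 - 1.0 for p in row] for row in frame]

print(preprocess([[0, 255]]))  # [[-1.0, 1.0]]
```

Real pipelines would do the same on a NumPy array or GPU tensor; the nested-list form only keeps the sketch dependency-free.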
- Step S102: the video frames in the first buffer are processed through a video frame processing model to obtain processed video frames.
- the video frame processing model can be understood as a neural network model for processing video frames, for example, an image enhancement model, an image super-resolution model, an image beauty model, and so on.
- Among them, the processing of the video frame by the video frame processing model can be understood as inference by the video frame processing model on the video frame. Specifically, after the video frame processing model is trained, it needs to be deployed in the video frame processing scene and used to make predictions on video frames in the actual scene; this process is the inference of the video frame processing model. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- In an embodiment, the method further includes: storing the processed video frame in a second buffer; fetching the processed video frame from the second buffer; and post-processing the processed video frame fetched from the second buffer to restore the data format of the processed video frame to the video image data format. Therefore, by post-processing the processed video frame taken from the second buffer, the data format of the processed video frame can be restored to the video image data format, which facilitates the subsequent encoding and compression of the processed video frame. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- In an embodiment, when the processed video frame fetched from the second buffer is post-processed, the pixel values of the pixels in the processed video frame fetched from the second buffer are scaled back into range to restore the data format of the processed video frame to the video image data format. Therefore, by scaling the pixel values of the pixels in the processed video frame fetched from the second buffer, the data format of the processed video frame can be restored to the video image data format, which facilitates the subsequent encoding and compression of the processed video frame.
- Specifically, after being processed by the video frame processing model, the pixel values of the pixels of the processed video frame may not be within the normal range, so the processed video frame needs to be post-processed to restore its pixel values to the normal range. Then, the data format of the video frame whose pixel values have been restored is converted back to the video image data format.
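The post-processing step can be sketched as mapping model outputs back to clamped 8-bit pixel values, assuming the model works in an illustrative [-1, 1] range:

```python
def postprocess(frame):
    """Map model outputs in [-1.0, 1.0] back to clamped 8-bit pixel values."""
    return [[max(0, min(255, round((v + 1.0) * 127.5))) for v in row]
            for row in frame]

print(postprocess([[-1.0, 0.0, 1.0]]))  # [[0, 128, 255]]
```

Clamping before rounding back to integers matters because model outputs can overshoot the valid range, which is exactly the situation the text describes.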
- Step S103: based on the processed video frames, a standard video stream corresponding to the original video stream is generated.
- Specifically, the processed video frame whose data format has been restored to the video image data format is encoded to obtain a standard video stream corresponding to the original video stream.
- the video image data format may be YUV data format or RGB data format. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- the input original video stream is decoded to obtain video frames in the original video stream.
- the video frame is preprocessed to obtain the preprocessed video frame.
- The preprocessed video frame is stored in the preprocessed video frame buffer queue. Then, preprocessed video frames are taken from the preprocessed video frame buffer queue and processed through the video frame processing model to obtain processed video frames, which are then stored in the processed video frame buffer queue.
- The processed video frame is taken out of the processed video frame buffer queue and post-processed to obtain the post-processed video frame. Finally, the post-processed video frame is encoded to obtain the standard video stream corresponding to the original video stream.
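The flow just described — decode, preprocess, first buffer queue, model, second buffer queue, post-process, encode — can be sketched with plain Python queues and stub stages. Every stage function here is a placeholder standing in for the corresponding FFmpeg or model step, not a real API:

```python
from queue import Queue

def decode(stream):                 # stand-in for FFmpeg decoding
    return list(stream)

def preprocess(frame):              # scale pixel value into the model range
    return frame / 255.0

def run_model(batch):               # placeholder "enhancement" model
    return [min(1.0, f * 1.1) for f in batch]

def postprocess(frame):             # restore the video image data range
    return round(frame * 255)

def encode(frames):                 # stand-in for FFmpeg re-encoding
    return tuple(frames)

def process_stream(stream, batch_size=2):
    pre_q, post_q = Queue(), Queue()        # first and second buffers
    for frame in decode(stream):
        pre_q.put(preprocess(frame))
    while not pre_q.empty():
        n = min(batch_size, pre_q.qsize())  # batch-process from the first buffer
        batch = [pre_q.get() for _ in range(n)]
        for out in run_model(batch):
            post_q.put(out)                 # processed frames go to the second buffer
    processed = []
    while not post_q.empty():
        processed.append(postprocess(post_q.get()))
    return encode(processed)

print(process_stream([0, 255]))  # (0, 255)
```

The two queues decouple decoding, inference, and encoding, which is what lets the model batch frames while the decoder keeps working.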
- In the embodiments of this application, the video frames in the original video stream are stored in the first buffer by calling the video stream processing interface of the video stream processing tool; the video frames in the first buffer are processed through the video frame processing model to obtain processed video frames; and based on the processed video frames, the standard video stream corresponding to the original video stream is generated.
- Processing the video stream through the video stream processing interface effectively reduces the software complexity of video stream processing, thereby increasing the processing speed of the video stream.
- In addition, the video frames in the original video stream are stored in the first buffer directly, without the need to save each video frame as a picture and read the picture back. This ensures the quality of the video frames while also improving the processing speed of the entire video stream.
- The video stream processing method of this embodiment can be executed by any suitable device with data processing capabilities, including but not limited to: cameras, terminals, mobile terminals, PCs, servers, in-vehicle equipment, entertainment equipment, advertising equipment, personal digital assistants (PDAs), tablet computers, notebook computers, handheld game consoles, glasses, watches, wearable devices, virtual display devices or display enhancement devices, etc.
- FIG. 2A shows a flowchart of the steps of a method for processing a video stream in Embodiment 2 of the present application.
- the video stream processing method provided in this embodiment includes the following steps:
- Step S201: the video frames in the original video stream are stored in the first buffer by calling the video stream processing interface of the video stream processing tool.
- The video stream processing tool may be the FFmpeg tool (Fast Forward MPEG, an open-source and free cross-platform video and audio streaming solution).
- the video stream processing interface may be a video filter interface of the FFmpeg tool.
- Specifically, the software module that implements the video stream processing method provided by the embodiments of this application is embedded in the FFmpeg tool as a video filter of the FFmpeg tool; that is, the video filter interface of the FFmpeg tool is used to implement the video stream processing.
- the video filter can be understood as a filter that performs various transformations on a video frame, such as operations such as scaling, rotation, color transformation, and filtering.
- the original video stream may be an original video stream collected by a network collection device, for example, an original video stream collected by a camera of a mobile phone terminal, an original video stream collected by a camera of a tablet computer, or an original video stream collected by a surveillance camera.
- the first buffer may be a buffer queue. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- In an embodiment, before storing the video frames in the original video stream in the first buffer, the method further includes: preprocessing the video frames in the original video stream to obtain video frames suitable for processing by the video frame processing model. Accordingly, storing the video frames in the original video stream in the first buffer includes: storing the video frames suitable for processing by the video frame processing model in the first buffer. In this way, by preprocessing the video frames in the original video stream, video frames suitable for processing by the video frame processing model can be obtained. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- Specifically, the pixel values of the pixels in the video frames in the original video stream are scaled to obtain video frames suitable for processing by the video frame processing model.
- Generally, when the video frame processing model is trained, the pixel values of the pixels of the input video frame samples are normally distributed, so that the video frame processing model is easier to train and converge. After the training of the video frame processing model converges, it is likewise expected in the actual scene that the pixel values of the pixels of the input video frame to be processed are normally distributed. Therefore, the video frame can be preprocessed to obtain a video frame suitable for processing by the video frame processing model. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- Step S202: a batch size, with which the video frame processing model batch-processes the video frames in the first buffer, is determined based on the decoding duration of the next video frame of the current video frame.
- the batch processing can be understood as batch processing of the video frames in the first buffer.
- the video frame processing model processes the video frames in the first buffer in batches, which has higher processing efficiency and lower average processing time.
- the video frame processing model can be understood as a neural network model for processing video frames, for example, an image enhancement model, an image super-resolution model, an image beauty model, and so on. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
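One plausible way to realize step S202 — accumulate only as many buffered frames as can be batch-processed while the next frame is still being decoded — can be sketched as follows. The per-frame inference cost and the cap of 8 frames are illustrative assumptions, not values given by the application:

```python
def choose_batch_size(buffered, next_decode_ms, per_frame_ms, max_batch=8):
    """Pick a batch size that fits within the time the decoder will be busy.

    buffered:       frames currently waiting in the first buffer
    next_decode_ms: predicted decoding duration of the next video frame
    per_frame_ms:   assumed model inference cost per frame in a batch
    """
    affordable = max(1, int(next_decode_ms // per_frame_ms))
    return min(buffered, affordable, max_batch)

print(choose_batch_size(10, 40, 10))  # 4
```

The idea is that inference on the batch hides behind the decoder's latency: a slow-to-decode next frame leaves room for a larger batch, while a fast one forces a small batch so frames are not delayed.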
- In an embodiment, before determining the batch size for batch processing of the video frames in the first buffer by the video frame processing model, the method further includes: obtaining the information of the image group when decoding the images in the original video stream, where the decoding duration of the next video frame is determined according to the information of the image group.
- Among them, the image group (group of pictures, GOP) can be understood as a group of continuous pictures consisting of one I frame and multiple B frames/P frames, that is, a sequence of video frames in an encoded video stream. In this way, the decoding duration of the next video frame can be accurately determined from the information of the image group. It can be understood that the above description is only exemplary, and the embodiments of the present application do not make any limitation on this.
- the information of the image group includes the frame type of the next video frame of the video frame.
- When the decoding duration of the next video frame is determined according to the information of the image group: when the frame type of the next video frame is an intra-coded frame (I frame), the decoding duration of the next video frame is determined to be the decoding duration of an intra-coded frame; when the frame type of the next video frame is a forward predictive coded frame (P frame), the decoding duration of the next video frame is determined to be the decoding duration of a forward predictive coded frame; and when the frame type of the next video frame is a bidirectional predictive interpolation coded frame (B frame), the decoding duration of the next video frame is determined according to the frame type of the video frame after the next video frame.
- The decoding duration of the intra-frame coded frame and the decoding duration of the forward predictive coded frame are measured in advance and configured. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- When the decoding duration of the next video frame is determined according to the frame type of the frame that follows it: if the frame type of the frame following the next video frame is an intra-coded frame, the decoding duration of the next video frame is the sum of the decoding duration of the intra-coded frame and the decoding duration of the bidirectional predictive interpolation coded frame; if the frame type of the frame following the next video frame is a forward predictive coded frame, the decoding duration of the next video frame is the sum of the decoding duration of the forward predictive coded frame and the decoding duration of the bidirectional predictive interpolation coded frame.
- the decoding duration of the bidirectional predictive interpolation coded frame is calculated in advance and configured. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- the video stream has its own unique characteristics. Due to the huge amount of data, video streams are generally encoded and compressed to reduce transmission and storage pressure.
- the video image in the video stream is generally divided into image groups. In the image group, the delay caused by the decoding of different video frames is different.
- Video images are encoded as I frames, B frames, and P frames, and a B frame, due to its bidirectional prediction characteristics, cannot be decoded until the next key frame has been decoded. Therefore, it is necessary to take the encoding characteristics of the video stream into account and determine the decoding duration of a video frame in the image group according to the type of the video frame in the image group.
- the I frame is an intra-frame coded frame.
- the I frame is a key frame, and there is no need to rely on other frames for processing during encoding and decoding.
- The P frame, that is, the forward predictive coded frame, records the difference information from the previous I frame or P frame; its encoding and decoding depend on the previous I frame or P frame.
- The B frame, that is, the bidirectional predictive interpolation coded frame, contains the difference information between the current frame and the previous and next frames. Its encoding and decoding need to rely on the previous key frame (I frame or P frame) and the next key frame (I frame or P frame). It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
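As an illustration of the frame-type rules above, the decoding duration of the next video frame can be looked up from pre-measured per-type durations. This is a minimal sketch, not the patented implementation: the duration values and names are hypothetical, and durations are in milliseconds.

```python
# Hypothetical pre-measured decoding durations, in milliseconds.
T_I = 8  # intra-frame coded frame (I frame)
T_P = 4  # forward predictive coded frame (P frame)
T_B = 2  # bidirectional predictive interpolation coded frame (B frame)

def decode_duration(next_type, type_after_next=None):
    """Decoding duration of the next video frame, given its frame type.

    A B frame depends on the key frame that follows it, so its decoding
    duration includes that key frame's decoding duration as well.
    """
    if next_type == "I":
        return T_I
    if next_type == "P":
        return T_P
    if next_type == "B":
        if type_after_next == "I":
            return T_I + T_B
        if type_after_next == "P":
            return T_P + T_B
        raise ValueError("a B frame must be followed by a key frame (I or P)")
    raise ValueError(f"unknown frame type: {next_type!r}")
```

For example, `decode_duration("B", "P")` returns the sum of the P-frame and B-frame durations, matching the rule that a B frame must wait for the following key frame to be decoded.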
- When the batch size used by the video frame processing model for batch processing of the video frames in the first buffer is determined based on the decoding duration of the next video frame: based on the decoding duration of the next video frame, the preprocessing duration of the next video frame, and the first timestamp at which the video frame was stored in the first buffer, the second timestamp at which the next video frame is stored in the first buffer is determined; the number of video frames in the first buffer when the next video frame is stored in the first buffer is determined, as well as the duration for the video frame processing model to batch process the video frames in the first buffer with that number as the batch size; based on the second timestamp and the duration, the third timestamp at which the video frame processing model completes batch processing of the video frames in the first buffer with that number as the batch size is determined; and if the difference between the third timestamp and the minimum timestamp at which preprocessing of the video frames in the first buffer is completed when the next video frame is stored in the first buffer is greater than or equal to the preset maximum processing duration of the video frame processing model, it is determined that the video frame processing model batch processes the video frames in the first buffer with that number as the batch size.
- The preprocessing duration and the batch processing durations of the video frame processing model under different batch sizes are measured in advance and configured. With this, the batch size used by the video frame processing model for batch processing of the video frames in the first buffer can be adaptively determined based on the decoding duration of the next video frame. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- In some optional embodiments, the method further includes: if it is determined that the difference is less than the maximum processing duration, waiting for the next video frame of the next video frame to be stored in the first buffer, until the waiting duration is equal to the difference between the maximum processing duration and the difference.
- In step S203, the video frames in the first buffer are batch processed according to the determined batch size through the video frame processing model, to obtain batch-processed video frames.
- After the batch-processed video frames are obtained, the method further includes: storing the batch-processed video frames in a second buffer; fetching the batch-processed video frames from the second buffer; and post-processing the batch-processed video frames fetched from the second buffer, so as to restore the data format of the batch-processed video frames to the video image data format. In this way, by post-processing the batch-processed video frames fetched from the second buffer, their data format can be restored to the video image data format, which facilitates subsequent encoding and compression of the batch-processed video frames. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- When the batch-processed video frames fetched from the second buffer are post-processed, value-range scaling is performed on the pixel values of the pixels in the frames, so as to restore the data format of the batch-processed video frames to the video image data format. In this way, the data format of the batch-processed video frames can be restored to the video image data format, which facilitates subsequent encoding and compression of the batch-processed video frames. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- Specifically, the pixel values of the pixels of the batch-processed video frames are not within the normal range, so the batch-processed video frames need to be post-processed to restore the pixel values of their pixels to the normal range. Then, the data format of the video frames whose pixel values have been restored is restored to the video image data format. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
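The post-processing step described above can be sketched as follows. The model's output range is assumed here to be [0, 1]; the actual range depends on how the preprocessing scaled the input, so this is illustrative rather than the patented implementation.

```python
import numpy as np

def postprocess(frame_float):
    """Restore model output to 8-bit video image data.

    Scales pixel values from the assumed [0, 1] model range back to
    [0, 255] and converts to uint8, clipping out-of-range values first.
    """
    restored = np.clip(frame_float * 255.0, 0, 255)
    return restored.round().astype(np.uint8)
```

After this step the frames are in an ordinary 8-bit image format and can be handed back to the encoder.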
- In step S204, a standard video stream corresponding to the original video stream is generated based on the batch-processed video frames.
- Specifically, the batch-processed video frames whose data format has been restored to the video image data format are encoded to obtain a standard video stream corresponding to the original video stream.
- the video image data format may be YUV data format or RGB data format. It can be understood that the above description is only exemplary, and the embodiment of the present application does not make any limitation on this.
- the processing flow of the video stream is as follows: First, the input original video stream is read.
- If the next video frame is a B frame, t_decode is the sum of the decoding duration of the next key frame of the B frame and the decoding duration of the B frame, namely t_decode = t_I + t_B or t_decode = t_P + t_B, depending on whether that key frame is an I frame or a P frame; if the next video frame is a P frame, t_decode = t_P; if the next video frame is an I frame, t_decode = t_I.
- the video frame is preprocessed first. This part includes traditional image processing methods, normalization, and range scaling.
- The preprocessing completion timestamp t_ready of the video frame is recorded. After preprocessing is completed, wait until the buffer queue Q is not full.
- When the buffer queue Q is not full, the processing thread stores the video frame, the decoding duration t_decode of the next video frame, and the preprocessing completion timestamp t_ready of the video frame into the buffer queue, and records the current timestamp t_now at the time of storing. Under the current timestamp t_now, in order to satisfy the limit t_max, a suitable batch size needs to be adaptively selected for model inference. Specifically, for the buffer queue Q, it can be inferred that after the next video frame is stored in the buffer queue Q, the model processing completion timestamp t_finish, with len(Q), the current length of the buffer queue Q, as the batch size, can be derived from t_now, t_decode, the preprocessing duration, and the pre-measured batch processing duration.
- The restriction condition t_finish − min_{i∈Q}(t_ready^i) ≤ t_max needs to be satisfied, where min_{i∈Q}(t_ready^i) is the minimum preprocessing completion timestamp among the frames in Q.
- If t_finish − min_{i∈Q}(t_ready^i) is still less than t_max, the model processing part is suspended, waiting for the frame after the next video frame to be stored in the buffer queue Q, until the waiting time reaches t_max − (t_finish − min_{i∈Q}(t_ready^i)). Otherwise, the model batch processing operation is performed, and the video frames after model batch processing are stored in another buffer queue.
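The queueing decision above can be sketched as follows, with all timestamps and durations in milliseconds. The batch-duration table and the limit are assumed to be measured and configured in advance; all names and values here are illustrative, not from the patent.

```python
# Hypothetical pre-measured batch processing durations, keyed by batch size.
BATCH_MS = {1: 10, 2: 14, 3: 17, 4: 19}
T_MAX = 40  # configured maximum processing duration per frame

def should_batch_now(queue, t_now, t_decode, t_pre):
    """Decide whether to run model inference once the next frame arrives.

    queue: t_ready timestamps of the preprocessed frames already buffered.
    Returns (run_batch, wait_ms): run with len(queue) + 1 as the batch
    size, or keep waiting for up to wait_ms for further frames.
    """
    # Timestamp at which the next frame will have been decoded,
    # preprocessed, and stored in the buffer queue.
    t_stored = t_now + t_decode + t_pre
    batch_size = len(queue) + 1
    # Timestamp at which a batch of this size would finish processing.
    t_finish = t_stored + BATCH_MS[batch_size]
    oldest_ready = min(queue + [t_stored])
    if t_finish - oldest_ready >= T_MAX:
        return True, 0
    # Still under the limit: wait for more frames, at most until the
    # oldest buffered frame would hit T_MAX.
    return False, T_MAX - (t_finish - oldest_ready)
```

The batch grows while the oldest buffered frame can still be processed within T_MAX, which trades a little latency for higher inference throughput.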
- The video stream processing flow provided by this embodiment is not a video stream processing flow built by loosely combining a machine learning framework with a codec. Its inference performance is higher than that of TensorFlow and similar frameworks, it is not strongly coupled to any framework, and it is suitable for video stream processing in different scenarios.
- With a software module designed using the TensorRT framework as the processing framework, combined with CUDA acceleration of the preprocessing part, the overall speedup is nearly 7 times that of the original scheme. After combining this with adaptive batch processing, there may still be a further performance improvement of 10% to 20%.
- In this way, based on the decoding duration of the next video frame, the batch size used by the video frame processing model for batch processing of the video frames in the first buffer can be adaptively determined, and the video frames in the first buffer are batch processed through the video frame processing model according to the determined batch size to obtain batch-processed video frames. This effectively guarantees the real-time performance of video stream processing while further improving the speed of the entire video stream processing.
- The video stream processing method of this embodiment can be executed by any suitable device with data processing capabilities, including but not limited to: cameras, terminals, mobile terminals, PCs, servers, in-vehicle equipment, entertainment equipment, advertising equipment, personal digital assistants (PDAs), tablet computers, notebook computers, handheld game consoles, glasses, watches, wearable devices, virtual display devices, or display enhancement devices.
- Referring to FIG. 3, a schematic structural diagram of a video stream processing apparatus in Embodiment 3 of the present application is shown.
- The video stream processing apparatus includes: a first storing module 301, configured to store the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool; a processing module 302, configured to process the video frames in the first buffer through the video frame processing model to obtain processed video frames; and a generating module 303, configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
- the video stream processing apparatus of this embodiment is used to implement the corresponding video stream processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
- Referring to FIG. 4, a schematic structural diagram of a video stream processing apparatus in Embodiment 4 of the present application is shown.
- The video stream processing apparatus includes: a first storing module 401, configured to store the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool; a processing module 404, configured to process the video frames in the first buffer through the video frame processing model to obtain processed video frames; and a generating module 405, configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
- In some optional embodiments, the apparatus further includes: a first determining module 403, configured to determine, based on the decoding duration of the next video frame of the video frame, the batch size used by the video frame processing model for batch processing of the video frames in the first buffer. The processing module 404 is specifically configured to batch process the video frames in the first buffer through the video frame processing model according to the determined batch size, to obtain batch-processed video frames.
- In some optional embodiments, the apparatus further includes: a second determining module 402, configured to, when a video frame in an image group in the original video stream is decoded, determine the decoding duration of the next video frame according to the information of the image group.
- In some optional embodiments, the information of the image group includes the frame type of the next video frame of the video frame, and the second determining module 402 includes: a first determining sub-module 4021, configured to determine, when the frame type of the next video frame is an intra-coded frame, that the decoding duration of the next video frame is the decoding duration of the intra-coded frame; a second determining sub-module 4022, configured to determine, when the frame type of the next video frame is a forward predictive coded frame, that the decoding duration of the next video frame is the decoding duration of the forward predictive coded frame; and a third determining sub-module 4023, configured to determine, when the frame type of the next video frame is a bidirectional predictive interpolation coded frame, the decoding duration of the next video frame according to the frame type of the frame that follows the next video frame.
- The third determining sub-module 4023 is specifically configured to: when the frame type of the frame following the next video frame is an intra-coded frame, determine that the decoding duration of the next video frame is the sum of the decoding duration of the intra-coded frame and the decoding duration of the bidirectional predictive interpolation coded frame; and when the frame type of the frame following the next video frame is a forward predictive coded frame, determine that the decoding duration of the next video frame is the sum of the decoding duration of the forward predictive coded frame and the decoding duration of the bidirectional predictive interpolation coded frame.
- In some optional embodiments, the first determining module 403 includes: a fourth determining sub-module 4031, configured to determine, based on the decoding duration of the next video frame, the preprocessing duration of the next video frame, and the first timestamp at which the video frame was stored in the first buffer, the second timestamp at which the next video frame is stored in the first buffer; a fifth determining sub-module 4032, configured to determine the number of video frames in the first buffer when the next video frame is stored in the first buffer, as well as the duration for the video frame processing model to batch process the video frames in the first buffer with that number as the batch size; a sixth determining sub-module 4033, configured to determine, based on the second timestamp and the duration, the third timestamp at which the video frame processing model completes batch processing of the video frames in the first buffer with that number as the batch size; and a seventh determining sub-module 4034, configured to determine, if the difference between the third timestamp and the minimum timestamp at which preprocessing of the video frames in the first buffer is completed when the next video frame is stored in the first buffer is greater than or equal to the preset maximum processing duration of the video frame processing model, that the video frame processing model batch processes the video frames in the first buffer with that number as the batch size.
- In some optional embodiments, the first determining module 403 further includes: a waiting sub-module 4035, configured to, if it is determined that the difference is less than the maximum processing duration, wait for the next video frame of the next video frame to be stored in the first buffer, until the waiting duration is equal to the difference between the maximum processing duration and the difference.
- the video stream processing apparatus of this embodiment is used to implement the corresponding video stream processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
- FIG. 5 there is shown a schematic structural diagram of a video stream processing apparatus in Embodiment 5 of the present application.
- The video stream processing apparatus includes: a first storing module 502, configured to store the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool; a processing module 503, configured to process the video frames in the first buffer through the video frame processing model to obtain processed video frames; and a generating module 507, configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
- Before the first storing module 502, the apparatus further includes: a preprocessing module 501, configured to preprocess the video frames in the original video stream to obtain video frames suitable for processing by the video frame processing model. The first storing module 502 is specifically configured to store the video frames suitable for processing by the video frame processing model in the first buffer.
- In some optional embodiments, the preprocessing module 501 is specifically configured to: perform value-range scaling on the pixel values of the pixels in the video frames in the original video stream, to obtain video frames suitable for processing by the video frame processing model.
- In some optional embodiments, the apparatus further includes: a second storing module 504, configured to store the batch-processed video frames in a second buffer; a fetching module 505, configured to fetch the batch-processed video frames from the second buffer; and a post-processing module 506, configured to post-process the batch-processed video frames fetched from the second buffer, so as to restore the data format of the batch-processed video frames to the video image data format.
- In some optional embodiments, the post-processing module 506 is specifically configured to: perform value-range scaling on the pixel values of the pixels in the batch-processed video frames fetched from the second buffer, so as to restore the data format of the batch-processed video frames to the video image data format.
- In some optional embodiments, the generating module 507 is specifically configured to: encode the batch-processed video frames whose data format has been restored to the video image data format, to obtain a standard video stream corresponding to the original video stream.
- the video stream processing apparatus of this embodiment is used to implement the corresponding video stream processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
- FIG. 6 is a schematic structural diagram of the electronic device in the sixth embodiment of the application. The electronic device may include: one or more processors; and a computer-readable medium 602, which may be configured to store one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the video stream processing method described in the first embodiment or the second embodiment.
- FIG. 7 is the hardware structure of the electronic device in the seventh embodiment of the application; as shown in FIG. 7, the hardware structure of the electronic device may include: a processor 701, a communication interface 702, a computer-readable medium 703, and a communication bus 704;
- the processor 701, the communication interface 702, and the computer-readable medium 703 communicate with each other through the communication bus 704;
- the communication interface 702 may be an interface of a communication module, such as an interface of a GSM module;
- The processor 701 may be specifically configured to: store the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool; process the video frames in the first buffer through the video frame processing model to obtain processed video frames; and generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
- The processor 701 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the computer-readable medium 703 may be, but is not limited to, a random access storage medium (Random Access Memory, RAM), a read-only storage medium (Read Only Memory, ROM), a programmable read-only storage medium (Programmable Read-Only Memory, PROM), Erasable Programmable Read-Only Memory (EPROM), Electrical Erasable Programmable Read-Only Memory (EEPROM), etc.
- an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes program code configured to execute the method shown in the flowchart.
- the computer program may be downloaded and installed from the network through the communication part, and/or installed from a removable medium.
- When executed by the central processing unit (CPU), the computer program performs the above-mentioned functions defined in the method of the present application.
- the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
- The computer-readable medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access storage medium (RAM), a read-only storage medium (ROM), an erasable programmable read-only storage medium (EPROM or flash memory), an optical fiber, a portable compact disc read-only storage medium (CD-ROM), an optical storage medium, a magnetic storage medium, or any suitable combination of the above.
- the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium. The computer-readable medium may send, propagate, or transmit a program configured to be used by or in combination with the instruction execution system, apparatus, or device.
- the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
- The computer program code configured to perform the operations of the present application can be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
- The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
- Each block in the flowchart or block diagram may represent a module, program segment, or part of the code, and the module, program segment, or part of the code contains one or more executable instructions configured to implement the specified logical functions. There are specific sequence relationships in the above specific embodiments, but these sequence relationships are only exemplary; in specific implementation, there may be fewer or more of these steps, or their order of execution may be adjusted. That is, in some alternative implementations, the functions marked in the blocks may also occur in an order different from the order marked in the drawings.
- each block in the block diagram and/or flowchart, and the combination of the blocks in the block diagram and/or flowchart can be implemented by a dedicated hardware-based system that performs the specified functions or operations Or it can be realized by a combination of dedicated hardware and computer instructions.
- the modules involved in the embodiments described in this application can be implemented in software or hardware.
- the described module may also be provided in the processor, for example, it may be described as: a processor includes a first storing module, a processing module, and a generating module. Among them, the names of these modules do not constitute a limitation on the module itself under certain circumstances.
- For example, the first storing module can also be described as "a module that stores the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool".
- the present application also provides a computer-readable medium on which a computer program is stored.
- When the program is executed by a processor, the video stream processing method described in the first or second embodiment is implemented.
- the present application also provides a computer-readable medium, which may be included in the device described in the above embodiment; or it may exist alone without being assembled into the device.
- The above-mentioned computer-readable medium carries one or more programs. When the above one or more programs are executed by the device, the device is caused to: store the video frames in the original video stream into the first buffer by calling the video stream processing interface of the video stream processing tool; process the video frames in the first buffer through the video frame processing model to obtain processed video frames; and generate, based on the processed video frames, the standard video stream corresponding to the original video stream.
- The terms "first", "second", "the first", or "the second" used in various embodiments of the present disclosure may modify various components regardless of order and/or importance, but these expressions do not limit the corresponding components.
- the above expressions are only configured for the purpose of distinguishing elements from other elements.
- the first user equipment and the second user equipment represent different user equipment, although both are user equipment.
- the first element may be referred to as the second element, and similarly, the second element may be referred to as the first element.
- When it is described that one element (for example, a first element) is connected to another element (for example, a second element), the one element is directly connected to the other element, or the one element is indirectly connected to the other element via a further element (for example, a third element).
Abstract
An embodiment of the present application provides a video stream processing method and apparatus, an electronic device, and a computer-readable medium, relating to the field of Internet technology. The method includes: storing video frames in an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; processing the video frames in the first buffer through a video frame processing model to obtain processed video frames; and generating, based on the processed video frames, a standard video stream corresponding to the original video stream. The embodiments of the present application can not only effectively reduce the software complexity of video stream processing, but also effectively improve the processing speed of the video stream.
Description
This application claims priority to Chinese patent application No. 202010244868.1, filed on March 31, 2020 and entitled "Video Stream Processing Method, Apparatus, Electronic Device, and Computer-Readable Medium", the entire contents of which are incorporated herein by reference.
The embodiments of the present application relate to the field of Internet technology, and in particular to a video stream processing method and apparatus, an electronic device, and a computer-readable medium.
With the development of computer technology and Internet technology, more and more users need, in more and more situations, to process audio and video that they have recorded personally or for work, such as pitch shifting, adding background music, audio conversion, and editing and playing back video material. There are currently many software tools and technologies on the market for processing audio and video, such as FFmpeg (Fast Forward MPEG, an open-source, free, cross-platform video and audio streaming solution). FFmpeg is an open-source computer program that can record and convert digital audio and video, and turn them into streams. It is licensed under the LGPL or GPL, provides a complete solution for recording, converting, and streaming audio and video, and includes the very advanced audio/video codec library libavcodec. Since FFmpeg supports many codec formats, many developers implement various functions such as video encoding/decoding, image scaling, and image composition based on FFmpeg. FFmpeg is therefore widely used in all kinds of video playback software. For example, in video streaming applications, many services use FFmpeg as the basic framework for video encoding, decoding, and video processing. As another example, in live-streaming scenarios, after a video stream is read from a network capture device, FFmpeg is used to decode it into raw video frames, which are processed frame by frame and then re-encoded into a new video stream.
With the development of deep learning, video processing increasingly relies on different deep learning models. For example, in scenarios such as narrow-band high-definition transmission, operations such as target recognition and image enhancement need to be performed on different parts of a video frame image. However, the traditional image transformations supported by FFmpeg can hardly support processing video with deep learning models. Therefore, a common approach in the industry is to save the decoded raw video frames frame by frame as images, process the images with a deep learning model, and then re-encode the processed images into a video. Specifically, as shown in FIG. 1A, FFmpeg is used to decode the input video stream to obtain raw video frames, which are saved frame by frame as images on disk; the images are then read from disk and processed with a deep learning model to obtain processed images, which are finally re-encoded into a video using FFmpeg. However, this approach scatters the entire processing flow across different software modules, which increases the software complexity of video stream processing. Specifically, decoding the video stream into raw video frames and encoding the inferred images into a video are done by FFmpeg, while model inference is done by the TensorRT framework or the TensorFlow framework. In addition, saving the raw video frames frame by frame as images on disk and reading the images back from disk reduces the processing speed of the video stream.
It can thus be seen that how to reduce the software complexity of video stream processing and improve the processing speed of the video stream has become a technical problem to be solved urgently.
Summary of the Invention
The purpose of the present application is to provide a video stream processing method and apparatus, an electronic device, and a computer-readable medium, to solve the technical problem existing in the prior art of how to reduce the software complexity of video stream processing and improve the processing speed of the video stream.
According to a first aspect of the embodiments of the present application, a video stream processing method is provided. The method includes: storing video frames in an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; processing the video frames in the first buffer through a video frame processing model to obtain processed video frames; and generating, based on the processed video frames, a standard video stream corresponding to the original video stream.
According to a second aspect of the embodiments of the present application, a video stream processing apparatus is provided. The apparatus includes: a first storing module, configured to store video frames in an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module, configured to process the video frames in the first buffer through a video frame processing model to obtain processed video frames; and a generating module, configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
根据本申请实施例的第三方面,提供了一种电子设备,包括:一个或多个处理器;计算机可读介质,配置为存储一个或多个程序,当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如上述实施例的第一方面所述的视频流的处理方法。
根据本申请实施例的第四方面,提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时实现如上述实施例的第一方面所述的视频流的处理方法。
根据本申请实施例提供的视频流的处理方案,通过调用视频流处理工具的视频流处理接口,将原始视频流中的视频帧存入第一缓冲区中;通过视频帧处理模型,对第一缓冲区中的视频帧进行处理,以获得处理后的视频帧;基于处理后的视频帧,生成原始视频流对应的标准视频流,与现有的其它方式相比,通过调用视频流处理工具的视频流处 理接口,对视频流进行处理,有效降低了视频流处理的软件复杂度,进而提高了视频流的处理速度。此外,将原始视频流中的视频帧存入第一缓冲区中,而无需额外将视频帧保存为图片并读取图片,在保证视频帧的质量的同时,还提高了整个视频流处理的速度。
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1A is a schematic diagram of a video stream processing procedure in the prior art;
Fig. 1B is a flowchart of the steps of the video stream processing method in Embodiment 1 of the present application;
Fig. 1C is a schematic diagram of the video stream processing procedure provided according to Embodiment 1 of the present application;
Fig. 2A is a flowchart of the steps of the video stream processing method in Embodiment 2 of the present application;
Fig. 2B is a schematic diagram of the video stream processing procedure provided according to Embodiment 2 of the present application;
Fig. 3 is a schematic structural diagram of the video stream processing apparatus in Embodiment 3 of the present application;
Fig. 4 is a schematic structural diagram of the video stream processing apparatus in Embodiment 4 of the present application;
Fig. 5 is a schematic structural diagram of the video stream processing apparatus in Embodiment 5 of the present application;
Fig. 6 is a schematic structural diagram of the electronic device in Embodiment 6 of the present application;
Fig. 7 shows the hardware structure of the electronic device in Embodiment 7 of the present application.
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of this application and the features within them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Referring to Fig. 1B, a flowchart of the steps of the video stream processing method of Embodiment 1 of the present application is shown.
Specifically, the video stream processing method provided by this embodiment includes the following steps:
In step S101, video frames of the original video stream are stored into a first buffer by calling a video stream processing interface of a video stream processing tool.
In this embodiment of the application, the video stream processing tool may be the FFmpeg tool (Fast Forward MPEG, a free, open-source, cross-platform video and audio streaming solution), and the video stream processing interface may be FFmpeg's video filter interface. Specifically, the software module implementing the video stream processing method provided by this embodiment is embedded into FFmpeg as one of its video filters; that is, the module is implemented through FFmpeg's video filter interface. A video filter can be understood as a filter that applies various transformations to video frames, such as scaling, rotation, color transformation, and filtering. The original video stream may be one captured by a network capture device, for example by the camera of a mobile phone or tablet, or by a surveillance camera. The first buffer may be a buffer queue. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In some optional embodiments, before the video frames of the original video stream are stored into the first buffer, the method further includes: preprocessing the video frames of the original video stream to obtain video frames suitable for processing by the video frame processing model; storing the video frames into the first buffer then consists of storing these suitable frames into the first buffer. Preprocessing the frames of the original stream thus yields frames suited to the video frame processing model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, preprocessing the video frames of the original video stream consists of scaling the value range of the pixel values of the pixels in those frames, so as to obtain frames suitable for processing by the video frame processing model. Value-range scaling of the pixel values thus yields frames suited to the model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, when the video frame processing model is trained, the pixel values of the input frame samples are normally distributed, which makes the model easier to train to convergence. After the model has converged, it is likewise desirable in real scenarios that the pixel values of the input frames to be processed be normally distributed. The video frames can therefore be preprocessed to obtain frames suitable for the model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
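As a concrete illustration of this value-range scaling, the sketch below maps 8-bit pixel values into a zero-centred range suitable as model input. The function name and the constant 127.5 are illustrative assumptions, not values mandated by the patent:

```python
def preprocess_frame(pixels, mean=127.5, scale=127.5):
    """Map 8-bit pixel values in [0, 255] to zero-centred floats
    in [-1.0, 1.0], the kind of input range a frame processing
    model is typically trained on. Constants are illustrative."""
    return [(p - mean) / scale for p in pixels]
```

With the defaults, 0 maps to -1.0 and 255 maps to 1.0, so a whole frame lands symmetrically around zero.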
In step S102, the video frames in the first buffer are processed by the video frame processing model to obtain processed video frames.
In this embodiment of the application, the video frame processing model can be understood as a neural network model for processing video frames, for example an image enhancement model, an image super-resolution model, or an image beautification model. The model's processing of a video frame can be understood as the model's inference on that frame. Specifically, once the video frame processing model has been trained, it needs to be deployed in a frame-processing scenario; making predictions on video frames from the actual scenario is the model's inference. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In some optional embodiments, after the video frames in the first buffer are processed by the video frame processing model, the method further includes: storing the processed video frames into a second buffer; taking the processed video frames out of the second buffer; and postprocessing the processed video frames taken out of the second buffer, so as to restore their data format to a video image data format. Postprocessing the processed frames taken from the second buffer restores their data format to a video image data format, which facilitates the subsequent encoding and compression of the processed frames. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, postprocessing the processed video frames taken out of the second buffer consists of scaling the value range of the pixel values of the pixels in those frames, so as to restore their data format to a video image data format and thereby facilitate subsequent encoding and compression. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, the pixel values of the processed video frames are not within the normal range, so the processed frames need postprocessing to restore their pixel values to the normal range; the data format of the value-restored frames is then converted back to a video image data format. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
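The inverse step can be sketched as follows. The constants assume pixel values were scaled into [-1, 1] during preprocessing, and the clamp to [0, 255] is the "restore to the normal range" operation; all names and constants are illustrative assumptions:

```python
def postprocess_frame(values, mean=127.5, scale=127.5):
    """Undo the training-time value scaling and clamp the result
    back to valid 8-bit pixel values so the frame can be handed
    to the encoder. Constants are illustrative."""
    out = []
    for v in values:
        p = int(round(v * scale + mean))   # invert the scaling
        out.append(max(0, min(255, p)))    # clamp to [0, 255]
    return out
```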
In step S103, a standard video stream corresponding to the original video stream is generated based on the processed video frames.
In some optional embodiments, generating the standard video stream corresponding to the original video stream based on the processed frames consists of encoding the processed frames whose data format has been restored to the video image data format, so as to obtain the standard video stream corresponding to the original stream. The video image data format may be the YUV data format or the RGB data format. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
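For reference, one common definition of the YUV format mentioned here is the BT.601 full-range conversion from RGB. The sketch below is illustrative only, since the patent leaves the exact format (YUV or RGB) open and does not prescribe a particular conversion matrix:

```python
def rgb_to_yuv(r, g, b):
    """BT.601 full-range RGB -> YUV conversion for one pixel.
    Y is luma; U and V are chroma offset by 128 so that neutral
    grey maps to (Y, 128, 128). Illustrative, not from the patent."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128
    return y, u, v
```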
In a specific example, as shown in Fig. 1C, the input original video stream is decoded to obtain its video frames. The obtained frames are then preprocessed to obtain preprocessed frames, which are stored in a preprocessed-frame buffer queue. Preprocessed frames are then taken from that queue and processed by the video frame processing model to obtain processed frames, which are stored in a processed-frame buffer queue. Processed frames are then taken from that queue and postprocessed to obtain postprocessed frames. Finally, the postprocessed frames are encoded to obtain the standard video stream corresponding to the original video stream. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
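The Fig. 1C flow just described can be sketched end to end with two buffer queues. This is a minimal single-threaded illustration; the function names and the use of plain callables for the preprocessing, model, and postprocessing stages are assumptions, and a real deployment would run the stages concurrently:

```python
from queue import Queue

def run_pipeline(raw_frames, model, pre, post):
    """Single-threaded sketch of the flow: decoded frames ->
    preprocess -> buffer queue -> model inference -> buffer queue ->
    postprocess (feeding the encoder). `model`, `pre`, and `post`
    are caller-supplied callables standing in for the real stages."""
    pre_q, post_q, out = Queue(), Queue(), []
    for frame in raw_frames:            # frames already decoded
        pre_q.put(pre(frame))           # preprocessed-frame buffer queue
    while not pre_q.empty():
        post_q.put(model(pre_q.get()))  # model processing
    while not post_q.empty():
        out.append(post(post_q.get()))  # postprocess, then encode
    return out
```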
According to the video stream processing method provided by this embodiment, video frames of an original video stream are stored into a first buffer by calling a video stream processing interface of a video stream processing tool; the frames in the first buffer are processed by a video frame processing model to obtain processed frames; and a standard video stream corresponding to the original stream is generated based on the processed frames. Compared with existing approaches, processing the video stream through the processing interface of the video stream processing tool effectively reduces the software complexity of video stream processing and thereby increases the processing speed of the stream. In addition, because the frames of the original stream are stored in the first buffer, there is no need to additionally save the frames as image files and read those files back, which preserves frame quality while further speeding up the overall stream processing.
The video stream processing method of this embodiment may be executed by any suitable device with data processing capability, including but not limited to: cameras, terminals, mobile terminals, PCs, servers, in-vehicle devices, entertainment devices, advertising devices, personal digital assistants (PDAs), tablets, laptops, handheld game consoles, glasses, watches, wearable devices, and virtual display or augmented display devices.
Referring to Fig. 2A, a flowchart of the steps of the video stream processing method of Embodiment 2 of the present application is shown.
Specifically, the video stream processing method provided by this embodiment includes the following steps:
In step S201, video frames of the original video stream are stored into a first buffer by calling a video stream processing interface of a video stream processing tool.
In this embodiment of the application, the video stream processing tool may be the FFmpeg tool (Fast Forward MPEG, a free, open-source, cross-platform video and audio streaming solution), and the video stream processing interface may be FFmpeg's video filter interface. Specifically, the software module implementing the video stream processing method provided by this embodiment is embedded into FFmpeg as one of its video filters; that is, the module is implemented through FFmpeg's video filter interface. A video filter can be understood as a filter that applies various transformations to video frames, such as scaling, rotation, color transformation, and filtering. The original video stream may be one captured by a network capture device, for example by the camera of a mobile phone or tablet, or by a surveillance camera. The first buffer may be a buffer queue. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In some optional embodiments, before the video frames of the original video stream are stored into the first buffer, the method further includes: preprocessing the video frames of the original video stream to obtain frames suitable for processing by the video frame processing model; storing the frames into the first buffer then consists of storing these suitable frames into the first buffer. Preprocessing the frames of the original stream thus yields frames suited to the model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, preprocessing the video frames of the original video stream consists of scaling the value range of the pixel values of the pixels in those frames, so as to obtain frames suitable for processing by the video frame processing model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, when the video frame processing model is trained, the pixel values of the input frame samples are normally distributed, which makes the model easier to train to convergence. After the model has converged, it is likewise desirable in real scenarios that the pixel values of the input frames to be processed be normally distributed. The video frames can therefore be preprocessed to obtain frames suitable for the model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In step S202, the batch size with which the video frame processing model batch-processes the video frames in the first buffer is determined based on the decoding duration of the video frame following the current video frame.
In this embodiment of the application, batch processing can be understood as processing the video frames in the first buffer in batches. By batch-processing the frames in the first buffer, the video frame processing model achieves higher processing efficiency and lower average processing time. The video frame processing model can be understood as a neural network model for processing video frames, for example an image enhancement model, an image super-resolution model, or an image beautification model. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In some optional embodiments, before the batch size with which the video frame processing model batch-processes the frames in the first buffer is determined, the method further includes: when decoding the video frame within a group of pictures of the original video stream, determining the decoding duration of the next video frame according to information of the group of pictures. A group of pictures can be understood as a group of consecutive pictures consisting of one I frame and several B/P frames, that is, the sequence of video frames in which the stream is encoded. The information of the group of pictures thus allows the decoding duration of the next video frame to be determined accurately. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, the information of the group of pictures includes the frame type of the video frame following the current frame. When determining the next frame's decoding duration from the group-of-pictures information: if the frame type of the next frame is an intra-coded frame, its decoding duration is determined to be the decoding duration of an intra-coded frame; if it is a forward-predicted coded frame, its decoding duration is determined to be the decoding duration of a forward-predicted coded frame; and if it is a bidirectionally predicted, interpolated coded frame, its decoding duration is determined according to the frame type of the video frame following the next frame. The decoding durations of an intra-coded frame and a forward-predicted coded frame are measured and configured in advance. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, when determining the next frame's decoding duration according to the frame type of the frame after it: if the frame following the next video frame is an intra-coded frame, the next frame's decoding duration is determined to be the sum of the decoding durations of an intra-coded frame and a bidirectionally predicted, interpolated coded frame; if it is a forward-predicted coded frame, the next frame's decoding duration is determined to be the sum of the decoding durations of a forward-predicted coded frame and a bidirectionally predicted, interpolated coded frame. The decoding duration of a bidirectionally predicted, interpolated coded frame is measured and configured in advance. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, unlike other data, video streams have distinctive characteristics. Because of their huge data volume, video streams are generally encoded and compressed to reduce transmission and storage pressure. Compression algorithms generally partition the video images of the stream into groups of pictures, and within a group of pictures different frames incur different decoding latencies. Taking X264 encoding as an example, video images are encoded as I frames, B frames, and P frames; because of its bidirectional prediction, a B frame cannot be decoded until the key frame after it has been decoded. The encoding characteristics of the stream must therefore be taken into account, and the decoding duration of a frame in a group of pictures determined according to its frame type. An I frame, or intra-coded frame, is a key frame that can be encoded and decoded without reference to other frames. A P frame, or forward-predicted coded frame, records the difference from the preceding I or P frame and depends on that frame for encoding and decoding. A B frame, or bidirectionally predicted, interpolated coded frame, contains the differences between itself and the neighboring frames, and depends on both the preceding key frame (I or P) and the following key frame (I or P) for encoding and decoding. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
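The frame-type rules above can be sketched as a small lookup. The millisecond costs and the assumption that the frame two positions ahead of a B frame is its key frame are illustrative simplifications, not values or constraints taken from the patent:

```python
# Assumed per-frame-type decode costs in milliseconds; in practice
# these are measured offline, as described above.
T_I, T_P, T_B = 8.0, 4.0, 2.0

def next_decode_time(frame_types, idx):
    """Estimate the decoding duration of the frame after `idx`.
    A B frame cannot be decoded until its following key frame
    (I or P) is decoded, so its cost includes that key frame."""
    nxt = frame_types[idx + 1]
    if nxt == "I":
        return T_I
    if nxt == "P":
        return T_P
    # B frame: add the decode time of the key frame that follows it
    after = frame_types[idx + 2]
    return (T_I if after == "I" else T_P) + T_B
```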
In some optional embodiments, determining the batch size from the decoding duration of the frame following the current video frame proceeds as follows. Based on the next frame's decoding duration, the next frame's preprocessing duration, and the first timestamp at which the current frame was stored into the first buffer, a second timestamp at which the next frame will be stored into the first buffer is determined. The number of video frames that will be in the first buffer when the next frame is stored into it is determined, together with the duration the video frame processing model needs to batch-process the frames in the first buffer with that number as the batch size. Based on the second timestamp and that duration, a third timestamp is determined at which the model would complete the batch processing of the frames in the first buffer with that number as the batch size. If the difference between the third timestamp and the smallest preprocessing-completion timestamp among the frames in the first buffer when the next frame is stored is greater than or equal to the preset maximum processing duration of the model, the batch size with which the model batch-processes the frames in the first buffer is determined to be that number. The preprocessing duration and the model's batch-processing durations at different batch sizes are measured and configured in advance. In this way, the batch size can be determined adaptively based on the decoding duration of the next video frame. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, the method further includes: if the difference is determined to be smaller than the maximum processing duration, waiting for the frame after the next frame to be stored into the first buffer, until the waiting duration equals the difference between the maximum processing duration and that difference. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
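Putting the timestamps together, the batch-size decision described in the two paragraphs above can be sketched as follows. All names are illustrative assumptions; `t_process` stands for the pre-measured per-batch-size inference latencies, and `queue_ready_ts` for the preprocessing-completion timestamps of the frames already buffered:

```python
def decide_batch(queue_ready_ts, t_now, t_decode_next, t_preprocess,
                 t_process, t_max):
    """Adaptive batch-size rule (sketch). Predict the completion
    timestamp t_finish if one more frame joins the buffer, then
    compare the oldest frame's waiting time against the latency
    bound t_max. Returns ("process", batch_size) when waiting any
    longer would break the bound, else ("wait", extra_time)."""
    n = len(queue_ready_ts)
    t_finish = t_process[n + 1] + t_decode_next + t_now + t_preprocess
    slack = t_max - (t_finish - min(queue_ready_ts))
    if slack <= 0:
        return ("process", n)   # batch now, with the frames on hand
    return ("wait", slack)      # safe to wait up to `slack` longer
```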
In step S203, the video frames in the first buffer are batch-processed by the video frame processing model according to the determined batch size, so as to obtain batch-processed video frames.
In some optional embodiments, after the frames in the first buffer are batch-processed by the model according to the determined batch size, the method further includes: storing the batch-processed frames into a second buffer; taking the batch-processed frames out of the second buffer; and postprocessing the batch-processed frames taken out of the second buffer, so as to restore their data format to a video image data format. Postprocessing the batch-processed frames taken from the second buffer restores their data format to a video image data format, which facilitates the subsequent encoding and compression of the batch-processed frames. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, postprocessing the batch-processed frames taken out of the second buffer consists of scaling the value range of the pixel values of the pixels in those frames, so as to restore their data format to a video image data format and thereby facilitate subsequent encoding and compression. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, the pixel values of the batch-processed frames are not within the normal range, so the batch-processed frames need postprocessing to restore their pixel values to the normal range; the data format of the value-restored frames is then converted back to a video image data format. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In step S204, a standard video stream corresponding to the original video stream is generated based on the batch-processed video frames.
In some optional embodiments, generating the standard video stream corresponding to the original video stream based on the batch-processed frames consists of encoding the batch-processed frames whose data format has been restored to the video image data format, so as to obtain the standard video stream corresponding to the original stream. The video image data format may be the YUV data format or the RGB data format. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In a specific example, because a suitable batch size must be chosen adaptively, the following quantities are measured in advance: the processing duration t_process required by the video frame processing model at different batch sizes; the preprocessing duration t_preprocess (the preprocessing duration is the same, t_preprocess, for every frame); the decoding duration t_I of an I frame; the decoding duration t_P of a P frame; and the decoding duration t_B of a B frame. In addition, a maximum processing duration t_max is set for the video frame processing model. As shown in Fig. 2B, the video stream is processed as follows. First, the input original video stream is read and decoded into raw video frames using a hardware-accelerated H.264 decoder. While decoding each image frame of a group of pictures, the decoding duration t_decode required for the next video frame image can be computed from the information of the group of pictures. When the next frame is a B frame, t_decode is the sum of the decoding duration of the key frame following the B frame and the decoding duration of the B frame itself, that is, t_decode = t_I + t_B or t_decode = t_P + t_B, depending on the type of that key frame. When the next frame is a P frame, t_decode = t_P; when the next frame is an I frame, t_decode = t_I. Next, video frame processing with adaptively batched model inference is performed. Specifically, each frame is first preprocessed, a stage that includes traditional image processing methods, normalization, value-range scaling, and the like, and the timestamp t_ready at which its preprocessing completes is recorded. After preprocessing, the system waits until the buffer queue Q is not full; when Q is not full, the processing thread stores into the buffer queue the video frame, the decoding duration t_decode of the frame following it, and its preprocessing-completion timestamp t_ready, and records the current timestamp t_now of the store. At the current timestamp t_now, a suitable batch size must be chosen adaptively to satisfy the bound t_max. Concretely, for the buffer queue Q one can predict that, once the frame following the current frame has been stored into Q, the model-processing completion timestamp with the length of Q as the batch size will be:

t_finish = t_process[len(Q)+1] + t_decode[len(Q)] + t_now + t_preprocess

Continuing to wait requires the condition t_finish - min_{i∈Q}(i.t_ready) < t_max to hold. When this condition holds, the model processing part is suspended and the system waits for the frame after the next frame to be stored into Q, until the waiting duration reaches t_max - (t_finish - min_{i∈Q}(i.t_ready)). Otherwise, the model batch-processing operation is executed and the batch-processed frames are stored in another buffer queue. Here len(Q) is the current length of the buffer queue Q. Whenever the other buffer queue is not empty, batch-processed frames are taken out of it and postprocessed, restoring their data format to the YUV or RGB data format. Finally, the standard video stream is encoded and output: a hardware-accelerated H.264 encoder encodes the format-restored, batch-processed frames into the standard video stream. It should be understood that the above description is merely exemplary, and the embodiments of this application are not limited in this regard.
In practical applications, introducing the adaptive batching operation takes into account the encoding and decoding characteristics of the frames within a video group and exploits the higher efficiency of GPUs at larger batch sizes, further improving the performance and real-time behavior of video stream processing. Moreover, the processing flow provided by this embodiment is not a flow built from a machine-learning framework coupled to a codec; its inference performance is higher than that of frameworks such as TensorFlow, and it is not strongly coupled to any framework, making it suitable for video stream processing in different scenarios. For example, in a real live-streaming scenario, a software module designed on the TensorRT processing framework combined with CUDA-accelerated preprocessing achieved a combined speed-up of nearly 7x over the original solution, with a further performance gain of 10% to 20% still possible when adaptive batching is added.
On the basis of Embodiment 1, this embodiment determines, based on the decoding duration of the frame following the current video frame, the batch size with which the video frame processing model batch-processes the frames in the first buffer, and then batch-processes the frames in the first buffer with the model according to the determined batch size to obtain batch-processed frames. Compared with existing approaches, the batch size can thus be determined adaptively from the next frame's decoding duration and the frames batch-processed accordingly, which effectively guarantees the real-time behavior of stream processing while further increasing the overall processing speed.
The video stream processing method of this embodiment may be executed by any suitable device with data processing capability, including but not limited to: cameras, terminals, mobile terminals, PCs, servers, in-vehicle devices, entertainment devices, advertising devices, personal digital assistants (PDAs), tablets, laptops, handheld game consoles, glasses, watches, wearable devices, and virtual display or augmented display devices.
Referring to Fig. 3, a schematic structural diagram of the video stream processing apparatus in Embodiment 3 of the present application is shown.
The video stream processing apparatus provided by this embodiment includes: a first storing module 301 configured to store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module 302 configured to process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and a generating module 303 configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
The video stream processing apparatus of this embodiment implements the corresponding video stream processing methods of the preceding method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
Referring to Fig. 4, a schematic structural diagram of the video stream processing apparatus in Embodiment 4 of the present application is shown.
The video stream processing apparatus provided by this embodiment includes: a first storing module 401 configured to store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module 404 configured to process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and a generating module 405 configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
Optionally, upstream of the processing module 404, the apparatus further includes: a first determining module 403 configured to determine, based on the decoding duration of the video frame following the current frame, the batch size with which the video frame processing model batch-processes the frames in the first buffer; the processing module 404 is specifically configured to batch-process, by the video frame processing model according to the determined batch size, the frames in the first buffer to obtain batch-processed video frames.
Optionally, upstream of the first determining module 403, the apparatus further includes: a second determining module 402 configured to determine, when decoding the video frame within a group of pictures of the original video stream, the decoding duration of the next video frame according to information of the group of pictures.
Optionally, the information of the group of pictures includes the frame type of the video frame following the current frame, and the second determining module 402 includes: a first determining submodule 4021 configured to determine, when the frame type of the next frame is an intra-coded frame, the decoding duration of the next frame to be the decoding duration of an intra-coded frame; a second determining submodule 4022 configured to determine, when the frame type of the next frame is a forward-predicted coded frame, the decoding duration of the next frame to be the decoding duration of a forward-predicted coded frame; and a third determining submodule 4023 configured to determine, when the frame type of the next frame is a bidirectionally predicted, interpolated coded frame, the decoding duration of the next frame according to the frame type of the frame following the next frame.
Optionally, the third determining submodule 4023 is specifically configured to: when the frame type of the frame following the next video frame is an intra-coded frame, determine the decoding duration of the next frame to be the sum of the decoding durations of an intra-coded frame and a bidirectionally predicted, interpolated coded frame; and when the frame type of the frame following the next video frame is a forward-predicted coded frame, determine the decoding duration of the next frame to be the sum of the decoding durations of a forward-predicted coded frame and a bidirectionally predicted, interpolated coded frame.
Optionally, the first determining module 403 includes: a fourth determining submodule 4031 configured to determine, based on the next frame's decoding duration, the next frame's preprocessing duration, and the first timestamp at which the current frame was stored into the first buffer, the second timestamp at which the next frame will be stored into the first buffer; a fifth determining submodule 4032 configured to determine the number of frames in the first buffer when the next frame is stored into it, and the duration the model needs to batch-process the frames in the first buffer with that number as the batch size; a sixth determining submodule 4033 configured to determine, based on the second timestamp and that duration, the third timestamp at which the model completes the batch processing of the frames in the first buffer at that batch size; and a seventh determining submodule 4034 configured to determine the batch size to be that number if the difference between the third timestamp and the smallest preprocessing-completion timestamp among the frames in the first buffer when the next frame is stored is greater than or equal to the preset maximum processing duration of the model.
Optionally, the first determining module 403 further includes: a waiting submodule 4035 configured to wait, if the difference is determined to be smaller than the maximum processing duration, for the frame after the next frame to be stored into the first buffer until the waiting duration equals the difference between the maximum processing duration and that difference.
The video stream processing apparatus of this embodiment implements the corresponding video stream processing methods of the preceding method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
Referring to Fig. 5, a schematic structural diagram of the video stream processing apparatus in Embodiment 5 of the present application is shown.
The video stream processing apparatus provided by this embodiment includes: a first storing module 502 configured to store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module 503 configured to process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and a generating module 507 configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
Optionally, upstream of the first storing module 502, the apparatus further includes: a preprocessing module 501 configured to preprocess the video frames of the original video stream to obtain frames suitable for processing by the video frame processing model; the first storing module 502 is specifically configured to store the frames suitable for processing by the model into the first buffer.
Optionally, the preprocessing module 501 is specifically configured to scale the value range of the pixel values of the pixels in the frames of the original video stream, so as to obtain frames suitable for processing by the model.
Optionally, downstream of the processing module 503, the apparatus further includes: a second storing module 504 configured to store the batch-processed video frames into a second buffer; a taking-out module 505 configured to take the batch-processed frames out of the second buffer; and a postprocessing module 506 configured to postprocess the batch-processed frames taken out of the second buffer, so as to restore their data format to a video image data format.
Optionally, the postprocessing module 506 is specifically configured to scale the value range of the pixel values of the pixels in the batch-processed frames taken out of the second buffer, so as to restore their data format to a video image data format.
Optionally, the generating module 507 is specifically configured to encode the batch-processed frames whose data format has been restored to the video image data format, so as to obtain the standard video stream corresponding to the original stream.
The video stream processing apparatus of this embodiment implements the corresponding video stream processing methods of the preceding method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
Fig. 6 is a schematic structural diagram of the electronic device in Embodiment 6 of the present application. The electronic device may include:
one or more processors 601; and
a computer-readable medium 602, which may be configured to store one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the video stream processing method described in Embodiment 1 or Embodiment 2 above.
Fig. 7 shows the hardware structure of the electronic device in Embodiment 7 of the present application. As shown in Fig. 7, the hardware structure of the electronic device may include: a processor 701, a communication interface 702, a computer-readable medium 703, and a communication bus 704,
where the processor 701, the communication interface 702, and the computer-readable medium 703 communicate with one another via the communication bus 704.
Optionally, the communication interface 702 may be the interface of a communication module, such as the interface of a GSM module.
The processor 701 may specifically be configured to: store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
The processor 701 may be a general-purpose processor, including a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The computer-readable medium 703 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM).
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code configured to execute the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication part and/or installed from a removable medium. When the computer program is executed by a central processing unit (CPU), it performs the functions defined in the methods of this application. It should be noted that the computer-readable medium described in this application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program configured to be used by, or in combination with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code configured to perform the operations of this application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code containing one or more executable instructions configured to implement the specified logical function. The specific embodiments above describe particular orderings, but these orderings are merely exemplary; in concrete implementations, there may be fewer or more steps, or the execution order may be adjusted. That is, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of this application may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a first storing module, a processing module, and a generating module. The names of these modules do not, in some cases, limit the modules themselves; for example, the first storing module may also be described as "a module that stores video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool".
As another aspect, this application further provides a computer-readable medium on which a computer program is stored; when executed by a processor, the program implements the video stream processing method described in Embodiment 1 or Embodiment 2 above.
As another aspect, this application further provides a computer-readable medium that may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
The expressions "first", "second", "the first", or "the second" used in various embodiments of the present disclosure may modify various components regardless of order and/or importance, but do not limit the corresponding components; they serve only to distinguish one element from other elements. For example, a first user device and a second user device denote different user devices, although both are user devices. For instance, without departing from the scope of the present disclosure, a first element may be called a second element, and similarly, a second element may be called a first element.
When an element (for example, a first element) is said to be "(operably or communicably) coupled" or "(operably or communicably) coupled to" another element (for example, a second element), or "connected to" another element (for example, a second element), it should be understood that the one element is connected directly to the other element, or connected indirectly to the other element via yet another element (for example, a third element). In contrast, when an element (for example, a first element) is said to be "directly connected" or "directly coupled" to another element (a second element), no element (for example, a third element) is interposed between the two.
The above description covers only preferred embodiments of this application and explains the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features, but should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept; for example, solutions formed by substituting the above features with the technical features of similar function disclosed in (but not limited to) this application.
Claims (15)
- A video stream processing method, comprising: storing video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; processing the video frames in the first buffer with a video frame processing model to obtain processed video frames; and generating, based on the processed video frames, a standard video stream corresponding to the original video stream.
- The method according to claim 1, wherein before storing the video frames of the original video stream into the first buffer, the method further comprises: preprocessing the video frames of the original video stream to obtain video frames suitable for processing by the video frame processing model; and storing the video frames of the original video stream into the first buffer comprises: storing the video frames suitable for processing by the video frame processing model into the first buffer.
- The method according to claim 2, wherein preprocessing the video frames of the original video stream comprises: scaling the value range of the pixel values of the pixels in the video frames of the original video stream to obtain video frames suitable for processing by the video frame processing model.
- The method according to claim 1, wherein before processing the video frames in the first buffer with the video frame processing model, the method further comprises: determining, based on the decoding duration of the video frame following the video frame, a batch size with which the video frame processing model batch-processes the video frames in the first buffer; and processing the video frames in the first buffer with the video frame processing model comprises: batch-processing, by the video frame processing model according to the determined batch size, the video frames in the first buffer to obtain batch-processed video frames.
- The method according to claim 4, wherein before determining the batch size with which the video frame processing model batch-processes the video frames in the first buffer, the method further comprises: when decoding the video frame within a group of pictures of the original video stream, determining the decoding duration of the next video frame according to information of the group of pictures.
- The method according to claim 5, wherein the information of the group of pictures comprises the frame type of the video frame following the video frame, and determining the decoding duration of the next video frame according to the information of the group of pictures comprises: when the frame type of the next video frame is an intra-coded frame, determining the decoding duration of the next video frame to be the decoding duration of an intra-coded frame; when the frame type of the next video frame is a forward-predicted coded frame, determining the decoding duration of the next video frame to be the decoding duration of a forward-predicted coded frame; and when the frame type of the next video frame is a bidirectionally predicted, interpolated coded frame, determining the decoding duration of the next video frame according to the frame type of the video frame following the next video frame.
- The method according to claim 6, wherein determining the decoding duration of the next video frame according to the frame type of the video frame following the next video frame comprises: when the frame type of the video frame following the next video frame is an intra-coded frame, determining the decoding duration of the next video frame to be the sum of the decoding durations of an intra-coded frame and a bidirectionally predicted, interpolated coded frame; and when the frame type of the video frame following the next video frame is a forward-predicted coded frame, determining the decoding duration of the next video frame to be the sum of the decoding durations of a forward-predicted coded frame and a bidirectionally predicted, interpolated coded frame.
- The method according to claim 4, wherein determining, based on the decoding duration of the video frame following the video frame, the batch size with which the video frame processing model batch-processes the video frames in the first buffer comprises: determining, based on the decoding duration of the next video frame, the preprocessing duration of the next video frame, and a first timestamp at which the video frame is stored into the first buffer, a second timestamp at which the next video frame will be stored into the first buffer; determining the number of video frames in the first buffer when the next video frame is stored into the first buffer, and the duration for the video frame processing model to batch-process the video frames in the first buffer with the number as the batch size; determining, based on the second timestamp and the duration, a third timestamp at which the video frame processing model completes the batch processing of the video frames in the first buffer with the number as the batch size; and if it is determined that the difference between the third timestamp and the smallest timestamp at which the video frames in the first buffer completed preprocessing when the next video frame is stored into the first buffer is greater than or equal to a preset maximum processing duration of the video frame processing model, determining the batch size with which the video frame processing model batch-processes the video frames in the first buffer to be the number.
- The method according to claim 8, wherein the method further comprises: if it is determined that the difference is smaller than the maximum processing duration, waiting for the video frame following the next video frame to be stored into the first buffer, until the waiting duration equals the difference between the maximum processing duration and the difference.
- The method according to claim 4, wherein after batch-processing, by the video frame processing model according to the determined batch size, the video frames in the first buffer, the method further comprises: storing the batch-processed video frames into a second buffer; taking the batch-processed video frames out of the second buffer; and postprocessing the batch-processed video frames taken out of the second buffer, so as to restore the data format of the batch-processed video frames to a video image data format.
- The method according to claim 10, wherein postprocessing the batch-processed video frames taken out of the second buffer comprises: scaling the value range of the pixel values of the pixels in the batch-processed video frames taken out of the second buffer, so as to restore the data format of the batch-processed video frames to a video image data format.
- The method according to claim 10, wherein generating, based on the processed video frames, the standard video stream corresponding to the original video stream comprises: encoding the batch-processed video frames whose data format has been restored to the video image data format, so as to obtain the standard video stream corresponding to the original video stream.
- A video stream processing apparatus, comprising: a first storing module configured to store video frames of an original video stream into a first buffer by calling a video stream processing interface of a video stream processing tool; a processing module configured to process the video frames in the first buffer with a video frame processing model to obtain processed video frames; and a generating module configured to generate, based on the processed video frames, a standard video stream corresponding to the original video stream.
- An electronic device, comprising: one or more processors; and a computer-readable medium configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the video stream processing method according to any one of claims 1-12.
- A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the video stream processing method according to any one of claims 1-12.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21781084.5A EP4099694A4 (en) | 2020-03-31 | 2021-03-24 | Video stream processing method and apparatus, and electronic device and computer-readable medium |
US17/956,156 US11997314B2 (en) | 2020-03-31 | 2022-09-29 | Video stream processing method and apparatus, and electronic device and computer-readable medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010244868.1A CN113473126B (zh) | 2020-03-31 | 2020-03-31 | Video stream processing method and apparatus, electronic device, and computer-readable medium |
CN202010244868.1 | 2020-03-31 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/956,156 Continuation US11997314B2 (en) | 2020-03-31 | 2022-09-29 | Video stream processing method and apparatus, and electronic device and computer-readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021197157A1 true WO2021197157A1 (zh) | 2021-10-07 |
Family
ID=77866046
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/082611 WO2021197157A1 (zh) | 2020-03-31 | 2021-03-24 | Video stream processing method and apparatus, electronic device, and computer-readable medium |
Country Status (4)
Country | Link |
---|---|
US (1) | US11997314B2 (zh) |
EP (1) | EP4099694A4 (zh) |
CN (1) | CN113473126B (zh) |
WO (1) | WO2021197157A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114040247A (zh) * | 2021-11-09 | 2022-02-11 | 新智认知数据服务有限公司 | Network video stream processing method, electronic device, and computer-readable storage medium |
CN114170553A (zh) * | 2021-12-09 | 2022-03-11 | 北京字节跳动网络技术有限公司 | Image processing method, apparatus, and electronic device |
CN114389893A (zh) * | 2022-01-22 | 2022-04-22 | 重庆长安汽车股份有限公司 | Vehicle real-name authentication system and authentication method based on liveness video processing, and automobile |
CN114449295A (zh) * | 2022-01-30 | 2022-05-06 | 京东方科技集团股份有限公司 | Video processing method and apparatus, electronic device, and storage medium |
CN117241043B (zh) * | 2023-11-10 | 2024-03-19 | 深圳中微电科技有限公司 | Method, system, and storage medium for video hardware decoding error recovery |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105827976A (zh) * | 2016-04-26 | 2016-08-03 | 北京博瑞空间科技发展有限公司 | GPU-based video capture and processing apparatus and system |
CN106204488A (zh) * | 2016-07-12 | 2016-12-07 | 湖南翰博薇微电子科技有限公司 | OpenCL-accelerated video dehazing method |
CN106507204A (zh) * | 2016-12-07 | 2017-03-15 | 腾讯科技(上海)有限公司 | Video reverse-playback method and apparatus |
CN108810085A (zh) * | 2017-04-28 | 2018-11-13 | 慧与发展有限责任合伙企业 | Real-time processing of IoT data |
CN108881916A (zh) * | 2018-06-21 | 2018-11-23 | 深圳市斯迈龙科技有限公司 | Video optimization processing method and apparatus for remote desktop |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7412149B2 (en) | 2004-10-28 | 2008-08-12 | Bitband Technologies, Ltd. | Trick mode generation in video streaming |
US20060224763A1 (en) | 2005-03-18 | 2006-10-05 | Sharp Laboratories Of America, Inc. | Switching and simultaneous usage of 802.11a and 802.11g technologies for video streaming |
EP1879347B1 (en) | 2006-07-14 | 2012-05-30 | Sony Europe Limited | System and method of audio/video streaming |
EP1879346A1 (en) | 2006-07-14 | 2008-01-16 | Sony Service Centre (Europe) N.V. | System and method of audio/video streaming |
US7706384B2 (en) * | 2007-04-20 | 2010-04-27 | Sharp Laboratories Of America, Inc. | Packet scheduling with quality-aware frame dropping for video streaming |
EP2094014A1 (en) | 2008-02-21 | 2009-08-26 | British Telecommunications Public Limited Company | Video streaming |
CN101686383B (zh) | 2008-09-23 | 2013-05-01 | Utc消防和保安美国有限公司 | 通过网络传输媒体的方法及系统 |
US8782267B2 (en) | 2009-05-29 | 2014-07-15 | Comcast Cable Communications, Llc | Methods, systems, devices, and computer-readable media for delivering additional content using a multicast streaming |
US20110289543A1 (en) | 2010-05-19 | 2011-11-24 | Goosen Hendrik A | Video streaming system including a fast channel change mechanism |
US20110289544A1 (en) | 2010-05-19 | 2011-11-24 | Goosen Hendrik A | Video streaming system including a fast channel change mechanism |
US9485546B2 (en) | 2010-06-29 | 2016-11-01 | Qualcomm Incorporated | Signaling video samples for trick mode video representations |
US8854418B2 (en) | 2011-05-23 | 2014-10-07 | Broadcom Corporation | Integrated media gateway processing and control to reduce latency for 2-way video conference applications |
US9819717B2 (en) | 2011-12-28 | 2017-11-14 | Intel Corporation | Video adaptation for content-aware wireless streaming |
US8863208B2 (en) | 2012-06-18 | 2014-10-14 | Micropower Technologies, Inc. | Synchronizing the storing of streaming video |
US9094737B2 (en) | 2013-05-30 | 2015-07-28 | Sonic Ip, Inc. | Network video streaming with trick play based on separate trick play files |
US20140359678A1 (en) | 2013-05-30 | 2014-12-04 | Sonic Ip, Inc. | Device video streaming with trick play based on separate trick play files |
US20150373075A1 (en) | 2014-06-23 | 2015-12-24 | Radia Perlman | Multiple network transport sessions to provide context adaptive video streaming |
US10972519B2 (en) | 2015-09-24 | 2021-04-06 | Flir Commercial Systems, Inc. | Real-time video streaming to client video element |
CN109891906B (zh) | 2016-04-08 | 2021-10-15 | 维斯比特股份有限公司 | 递送360°视频流的系统和方法 |
US10313417B2 (en) | 2016-04-18 | 2019-06-04 | Qualcomm Incorporated | Methods and systems for auto-zoom based adaptive video streaming |
US20180098131A1 (en) | 2016-09-30 | 2018-04-05 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Apparatus and methods for adaptive bit-rate streaming of 360 video |
CN107295285B (zh) * | 2017-08-11 | 2018-07-27 | 腾讯科技(深圳)有限公司 | Video data processing method, processing apparatus, and storage medium |
EP3515075A1 (en) | 2018-01-23 | 2019-07-24 | THEO Technologies | Video streaming |
CN109672931B (zh) * | 2018-12-20 | 2020-03-20 | 北京百度网讯科技有限公司 | Method and apparatus for processing video frames |
- 2020-03-31: CN application CN202010244868.1A granted as patent CN113473126B (active)
- 2021-03-24: EP application EP21781084.5A published as EP4099694A4 (pending)
- 2021-03-24: PCT application PCT/CN2021/082611 published as WO2021197157A1 (status unknown)
- 2022-09-29: US application US17/956,156 granted as patent US11997314B2 (active)
Non-Patent Citations (1)
Title |
---|
See also references of EP4099694A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP4099694A4 (en) | 2023-06-28 |
CN113473126A (zh) | 2021-10-01 |
US11997314B2 (en) | 2024-05-28 |
EP4099694A1 (en) | 2022-12-07 |
US20230034764A1 (en) | 2023-02-02 |
CN113473126B (zh) | 2023-03-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021197157A1 (zh) | Video stream processing method and apparatus, electronic device, and computer-readable medium | |
US11200426B2 (en) | Video frame extraction method and apparatus, computer-readable medium | |
WO2019001108A1 (zh) | Video processing method and apparatus | |
CN112073737B (zh) | Re-encoding predicted image frames in live video streaming applications | |
JP2003087785A (ja) | Format conversion method and apparatus for encoded moving image data | |
JP7515546B2 (ja) | Interruptible video transcoding | |
WO2021196994A1 (zh) | Encoding method and apparatus, terminal, and storage medium | |
CN113747242A (zh) | Image processing method and apparatus, electronic device, and storage medium | |
CN112261377A (zh) | Web-based surveillance video playback method, electronic device, and storage medium | |
KR101680545B1 (ko) | Method and apparatus for providing panoramic video generation service | |
US11095901B2 (en) | Object manipulation video conference compression | |
CN114222156A (zh) | Video editing method and apparatus, computer device, and storage medium | |
CN114205662A (zh) | Low-latency video rendering method and apparatus on iOS | |
CN112261417B (zh) | Video pushing method and system, device, and readable storage medium | |
CN113645448A (zh) | Video decoding method and apparatus suitable for command and dispatch | |
CN112291483A (zh) | Video pushing method and system, electronic device, and readable storage medium | |
WO2022061723A1 (zh) | Image processing method, device, terminal, and storage medium | |
WO2024164736A1 (zh) | Video processing method and apparatus, computer-readable medium, and electronic device | |
WO2024164502A1 (zh) | Video encoding method, apparatus, device, and storage medium | |
WO2024120031A1 (zh) | Method and apparatus for processing video data, computer device, and storage medium | |
CN116996695B (zh) | Panoramic image compression method, apparatus, device, and medium | |
WO2024183387A1 (zh) | Image encoding/decoding method, apparatus, and system | |
CN115567720B (zh) | Video transmission method, apparatus, storage medium, and device | |
WO2024022427A1 (zh) | Video recording method, apparatus, device, storage medium, and program product | |
CN118524238A (zh) | Snapshot system and method for video key frames |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21781084; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2021781084; Country of ref document: EP; Effective date: 20220901 |
| NENP | Non-entry into the national phase | Ref country code: DE |