WO2018126975A1 - Panoramic video transcoding method, apparatus and device - Google Patents
- Publication number: WO2018126975A1
- Application: PCT/CN2017/119195
- Authority: WO (WIPO (PCT))
- Prior art keywords: gpu, video data, module, encoding, video
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234309—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/156—Availability of hardware or computational resources, e.g. encoding based on power-saving criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440218—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
Definitions
- the present invention relates to the field of computer application technologies, and in particular, to a panoramic video transcoding method, apparatus and device.
- the prior art proposes mapping a panoramic video to multiple different viewing angles, so that each mapped video has high definition at a specific viewing angle, with definition gradually decreasing in the portions farther from that angle. In this way, the resolution of each mapped video is greatly reduced compared with the original panoramic video, so the transcoding bit rate is also reduced.
- the foregoing prior-art method, as shown in FIG. 1, may first decode the original panoramic video; then map the decoded panoramic video to N viewing angles respectively to obtain N channels of panoramic video, where N is a positive integer; then encode the N channels of panoramic video separately, and slice and pack the encoded video streams for output.
- mapping and encoding the multiple channels of panoramic video separately in the above process requires huge computational resources, which puts great pressure on transcoding systems currently deployed on CPUs and makes real-time processing difficult. It can therefore only be used for VR video on demand and cannot meet the needs of VR video live streaming.
- the present invention provides a panoramic video transcoding method, the method comprising:
- Some or all of the processing in decoding, mapping, and encoding is performed by a graphics processor GPU.
- the present invention also provides a panoramic video transcoding device, the device comprising:
- a decoding module configured to decode a panoramic video
- mapping module configured to map the decoded video data to N views, to obtain N channels of video data, where N is a preset positive integer
- An encoding module configured to respectively encode N channels of video data to obtain N channels of video streams
- a slice packing module configured to separately slice and package the N video streams
- one or more programs, the one or more programs being stored in the memory and executed by the GPU or CPU to implement the following operations:
- in panoramic video transcoding, some or all of the decoding, mapping and encoding is performed by the GPU. Compared with the prior-art approach of transcoding panoramic video on a pure-CPU architecture, this way of using GPU resources to accelerate panoramic video transcoding improves real-time performance and thus meets the needs of VR video live streaming.
- FIG. 1 is a schematic diagram of a prior-art panoramic video transcoding process
- FIG. 2 is a schematic diagram of a first video transcoding scheme according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a second video transcoding scheme according to an embodiment of the present invention.
- FIG. 4 is a schematic diagram of a third video transcoding scheme according to an embodiment of the present invention.
- FIG. 5 is a structural diagram of an apparatus according to an embodiment of the present invention.
- FIG. 6 is a structural diagram of a device according to an embodiment of the present invention.
- depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining" or "in response to detecting".
- similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
- transcoding of panoramic video refers to the process of converting one video stream into another to adapt to different network bandwidths, different terminal processing capabilities and different user requirements.
- the following processes are mainly included:
- the decoding process decodes the panoramic video into individual frames of images.
- the panoramic video can be obtained from a video source of the panoramic video, or can be locally stored panoramic video data.
- the encoding process is to encode the N channels of video data to obtain N channels of video streams.
- the slicing and packing process, that is, slicing and packing the N video streams respectively, and outputting them.
- the mapping and encoding processes bring great computational pressure, especially because mapping and encoding must be performed in N ways. Therefore, in the embodiments of the present invention, some or all of the above decoding, mapping and encoding is performed by the GPU. If the GPU performs only part of the decoding, mapping and encoding, the remaining processing is still executed by the CPU; the slicing and packing, however, is executed by the CPU.
- the GPU has powerful parallel computing capability but weak serial computing capability, which makes it especially suitable for the mapping and encoding involved in the present invention. The mapping and encoding can therefore preferentially be performed by the GPU, while the slicing and packing are quite unsuitable for GPU execution.
- the CPU has powerful serial computing capability but weak parallel computing capability, so the slicing and packing are well suited to CPU execution.
- other types of processors can be used instead of the CPU, such as a DSP (Digital Signal Processor).
- the processing capability of the GPU is limited. If the resolution of the panoramic video exceeds the GPU's capability, decoding must be done by the CPU. In the embodiments of the present invention, it may be determined whether the resolution of the panoramic video is higher than a preset first threshold.
- the first threshold here refers to the highest resolution of panoramic video that the GPU can decode, and is determined by the GPU's actual processing capability.
- mapping processing it can be performed by a general purpose computing module in the GPU.
- the mapping here is to calculate and convert the decoded video data to a specific viewing angle, which will generate a large amount of calculation.
- a GPU's image computing capability is usually very strong, stronger than that of a CPU of the same or even several times the price. Therefore, the mapping process is preferably performed by the GPU.
- M can be an empirical or experimental value. For example, experiments may show that mapping the decoded video data to M views can still meet the real-time requirement while mapping to M+1 views cannot; the value M is then taken as the number of views the GPU maps.
- alternatively, experiments may show that the total processing rate is highest when the GPU maps the decoded video data to M views and the CPU maps the remaining views; that value M can then be taken as the number of views processed by the GPU.
- whether the real-time requirement is met may be reflected in the relationship between the GPU's mapping rate and the input frame rate of the decoded video data. For example, if the GPU's mapping rate is greater than or equal to the input frame rate, the real-time requirement is considered met; otherwise, it is considered unmet.
- whether the real-time requirement is met may also be reflected in the state of the cache queue buffering the decoded video data: the decoded video data is sent into the cache queue, and the GPU's general-purpose computing module reads video data from the queue for mapping; if congestion of the cache queue reaches a certain level, the real-time requirement can be considered unmet.
- the value of M can also be flexibly determined by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if mapping M views drives the occupancy to a preset resource occupancy threshold, the CPU takes over the mapping of the remaining N-M views.
- whether the real-time requirement is met may be reflected in the relationship between the GPU's encoding rate and the input frame rate of the mapped video data. For example, if the GPU's encoding rate is greater than or equal to the input frame rate, the real-time requirement is considered met; otherwise, it is considered unmet.
- whether the real-time requirement is met may also be reflected in the state of the cache queue buffering the mapped video data: the mapped video data is sent into the cache queue, and the GPU's encoding hardware acceleration module reads video data from the queue frame by frame for encoding; if congestion of the cache queue reaches a certain level, the real-time requirement can be considered unmet.
- the value of P can also be flexibly determined by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if encoding P channels of video data drives the occupancy to a preset resource occupancy threshold, the CPU takes over the encoding of the remaining N-P channels.
- in one embodiment, the decoding of the panoramic video is performed by the CPU; the mapping of the decoded video data to N viewing angles and the separate encoding of the N mapped channels of video data are performed by the GPU; and the slicing and packing of each encoded video stream is performed by the CPU.
- the above transcoding method can be applied to a panoramic video system such as a VR video system.
- the panoramic video system mainly consists of clients and a service provider, where the service provider is responsible for sending panoramic video data to the clients, and the clients are responsible for receiving and playing the panoramic video data.
- the above transcoding method provided by the embodiment of the present invention is implemented at a service providing end.
- the foregoing service provider may be located on the server side, i.e., the panoramic video data is sent by the server side, or on the user equipment side: if a user device is capable of providing panoramic video data, it may also act as the service provider.
- FIG. 5 is a structural diagram of a device according to an embodiment of the present invention.
- the device may include: a decoding module 01, a mapping module 02, an encoding module 03, and a slice packing module 04.
- Some or all of the decoding module 01, the mapping module 02, and the encoding module 03 are implemented by the GPU.
- which of the decoding module 01, the mapping module 02, and the encoding module 03 are implemented by the GPU can be determined according to video attributes or GPU processing capability.
- the mapping module 02 is responsible for mapping the decoded video data to N views to obtain N channels of video data.
- the N viewing angles are preset, and N is a preset positive integer, and is usually a value of 2 or more.
- that is, one channel of panoramic video is converted into N channels of video data at different viewing angles.
- the encoding module 03 is responsible for respectively encoding the N channels of video data to obtain N channels of video streams.
- the slice packing module 04 is responsible for separately slicing and packing the N video streams.
- the first mapping module 021 is responsible for mapping the decoded video data to M of the views, M<N.
- the second mapping module 022 is responsible for mapping the decoded video data to the remaining N-M views.
- the first mapping module 021 is implemented by a GPU, and the second mapping module 022 is implemented by a CPU.
- the value of M can also be flexibly determined by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if mapping M views drives the occupancy to a preset resource occupancy threshold, the CPU takes over the mapping of the remaining N-M views.
- the encoding module 03 can be implemented by an encoding hardware acceleration module of the GPU. Since mapping greatly reduces the resolution of each channel of video data, it usually does not exceed the GPU's processing capability. If the resolution of a mapped channel still exceeds the GPU's capability, however, the CPU performs its encoding. In the embodiments of the present invention, it may be determined whether the resolution of each mapped channel is higher than a preset second threshold, where the second threshold refers to the highest resolution of video data that the GPU can encode, determined by the GPU's actual processing capability.
- the encoding module 03 may include: a first encoding module 031 and a second encoding module 032.
- the first encoding module 031 is responsible for encoding P channels of video data, P<N; the second encoding module 032 is responsible for encoding the remaining N-P channels; the first encoding module 031 is implemented by the GPU, and the second encoding module 032 is implemented by the CPU.
- the value of P can also be flexibly determined by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if encoding P channels of video data drives the occupancy to a preset resource occupancy threshold, the CPU takes over the encoding of the remaining N-P channels.
- the above-described slice packing module 04 is implemented by a CPU.
- the VR video server uses a combination of CPU and GPU to transcode the VR panoramic video, and then sends the transcoded VR video stream to the VR video client.
- the decoding process, the mapping process, and the encoding process are allocated reasonably, and are executed by the GPU or jointly executed by the GPU and the CPU, thereby satisfying the real-time performance of the VR panoramic video live broadcast.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
- the above software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform the methods of the various embodiments of the present invention. Part of the steps.
- the foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media capable of storing program code.
Abstract
The present invention provides a panoramic video transcoding method, apparatus and device. The method includes: decoding a panoramic video; mapping the decoded video data to N views to obtain N channels of video data, where N is a preset positive integer; encoding the N channels of video data respectively to obtain N video streams; and slicing and packing the N video streams respectively; wherein some or all of the decoding, mapping and encoding is performed by a graphics processing unit (GPU). In the present invention, the GPU performs some or all of the decoding, mapping and encoding in panoramic video transcoding. Compared with the prior-art approach of transcoding panoramic video on a pure-CPU architecture, this way of using GPU resources to accelerate panoramic video transcoding improves real-time performance and thus meets the needs of VR video live streaming.
Description
This application claims priority to Chinese Patent Application No. 201710013873.X, filed on January 9, 2017 and entitled "Panoramic Video Transcoding Method, Apparatus and Device", the entire contents of which are incorporated herein by reference.
The present invention relates to the field of computer application technologies, and in particular to a panoramic video transcoding method, apparatus and device.
As users demand ever higher definition and smoothness from VR (Virtual Reality) panoramic video, how to reduce the bit rate while preserving panoramic video resolution has become a pressing problem in the VR field.
When watching a panoramic video, a user can usually see only a small portion of the spherical panorama within the current viewing angle. Conventional panoramic video keeps the same resolution and definition for all viewing angles during transcoding, so even angles the user cannot see remain at high resolution, which wastes bit rate. To address this, the prior art proposes mapping the panoramic video to multiple different viewing angles, so that each mapped video has high definition at a specific viewing angle, with definition gradually decreasing in the portions farther from that angle. In this way, the resolution of each mapped video is greatly reduced compared with the original panorama, and the transcoding bit rate is reduced accordingly.
The prior-art method, as shown in FIG. 1, first decodes the original panoramic video; then maps the decoded panorama to N viewing angles to obtain N channels of panoramic video, where N is a positive integer; then encodes the N channels separately, and slices and packs the resulting video streams for output. However, mapping and encoding the multiple channels separately consumes enormous computing resources, putting great pressure on transcoding systems currently deployed on CPUs and making real-time processing difficult. The method can therefore only be used for VR video on demand and cannot meet the needs of VR live streaming.
Summary of the Invention
In view of this, the present invention provides a panoramic video transcoding method, apparatus and device capable of meeting the needs of VR video live streaming.
The specific technical solution is as follows:
The present invention provides a panoramic video transcoding method, the method comprising:
decoding a panoramic video;
mapping the decoded video data to N views to obtain N channels of video data, where N is a preset positive integer;
encoding the N channels of video data respectively to obtain N video streams; and
slicing and packing the N video streams respectively;
wherein some or all of the decoding, mapping and encoding is performed by a graphics processing unit (GPU).
The present invention also provides a panoramic video transcoding apparatus, the apparatus comprising:
a decoding module configured to decode a panoramic video;
a mapping module configured to map the decoded video data to N views to obtain N channels of video data, where N is a preset positive integer;
an encoding module configured to encode the N channels of video data respectively to obtain N video streams; and
a slice packing module configured to slice and pack the N video streams respectively;
wherein some or all of the decoding module, mapping module and encoding module are implemented by a GPU.
The present invention also provides a device, comprising:
a graphics processing unit GPU and a central processing unit CPU;
a memory; and
one or more programs, stored in the memory and executed by the GPU or CPU to implement the following operations:
decoding a panoramic video;
mapping the decoded video data to N views to obtain N channels of video data, where N is a preset positive integer;
encoding the N channels of video data respectively to obtain N video streams; and
slicing and packing the N video streams respectively;
wherein some or all of the decoding, mapping and encoding is performed by the GPU.
As can be seen from the above technical solutions, in the present invention the GPU performs some or all of the decoding, mapping and encoding in panoramic video transcoding. Compared with the prior-art approach of transcoding panoramic video on a pure-CPU architecture, this way of using GPU resources to accelerate transcoding improves real-time performance and thus meets the needs of VR video live streaming.
FIG. 1 is a schematic diagram of a prior-art panoramic video transcoding process;
FIG. 2 is a schematic diagram of a first video transcoding scheme according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a second video transcoding scheme according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a third video transcoding scheme according to an embodiment of the present invention;
FIG. 5 is a structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a device according to an embodiment of the present invention.
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The terms used in the embodiments of the present invention are for describing particular embodiments only and are not intended to limit the invention. The singular forms "a", "said" and "the" used in the embodiments and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" herein merely describes an association between associated objects, indicating that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
The core idea of the present invention is to transform the prior-art pure-CPU architecture into a joint CPU-and-GPU architecture.
Transcoding of panoramic video refers to converting one video stream into another to adapt to different network bandwidths, different terminal processing capabilities and different user requirements. Panoramic video transcoding mainly includes the following processes:
Decoding, i.e., decoding the panoramic video into individual frames of images. The panoramic video may be obtained from a video source of the panoramic video, or may be locally stored panoramic video data.
Mapping, i.e., mapping the decoded video data to N views to obtain N channels of video data. The N views are preset, and N is a preset positive integer, usually 2 or more. The mapping mainly projects each spherical panoramic frame onto a two-dimensional planar image; a different model is used for each view, so that the part of the mapped image within the view's range has higher resolution, while resolution decreases the farther a part is from that range. After mapping to N views, one channel of panoramic video is converted into N channels of video data at different views.
Encoding, i.e., encoding the N channels of video data respectively to obtain N video streams.
Slicing and packing, i.e., slicing and packing the N video streams respectively, and then outputting them.
The decoding, mapping and encoding bring enormous computational pressure, especially the mapping and encoding, which must be performed in N ways. Therefore, in the embodiments of the present invention, some or all of the decoding, mapping and encoding is performed by the GPU. If the GPU performs only part of them, the remaining processing is still executed by the CPU; the slicing and packing, however, is executed by the CPU.
The GPU has powerful parallel computing capability but weak serial computing capability, which makes it especially suitable for the mapping and encoding involved in the present invention; the mapping and encoding can therefore preferentially be executed by the GPU, while the slicing and packing are quite unsuitable for GPU execution. The CPU has powerful serial computing capability but weak parallel computing capability, so the slicing and packing are well suited to CPU execution. Of course, besides the CPU, other types of processors, such as a DSP (Digital Signal Processor), can take over the CPU's work.
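The division of labor just described can be sketched as a small stage-to-processor assignment (a sketch only; the stage names, the `PREFERRED` table and the fallback rule are illustrative assumptions, not the patent's wording):

```python
# Illustrative sketch of the GPU/CPU division of labor described above.
# Stage names and the fallback rule are assumptions for illustration.

PREFERRED = {
    "decode": "gpu",          # GPU decode hardware acceleration, when capable
    "map": "gpu",             # parallel per-pixel work suits the GPU
    "encode": "gpu",          # GPU encode hardware acceleration
    "slice_and_pack": "cpu",  # serial work suits the CPU
}

def assign(stage, gpu_capable=True):
    """Return which processor runs a stage; GPU-preferred stages
    fall back to the CPU when the GPU cannot handle them."""
    processor = PREFERRED[stage]
    if processor == "gpu" and not gpu_capable:
        return "cpu"
    return processor
```

Under this sketch, slicing and packing always lands on the CPU, while the three compute-heavy stages run on the GPU unless its capability is exceeded.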
Decoding can be performed by the decoding hardware acceleration module in the GPU. However, the GPU's processing capability is limited: if the resolution of the panoramic video exceeds it, decoding must be done by the CPU. In the embodiments of the present invention, it may be determined whether the resolution of the panoramic video is higher than a preset first threshold, where the first threshold refers to the highest panoramic video resolution the GPU can decode, determined by the GPU's actual processing capability.
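The first-threshold decision might look like the following minimal sketch (the pixel cap used in the example is a made-up, device-specific number, not a value from the patent):

```python
def choose_decoder(width, height, max_gpu_pixels):
    """Decode on the GPU's hardware decoder only when the panorama's
    resolution is within its capability (the "first threshold");
    otherwise fall back to CPU decoding.
    `max_gpu_pixels` is an illustrative stand-in for that threshold."""
    if width * height > max_gpu_pixels:
        return "cpu"
    return "gpu"

# e.g. a hypothetical GPU decoder capped at 4096x4096 pixels:
decoder = choose_decoder(7680, 3840, 4096 * 4096)  # an 8K panorama
```

In practice the threshold would be read from the GPU's actual decoder capability rather than hard-coded.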
Mapping can be performed by the general-purpose computing module in the GPU. The mapping here converts the decoded video data, through computation, onto a specific view, which generates a large amount of calculation. A GPU's image computing capability is usually very strong, stronger than that of a CPU of the same or even several times the price, so the mapping is preferably performed by the GPU.
There may, however, be cases where the number of views to map is large, and mapping the decoded video data to M of them nearly exhausts the resources of the GPU's general-purpose computing module; the mapping of the remaining N-M views can then be executed by the CPU. The value of M can be an empirical or experimental value: for example, experiments may show that mapping the decoded video data to M views can still meet the real-time requirement while mapping to M+1 views cannot, in which case M is taken as the number of views the GPU maps. Alternatively, experiments may show that the total processing rate is highest when the GPU maps M views and the CPU maps the rest, in which case that M is used. Other approaches are of course possible and are not exhaustively listed here.
Whether the real-time requirement is met may be reflected in the relationship between the GPU's mapping rate and the input frame rate of the decoded video data: if the GPU's mapping rate is greater than or equal to the input frame rate, the requirement is considered met; otherwise, it is considered unmet. It may also be reflected in the state of the cache queue buffering the decoded video data: the decoded video data is sent into the queue, and the GPU's general-purpose computing module reads video data from the queue for mapping; if congestion of the queue reaches a certain level, the real-time requirement can be considered unmet.
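The two real-time signals described here can be combined in a minimal check, assuming a frame queue and an operator-chosen congestion limit (both assumptions for illustration):

```python
from collections import deque

def meets_realtime(map_rate_fps, input_fps, frame_queue, congestion_limit):
    """Real-time check as described above: the GPU's mapping rate must
    keep up with the input frame rate, and the cache queue buffering
    decoded frames must not be congested.
    `congestion_limit` is an assumed operator-chosen value."""
    if map_rate_fps < input_fps:
        return False                       # GPU maps slower than frames arrive
    if len(frame_queue) >= congestion_limit:
        return False                       # queue backing up: falling behind
    return True

queue = deque()  # decoded frames awaiting mapping
ok = meets_realtime(60.0, 30.0, queue, congestion_limit=8)
```

The same shape of check applies to the encoding stage, with the encoding rate and the mapped-data queue in place of the mapping rate and the decoded-data queue.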
Besides empirical or experimental values, M can be flexibly determined by the CPU according to the GPU's actual processing capability: for example, the CPU monitors the GPU's resource occupancy, and if mapping M views drives the occupancy to a preset resource occupancy threshold, the CPU takes over the mapping of the remaining N-M views.
Encoding can be performed by the GPU's encoding hardware acceleration module. Since mapping greatly reduces the resolution of each channel of video data, it usually does not exceed the GPU's processing capability. If the resolution of a mapped channel still exceeds the GPU's capability, however, the CPU performs its encoding. In the embodiments of the present invention, it may be determined whether the resolution of each mapped channel is higher than a preset second threshold, where the second threshold refers to the highest video data resolution the GPU can encode, determined by the GPU's actual processing capability.
In addition, even if each channel's resolution is within the GPU's capability, encoding P channels of video data may nearly exhaust the resources of the GPU's encoding hardware acceleration module, in which case the remaining N-P channels can be encoded by the CPU. P can be an empirical or experimental value: for example, experiments may show the GPU can encode P channels while still meeting the real-time requirement but cannot when encoding P+1 channels, in which case P is taken as the number of channels the GPU encodes. Alternatively, the total processing rate may be highest when the GPU encodes P channels and the CPU encodes the remaining N-P, in which case that P is used. Other approaches are not exhaustively listed here.
Likewise, whether the real-time requirement is met may be reflected in the relationship between the GPU's encoding rate and the input frame rate of the mapped video data: if the GPU's encoding rate is greater than or equal to the input frame rate, the requirement is considered met; otherwise, it is considered unmet. It may also be reflected in the state of the cache queue buffering the mapped video data: the mapped video data is sent into the queue, and the GPU's encoding hardware acceleration module reads video data from the queue frame by frame for encoding; if congestion of the queue reaches a certain level, the real-time requirement can be considered unmet.
P的值除了取经验值或试验值之外,还可以由CPU根据GPU的实际处理能力灵活确定,例如CPU对GPU的资源占用率进行监测,若GPU对其中P路视频数据的编码处理就使得资源占用率达到预设的资源占用率阈值,则CPU将接管剩余N-P路视频数据的编码处理。
由以上描述可以看出,可以依据经验或预先的试验结果,确定解码、映射和编码中由GPU处理的部分。也可以至少依据视频属性(例如视频的分辨率、格式等)或GPU处理能力,动态确定解码、映射和编码中由GPU处理的部分。前一种方式实现起来比较简单,能有效提高处理效率。后一种方式更为侧重于实时性能,能在处理过程中根据视频和/或GPU的性能更加灵活地进行动态调整,能够更加充分合理地利用GPU资源及其强大的处理能力,同时也能够最大程度的满足全景视频的实时性要求。
Several embodiments follow:
Embodiment 1:
As shown in Fig. 2, decoding the panoramic video, mapping the decoded video data to N viewports, and encoding each of the resulting N channels of video data are all performed by the GPU; slicing and packing the encoded video streams is performed by the CPU.
Embodiment 2:
As shown in Fig. 3, decoding the panoramic video is performed by the CPU; mapping the decoded video data to N viewports and encoding each of the resulting N channels of video data are performed by the GPU; slicing and packing the encoded video streams is performed by the CPU.
Embodiment 3:
As shown in Fig. 4, decoding the panoramic video is performed by the GPU; mapping the decoded video data to M of the viewports is performed by the GPU and mapping to the remaining N-M viewports by the CPU; encoding P of the N mapped channels is performed by the GPU and encoding the remaining N-P channels by the CPU; slicing and packing the encoded video streams is performed by the CPU.
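The hybrid task assignment of Embodiment 3 can be summarized in a small sketch; the stage and processor names are descriptive labels, not an API:

```python
# Sketch of the hybrid CPU/GPU task assignment of Embodiment 3
# (stage and processor names are illustrative labels).

def assign_tasks(n: int, m: int, p: int) -> dict:
    """Map each transcoding stage to its processor(s).

    n: number of viewports/channels; m: viewports mapped on the GPU;
    p: channels encoded on the GPU.
    """
    assert m <= n and p <= n
    return {
        "decode": "gpu",
        "map": {"gpu": m, "cpu": n - m},      # viewports per processor
        "encode": {"gpu": p, "cpu": n - p},   # channels per processor
        "slice_and_pack": "cpu",
    }
```

Embodiments 1 and 2 are the special cases m = n, p = n with the decode entry set to "gpu" or "cpu" respectively.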
The above are only a few enumerated embodiments, not all feasible implementations. For the present invention, the combined CPU-and-GPU architecture is highly flexible: processing tasks can be assigned according to the processing capabilities of the GPU and the CPU, making maximal use of their resources to accelerate panoramic-video transcoding and improve real-time performance. It is suitable not only for VR on-demand systems but also meets the real-time requirements of VR live-streaming systems.
The above transcoding method can be applied to a panoramic video system, such as a VR video system. A panoramic video system consists mainly of a client and a service provider: the service provider sends panoramic video data to the client, and the client receives and plays it. The transcoding method provided by embodiments of the present invention is implemented at the service provider. Note that the service provider may reside on the server side, i.e., the server sends the panoramic video data, or on the user-equipment side: if a user device is capable of providing panoramic video data, it may act as the service provider.
The above describes the method provided by the present invention; the apparatus provided by the present invention is detailed below with reference to embodiments.
Fig. 5 is a structural diagram of an apparatus provided by an embodiment of the present invention. As shown in Fig. 5, the apparatus may include: a decoding module 01, a mapping module 02, an encoding module 03 and a slicing-and-packing module 04, where some or all of the decoding module 01, mapping module 02 and encoding module 03 are implemented by a GPU. Preferably, which of them are implemented by the GPU may be determined according to the video's attributes or the GPU's processing capability.
The decoding module 01 decodes the panoramic video, which may be obtained from a panoramic-video source or be locally stored panoramic video data.
The mapping module 02 maps the decoded video data to N viewports, yielding N channels of video data. The N viewports are preset; N is a preset positive integer, usually 2 or more. That is, one channel of panoramic video is converted into N channels of video data, each at a different viewport.
The encoding module 03 encodes each of the N channels of video data, yielding N video streams.
The slicing-and-packing module 04 slices and packs each of the N video streams.
If the decoding module 01 is implemented by the GPU, it may specifically be implemented by a hardware decoding acceleration module in the GPU. However, the GPU's processing capability is limited; if the resolution of the panoramic video exceeds it, decoding must be performed by the CPU. In embodiments of the present invention, if the resolution of the panoramic video is above the preset first threshold, the decoding module 01 is implemented by the central processing unit (CPU); otherwise, it is implemented by the GPU.
The mapping module 02 may be implemented by a general-purpose compute module in the GPU. It may happen, however, that the number of viewports to be mapped is large, and mapping the decoded video data to M of them nearly exhausts the resources of the GPU's general-purpose compute module; the mapping for the remaining N-M viewports can then be performed by the CPU. Specifically, the mapping module 02 may include a first mapping module 021 and a second mapping module 022.
The first mapping module 021 maps the decoded video data to M of the viewports, M≤N; the second mapping module 022 maps the decoded video data to the remaining N-M viewports. The first mapping module 021 is implemented by the GPU, and the second mapping module 022 by the CPU.
The value of M may be empirical or experimental. For example, experiments may establish that mapping the decoded video data to M viewports still meets the real-time requirement while mapping to M+1 does not, in which case M is taken as the number of viewports the GPU maps. Alternatively, experiments may establish that the overall processing rate is maximized when the GPU maps the decoded video data to M viewports and the CPU maps the rest, and that value of M is adopted. Other approaches are of course possible and are not enumerated exhaustively here.
Besides an empirical or experimental value, M may also be determined flexibly by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if mapping M viewports already drives the occupancy to a preset occupancy threshold, the CPU takes over the mapping of the remaining N-M viewports.
The encoding module 03 may be implemented by a hardware encoding acceleration module in the GPU. After mapping, the resolution of each channel of video data is greatly reduced and usually does not exceed the GPU's processing capability. If it nonetheless does, the CPU performs the encoding of each channel. In embodiments of the present invention, it may be determined whether the resolution of each mapped channel is above a preset second threshold, where the second threshold is the highest resolution the GPU can encode, determined by the GPU's actual processing capability.
In addition, even if the resolution of each channel is within the GPU's processing capability, it may happen that encoding P channels of video data nearly exhausts the resources of the GPU's hardware encoding acceleration module; the encoding of the remaining N-P channels can then be performed by the CPU. Specifically, the encoding module 03 may include a first encoding module 031 and a second encoding module 032.
The first encoding module 031 encodes P of the channels of video data, P≤N; the second encoding module 032 encodes the remaining N-P channels. The first encoding module 031 is implemented by the GPU, and the second encoding module 032 by the CPU.
The value of P may be empirical or experimental. For example, experiments may establish that the GPU encoding P channels still meets the real-time requirement while encoding P+1 does not, in which case P is taken as the number of channels the GPU encodes. Alternatively, experiments may establish that the overall processing rate is maximized when the GPU encodes P channels and the CPU encodes the remaining N-P channels, and that value of P is adopted. Other approaches are of course possible and are not enumerated exhaustively here.
Besides an empirical or experimental value, P may also be determined flexibly by the CPU according to the GPU's actual processing capability. For example, the CPU monitors the GPU's resource occupancy; if encoding P channels of video data already drives the occupancy to a preset occupancy threshold, the CPU takes over the encoding of the remaining N-P channels.
The above slicing-and-packing module 04 is implemented by the CPU.
The above method and apparatus provided by embodiments of the present invention may be embodied as a computer program set up and run in a device. The device may include one or more processors, plus a memory and one or more programs, as shown in Fig. 6. The one or more programs are stored in the memory and executed by the one or more processors to implement the method flows and/or apparatus operations shown in the above embodiments of the present invention. For example, the method flow executed by the one or more processors may include:
decoding a panoramic video;
mapping the decoded video data to N viewports to obtain N channels of video data, N being a preset positive integer;
encoding each of the N channels of video data to obtain N video streams;
slicing and packing each of the N video streams;
where the processors include a CPU and a GPU, and some or all of the decoding, mapping and encoding is performed by the GPU.
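The four-stage flow above can be sketched as a minimal pipeline; all stage functions here are toy placeholders standing in for real decode/map/encode/pack implementations:

```python
# Minimal skeleton of the four-stage transcoding flow (stage functions
# are placeholders, not real decoder/encoder implementations).

def transcode(panoramic_frames, n, decode, map_to_viewport, encode, slice_and_pack):
    """Decode, map to n viewports, encode per channel, then slice and pack."""
    video = [decode(f) for f in panoramic_frames]                 # one panoramic stream
    channels = [[map_to_viewport(frame, v) for frame in video]    # n viewport channels
                for v in range(n)]
    streams = [encode(ch) for ch in channels]                     # n encoded streams
    return [slice_and_pack(s) for s in streams]                   # n sliced/packed outputs
```

In the embodiments above, `decode`, `map_to_viewport` and `encode` would be dispatched to the GPU or CPU per the chosen split, while `slice_and_pack` always runs on the CPU.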
An application scenario to which embodiments of the present invention apply:
In a VR live-streaming system, the VR video server transcodes the VR panoramic video using the CPU and GPU jointly, then sends the transcoded VR video streams to VR video clients. Within the VR video server, the decoding, mapping and encoding are allocated sensibly, performed by the GPU alone or by the GPU and CPU jointly, so as to meet the real-time requirement of VR panoramic live streaming.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a logical functional division, and other divisions are possible in actual implementation.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment's solution.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, exist separately as physical units, or be integrated two or more to a unit. The integrated units may be implemented in hardware, or in hardware plus software functional units.
An integrated unit implemented as a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions for causing a computer device (a personal computer, server, network device, etc.) or a processor to execute some of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk or optical disc.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (18)
- A panoramic video transcoding method, characterized in that the method comprises: decoding a panoramic video; mapping the decoded video data to N viewports to obtain N channels of video data, N being a preset positive integer; encoding each of the N channels of video data to obtain N video streams; and slicing and packing each of the N video streams; wherein some or all of the decoding, mapping and encoding is performed by a graphics processing unit (GPU).
- The method according to claim 1, characterized in that the method further comprises: determining, according to at least the video's attributes or the GPU's processing capability, which of the decoding, mapping and encoding is handled by the GPU.
- The method according to claim 1, characterized in that, if the resolution of the panoramic video is above a preset first threshold, the decoding of the panoramic video is performed by a central processing unit (CPU); otherwise, it is performed by the GPU.
- The method according to claim 1, characterized in that mapping the decoded video data to N viewports comprises: mapping, by the GPU, the decoded video data to M of the viewports, M≤N; and mapping, by the CPU, the decoded video data to the remaining N-M viewports.
- The method according to claim 1, characterized in that encoding each of the N channels of video data comprises: encoding, by the GPU, P of the channels of video data, P≤N; and encoding, by the CPU, the remaining N-P channels of video data.
- The method according to claim 1, characterized in that the slicing and packing of each of the N video streams is performed by the CPU.
- The method according to claim 4, characterized in that M is an empirical or experimental value; or M is determined by the CPU according to the GPU's processing capability.
- The method according to claim 5, characterized in that P is an empirical or experimental value; or P is determined by the CPU according to the GPU's processing capability.
- The method according to claim 1, characterized in that, if the GPU performs the encoding, it is performed by a hardware encoding acceleration module in the GPU; if the GPU performs the mapping, it is performed by a general-purpose compute module in the GPU; and if the GPU performs the decoding, it is performed by a hardware decoding acceleration module in the GPU.
- A panoramic video transcoding apparatus, characterized in that the apparatus comprises: a decoding module for decoding a panoramic video; a mapping module for mapping the decoded video data to N viewports to obtain N channels of video data, N being a preset positive integer; an encoding module for encoding each of the N channels of video data to obtain N video streams; and a slicing-and-packing module for slicing and packing each of the N video streams; wherein some or all of the decoding module, mapping module and encoding module are implemented by a GPU.
- The apparatus according to claim 10, characterized in that which of the decoding module, mapping module and encoding module are implemented by the GPU is determined according to the video's attributes or the GPU's processing capability.
- The apparatus according to claim 10, characterized in that, if the resolution of the panoramic video is above a preset first threshold, the decoding module is implemented by a central processing unit (CPU); otherwise, it is implemented by the GPU.
- The apparatus according to claim 10, characterized in that the mapping module comprises: a first mapping module for mapping the decoded video data to M of the viewports, M≤N; and a second mapping module for mapping the decoded video data to the remaining N-M viewports; wherein the first mapping module is implemented by the GPU and the second mapping module by the CPU.
- The apparatus according to claim 10, characterized in that the encoding module comprises: a first encoding module for encoding P of the channels of video data, P≤N; and a second encoding module for encoding the remaining N-P channels of video data; wherein the first encoding module is implemented by the GPU and the second encoding module by the CPU.
- The apparatus according to claim 10, characterized in that the slicing-and-packing module is implemented by the CPU.
- The apparatus according to claim 13, characterized in that M is an empirical or experimental value; or M is determined by the CPU according to the GPU's processing capability.
- The apparatus according to claim 14, characterized in that P is an empirical or experimental value; or P is determined by the CPU according to the GPU's processing capability.
- A processing device, comprising: a graphics processing unit (GPU) and a central processing unit (CPU); a memory; and one or more programs stored in the memory and executed by the GPU or CPU to implement the following operations: decoding a panoramic video; mapping the decoded video data to N viewports to obtain N channels of video data, N being a preset positive integer; encoding each of the N channels of video data to obtain N video streams; and slicing and packing each of the N video streams; wherein some or all of the decoding, mapping and encoding is performed by the GPU.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/506,870 US11153584B2 (en) | 2017-01-09 | 2019-07-09 | Methods, apparatuses and devices for panoramic video transcoding |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710013873.XA CN108289228B (zh) | 2017-01-09 | 2017-01-09 | 一种全景视频转码方法、装置和设备 |
CN201710013873.X | 2017-01-09 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/506,870 Continuation US11153584B2 (en) | 2017-01-09 | 2019-07-09 | Methods, apparatuses and devices for panoramic video transcoding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018126975A1 true WO2018126975A1 (zh) | 2018-07-12 |
Family
ID=62789227
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/119195 WO2018126975A1 (zh) | 2017-01-09 | 2017-12-28 | 一种全景视频转码方法、装置和设备 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11153584B2 (zh) |
CN (1) | CN108289228B (zh) |
WO (1) | WO2018126975A1 (zh) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110008102B (zh) * | 2019-04-10 | 2020-03-06 | 苏州浪潮智能科技有限公司 | 一种基于智能视频应用的服务器性能测试方法和系统 |
CN112399252B (zh) * | 2019-08-14 | 2023-03-14 | 浙江宇视科技有限公司 | 软硬解码控制方法、装置及电子设备 |
CN110418144A (zh) * | 2019-08-28 | 2019-11-05 | 成都索贝数码科技股份有限公司 | 一种基于nvidia gpu实现一入多出转码多码率视频文件的方法 |
CN111031389B (zh) * | 2019-12-11 | 2022-05-20 | Oppo广东移动通信有限公司 | 视频处理方法、电子装置和存储介质 |
CN111050179B (zh) * | 2019-12-30 | 2022-04-22 | 北京奇艺世纪科技有限公司 | 一种视频转码方法及装置 |
CN111741343B (zh) * | 2020-06-17 | 2022-11-15 | 咪咕视讯科技有限公司 | 视频处理方法及装置、电子设备 |
CN112543374A (zh) * | 2020-11-30 | 2021-03-23 | 联想(北京)有限公司 | 一种转码控制方法、装置及电子设备 |
CN114202479A (zh) * | 2021-12-09 | 2022-03-18 | 北京达佳互联信息技术有限公司 | 一种视频处理方法、装置、服务器及存储介质 |
CN114860440B (zh) * | 2022-04-29 | 2023-01-10 | 北京天融信网络安全技术有限公司 | Gpu显存管理方法及装置 |
CN115129470A (zh) * | 2022-06-24 | 2022-09-30 | 杭州海康威视数字技术股份有限公司 | 编解码资源分配方法、装置及电子设备 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102036043A (zh) * | 2010-12-15 | 2011-04-27 | 成都市华为赛门铁克科技有限公司 | 视频数据处理方法、装置及视频监控系统 |
US20120002004A1 (en) * | 2010-06-30 | 2012-01-05 | Apple Inc. | Immersive Navigation and Rendering of Dynamically Reassembled Panoramas |
CN103905741A (zh) * | 2014-03-19 | 2014-07-02 | 合肥安达电子有限责任公司 | 超高清全景视频实时生成与多通道同步播放系统 |
CN105898315A (zh) * | 2015-12-07 | 2016-08-24 | 乐视云计算有限公司 | 视频转码方法和装置系统 |
CN106162207A (zh) * | 2016-08-25 | 2016-11-23 | 北京字节跳动科技有限公司 | 一种全景视频并行编码方法和装置 |
CN106210726A (zh) * | 2016-08-08 | 2016-12-07 | 成都佳发安泰科技股份有限公司 | 根据cpu与gpu的使用率对视频数据进行自适应解码的方法 |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8553028B1 (en) * | 2007-10-29 | 2013-10-08 | Julian Michael Urbach | Efficiently implementing and displaying independent 3-dimensional interactive viewports of a virtual world on multiple client devices |
US8503539B2 (en) * | 2010-02-26 | 2013-08-06 | Bao Tran | High definition personal computer (PC) cam |
US10127624B1 (en) * | 2012-12-28 | 2018-11-13 | Amazon Technologies, Inc. | Block mapping in high efficiency video coding compliant encoders and decoders |
US20150124171A1 (en) * | 2013-11-05 | 2015-05-07 | LiveStage°, Inc. | Multiple vantage point viewing platform and user interface |
CN104244019B (zh) * | 2014-09-18 | 2018-01-19 | 孙轩 | 一种全景视频影像室内分屏显示方法及显示系统 |
US10750153B2 (en) * | 2014-09-22 | 2020-08-18 | Samsung Electronics Company, Ltd. | Camera system for three-dimensional video |
US10546424B2 (en) * | 2015-04-15 | 2020-01-28 | Google Llc | Layered content delivery for virtual and augmented reality experiences |
CN105700547B (zh) * | 2016-01-16 | 2018-07-27 | 深圳先进技术研究院 | 一种基于导航飞艇的空中立体视频街景系统及实现方法 |
CN105791882B (zh) * | 2016-03-22 | 2018-09-18 | 腾讯科技(深圳)有限公司 | 视频编码方法及装置 |
US10044712B2 (en) * | 2016-05-31 | 2018-08-07 | Microsoft Technology Licensing, Llc | Authentication based on gaze and physiological response to stimuli |
US11153615B2 (en) * | 2016-06-02 | 2021-10-19 | Comet Technologies, Llc | Method and apparatus for streaming panoramic video |
US10482574B2 (en) * | 2016-07-06 | 2019-11-19 | Gopro, Inc. | Systems and methods for multi-resolution image stitching |
US10958834B2 (en) * | 2016-07-22 | 2021-03-23 | Immervision, Inc. | Method to capture, store, distribute, share, stream and display panoramic image or video |
EP3293723A3 (en) * | 2016-09-09 | 2018-08-15 | Samsung Electronics Co., Ltd. | Method, storage medium, and electronic device for displaying images |
US10244215B2 (en) * | 2016-11-29 | 2019-03-26 | Microsoft Technology Licensing, Llc | Re-projecting flat projections of pictures of panoramic video for rendering by application |
US10354690B2 (en) * | 2016-12-19 | 2019-07-16 | Modulus Media Systems, Inc. | Method for capturing and recording high-definition video and audio output as broadcast by commercial streaming service providers |
US11054886B2 (en) * | 2017-04-01 | 2021-07-06 | Intel Corporation | Supporting multiple refresh rates in different regions of panel display |
US10699364B2 (en) * | 2017-07-12 | 2020-06-30 | Citrix Systems, Inc. | Graphical rendering using multiple graphics processors |
-
2017
- 2017-01-09 CN CN201710013873.XA patent/CN108289228B/zh active Active
- 2017-12-28 WO PCT/CN2017/119195 patent/WO2018126975A1/zh active Application Filing
-
2019
- 2019-07-09 US US16/506,870 patent/US11153584B2/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120002004A1 (en) * | 2010-06-30 | 2012-01-05 | Apple Inc. | Immersive Navigation and Rendering of Dynamically Reassembled Panoramas |
CN102036043A (zh) * | 2010-12-15 | 2011-04-27 | 成都市华为赛门铁克科技有限公司 | 视频数据处理方法、装置及视频监控系统 |
CN103905741A (zh) * | 2014-03-19 | 2014-07-02 | 合肥安达电子有限责任公司 | 超高清全景视频实时生成与多通道同步播放系统 |
CN105898315A (zh) * | 2015-12-07 | 2016-08-24 | 乐视云计算有限公司 | 视频转码方法和装置系统 |
CN106210726A (zh) * | 2016-08-08 | 2016-12-07 | 成都佳发安泰科技股份有限公司 | 根据cpu与gpu的使用率对视频数据进行自适应解码的方法 |
CN106162207A (zh) * | 2016-08-25 | 2016-11-23 | 北京字节跳动科技有限公司 | 一种全景视频并行编码方法和装置 |
Also Published As
Publication number | Publication date |
---|---|
US20190335187A1 (en) | 2019-10-31 |
CN108289228A (zh) | 2018-07-17 |
CN108289228B (zh) | 2020-08-28 |
US11153584B2 (en) | 2021-10-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018126975A1 (zh) | 一种全景视频转码方法、装置和设备 | |
US20190325652A1 (en) | Information Processing Method and Apparatus | |
US8406290B2 (en) | User sensitive information adaptive video transcoding framework | |
WO2017219896A1 (zh) | 视频流的传输方法及装置 | |
JP2009506456A5 (zh) | ||
US11528308B2 (en) | Technologies for end of frame detection in streaming content | |
CN110662100A (zh) | 一种信息处理方法、装置、系统和计算机可读存储介质 | |
US11418567B2 (en) | Media data transmission method, client, and server | |
TWI806479B (zh) | 點雲編解碼方法、裝置、電腦可讀介質及電子設備 | |
US11818382B2 (en) | Temporal prediction shifting for scalable video coding | |
EP3657316A1 (en) | Method and system for displaying virtual desktop data | |
WO2023040825A1 (zh) | 媒体信息的传输方法、计算设备及存储介质 | |
WO2024041239A1 (zh) | 一种沉浸媒体的数据处理方法、装置、设备、存储介质及程序产品 | |
KR102299615B1 (ko) | 컨텐트 분산 네트워크들에서 엠펙 미디어 전송 통합을 위한 방법 및 장치 | |
CN110868610B (zh) | 流媒体传输方法、装置、服务器及存储介质 | |
Timmerer et al. | Adaptive streaming of vr/360-degree immersive media services with high qoe | |
CN116366865A (zh) | 一种视频解码方法、装置、电子设备及介质 | |
US10893303B1 (en) | Streaming chunked media segments | |
KR102153554B1 (ko) | 미디어 데이터의 처리를 위한 mmt 장치 및 방법 | |
US20240129537A1 (en) | Method and apparatus for signaling cmaf switching sets in isobmff | |
WO2018120474A1 (zh) | 一种信息的处理方法及装置 | |
CN102355491A (zh) | 基于云计算的数字校园音视频分发系统和方法 | |
US20240089460A1 (en) | Scene-change detection at server for triggering client-side temporal frame buffer reset | |
US20230217041A1 (en) | Apparatus for transmitting 3d contents and method thereof | |
US11025969B1 (en) | Video packaging system using source encoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17890518 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17890518 Country of ref document: EP Kind code of ref document: A1 |