CN112153401B - Video processing method, communication device and readable storage medium - Google Patents


Info

Publication number
CN112153401B
CN112153401B (application CN202011002978.3A)
Authority
CN
China
Prior art keywords
video stream
fov
target video
pts
terminal device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011002978.3A
Other languages
Chinese (zh)
Other versions
CN112153401A (en)
Inventor
金晶
王琦
李康敬
陶嘉伟
潘兴浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Video Technology Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Video Technology Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202011002978.3A
Publication of CN112153401A
Application granted
Publication of CN112153401B
Legal status: Active

Classifications

    • H04N 21/2187: Live feed
    • H04N 19/114: Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
    • H04N 21/2343: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/23602: Multiplexing isochronously with the video sync, e.g. according to bit-parallel or bit-serial interface formats, as SDI
    • H04N 21/2362: Generation or processing of Service Information [SI]
    • H04N 21/2393: Interfacing the upstream path of the transmission network involving handling client requests
    • H04N 21/4345: Extraction or processing of SI, e.g. extracting service information from an MPEG stream
    • H04N 21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/6437: Real-time Transport Protocol [RTP]
    • H04N 21/8456: Structuring of content by decomposing the content in the time domain, e.g. in time segments

Abstract

The invention provides a video processing method, a communication device and a readable storage medium, to solve the problem that existing VR user view-angle data synchronization methods increase the computational load of VR terminal devices. The method comprises: receiving a first request sent by a first terminal device, wherein the first request is used for requesting a first target video stream of a first FOV, and the first FOV is the field angle corresponding to a second terminal device; receiving first FOV track information of a second target video stream sent by the second terminal device; obtaining the first target video stream of the first FOV according to the first FOV track information; and transmitting the first target video stream of the first FOV to the first terminal device. The first terminal device can thus obtain the video stream of the field angle of the second terminal device, while the second terminal device only needs to transmit the first FOV track information to the network device and does not need to compute and transmit a live data stream, which effectively reduces the computational load of the VR terminal device.

Description

Video processing method, communication device and readable storage medium
Technical Field
Embodiments of the invention relate to the technical field of multimedia communication, and in particular to a video processing method, a communication device and a readable storage medium.
Background
In a live Virtual Reality (VR) scene, when another device wants to see the live data stream of the moving view of one user's terminal, the rendered data is generally captured directly on that terminal's VR device through a video image mapping algorithm such as the spherical-texture-to-plane mapping method, compressed again with an algorithm such as H264/H265, pushed to a back-end server through the Real-Time Messaging Protocol (RTMP) or another protocol, and the media stream is then sent to the other terminals synchronously. Alternatively, the terminal VR device synchronizes the rendered data directly: it compresses the audio and video data again with an algorithm such as H264/H265 and sends them to the other devices for playback over an IP, Bluetooth or High-Definition Multimedia Interface (HDMI) connection.
Such VR user view-angle data synchronization methods greatly increase the computational load of the VR terminal device and, for the same battery capacity, shorten the battery life of the VR device.
Disclosure of Invention
The embodiment of the invention provides a video processing method, a communication device and a readable storage medium, to solve the problem that existing VR user view-angle data synchronization methods increase the computational load of VR terminal devices.
In a first aspect, an embodiment of the present invention provides a video processing method, applied to a network device, including:
receiving a first request sent by a first terminal device, wherein the first request is used for requesting a first target video stream of a first field angle FOV, and the first FOV is the field angle corresponding to a second terminal device;
receiving first FOV track information of a second target video stream sent by the second terminal device, wherein the first target video stream is at least part of the second target video stream;
obtaining the first target video stream of the first FOV according to the first FOV track information;
and transmitting the first target video stream of the first FOV to the first terminal device.
Optionally, the first FOV track information includes: an identifier of the second target video stream, an identifier of the second terminal device, and FOV coordinate information corresponding to the display time stamps (PTS) of N image frames, where N is the total number of image frames in the second target video stream.
Optionally, before receiving the first request sent by the first terminal device, the method further includes:
acquiring a second target video stream;
and acquiring each group of pictures (GOP) in the second target video stream, the correspondence between each GOP and the display time stamps (PTS) of its image frames, and the correspondence between the PTS and the recording time Tr of each image frame in the second target video stream.
Optionally, the obtaining, according to first FOV track information of a second target video stream sent by the second terminal device, a first target video stream of the first FOV includes:
determining a starting PTS of the first target video stream of the first FOV;
in the second target video stream, sequentially reading the group-of-pictures data corresponding to each image frame, from the starting PTS up to the ending PTS;
restoring the image data of each image frame in the group-of-pictures data according to the PTS of each image frame in the second target video stream;
and processing the image data of each image frame according to the first FOV track information to obtain the first target video stream of the first FOV.
Optionally, in a case that the first request is used to request to acquire a first target video stream of a first FOV in real time, the starting PTS is a starting PTS of a latest GOP, the ending PTS is a last PTS in the second target video stream, and the latest GOP is a GOP corresponding to a latest PTS in first FOV track information uploaded by the second terminal device;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV which is played from a first time, the starting PTS is a PTS corresponding to the first time in the second target video stream, and the ending PTS is the last PTS in the second target video stream;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV between a second time and a third time, the starting PTS is a PTS corresponding to the second time in the second target video stream, and the ending PTS is a PTS corresponding to the third time in the second target video stream.
Optionally, the sending the first target video stream of the first FOV to the first terminal device includes:
and sending the first target video stream of the first FOV to the first terminal device through a Content Delivery Network (CDN).
According to another aspect of the present invention, there is provided a video processing method applied to a terminal device, including:
acquiring a second target video stream;
analyzing the second target video stream to obtain first FOV track information of the second target video stream;
and uploading the first FOV track information to a network device.
Optionally, the analyzing the second target video stream to obtain the first FOV track information of the second target video stream includes:
analyzing the second target video stream to obtain the PTS of each image frame;
acquiring FOV coordinate information corresponding to the PTS of each image frame;
and obtaining first FOV track information of the second target video stream according to the FOV coordinate information corresponding to the PTS of each image frame.
Optionally, the first FOV track information includes: and the identification of the second target video stream, the identification of the second terminal equipment and FOV coordinate information corresponding to the PTS of the N image frames, wherein N is the total number of the image frames in the second target video stream.
Optionally, the uploading the first FOV track information to a network device includes:
uploading the first FOV track information to the network device in a preset byte stream structure;
wherein the preset byte stream structure comprises data header information and a set of data body information, the data header information comprises the identifier of the second target video stream and the identifier of the second terminal device, and each piece of data body information comprises the PTS of an image frame and the FOV coordinate information corresponding to that PTS.
In accordance with another aspect of the present invention, there is provided a network device, including: a processor, a memory and a computer program stored on said memory and executable on said processor, said computer program realizing the steps of the video processing method as described above when executed by said processor.
According to still another aspect of the present invention, there is provided a terminal device including: a processor, a memory and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the video processing method as described above.
According to a further aspect of the present invention, a computer-readable storage medium is provided, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the video processing method as set forth above.
In the embodiment of the invention, the first target video stream of the first FOV can be obtained according to the first FOV track information of the second target video stream sent by the second terminal device, and the first target video stream is at least a part of the second target video stream, so that the first terminal device can obtain the video stream of the field angle of the second terminal device.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive labor.
Fig. 1 is a schematic flow chart of a video processing method according to an embodiment of the present invention;
FIG. 2 shows a schematic coordinate diagram of a first FOV in an embodiment of the invention;
fig. 3 is a schematic diagram illustrating interaction between a server and a terminal device according to an embodiment of the present invention;
FIG. 4 is a second flowchart illustrating a video processing method according to an embodiment of the invention;
FIG. 5 is a diagram illustrating header information in an embodiment of the invention;
FIG. 6 is a diagram illustrating data body information in an embodiment of the invention;
FIG. 7 is a diagram illustrating header information and body information in an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present invention;
fig. 9 is a schematic diagram illustrating an implementation structure of a network device according to an embodiment of the present invention;
fig. 10 is a second schematic structural diagram of a video processing apparatus according to an embodiment of the invention;
fig. 11 is a schematic diagram of an implementation structure of a terminal device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments. In the following description, specific details are provided, such as specific configurations and components, merely to facilitate a thorough understanding of embodiments of the invention. It will therefore be apparent to those skilled in the art that various changes and modifications can be made in the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention. In addition, the terms "system" and "network" are often used interchangeably herein.
As shown in fig. 1, an embodiment of the present invention provides a video processing method, which is applied to a network device, where the network device may specifically be a server, and the method includes the following steps:
step 101: receiving a first request sent by a first terminal device, wherein the first request is used for requesting to acquire a first target video stream of a first field angle FOV, and the first FOV is a field angle corresponding to a second terminal device.
In this step, the first terminal device may send the first request to the network device through the CDN, so as to obtain a first target video stream of the FOV of the second terminal device.
Step 102: receiving first FOV track information of a second target video stream sent by the second terminal device, wherein the first target video stream is at least a partial video stream of the second target video stream.
The first FOV track information indicates position information, such as coordinate information, of the image frames in the second target video stream under the field angle of the second terminal device.
The first target video stream may be the whole of the second target video stream, that is, the first target video stream is the same as the second target video stream; or it may be a partial video stream of the second target video stream, for example the video stream within a certain time period of the second target video stream.
The first target video stream and the second target video stream have the same identification.
It should be noted that there is no restriction on the order between the step 102 and the step 101, that is, the step 102 may be executed first and then the step 101 is executed, or the step 101 may be executed first and then the step 102 is executed.
Step 103: and obtaining a first target video stream of the first FOV according to the first FOV track information.
Here, the network device obtains the first target video stream of the first FOV by combining the second target video stream according to the first FOV track information sent by the second terminal device.
Step 104: transmitting the first target video stream of the first FOV to the first terminal device.
It should be noted that, in the embodiment of the present invention, the first target video stream may be transmitted in real time, for example, a part of video data of the first target video stream is obtained according to the first FOV track information, and then the obtained part of video data is sent to the first terminal device, without obtaining a complete first target video stream and then sending the complete first target video stream to the first terminal device.
According to the video processing method provided by the embodiment of the invention, the first target video stream of the first FOV can be obtained according to the first FOV track information of the second target video stream sent by the second terminal device, and the first target video stream is at least part of the second target video stream, so that the first terminal device can obtain the video stream of the field angle of the second terminal device.
Optionally, the first FOV track information includes: an identifier of the second target video stream, an identifier of the second terminal device, and FOV coordinate information corresponding to the display time stamps (PTS) of N image frames, where N is the total number of image frames in the second target video stream.
In the embodiment of the invention, the second terminal device plays the second target video stream through the VR player, renders a corresponding picture in an FOV mode, and simultaneously constructs a corresponding relation among the identifier of the second target video stream, the identifier of the second terminal device and FOV coordinate information corresponding to the display time stamps PTS of the N image frames to obtain the first FOV track information.
Specifically, the VR player of the second terminal device obtains the playing address of the second target video stream (for example, a target URL that returns a ts list, after which the player keeps requesting ts segment data), thereby obtaining the back-end VR live real-time data stream (the second target video stream), and parses the PTS of each frame (I frame, B frame and P frame), T_pts(m). At the same time it restores the FOV view information from the viewing angle of the spherical-texture-to-plane mapping method. As shown in fig. 2, after decoding it obtains, for time T_pts(m), the left position of the image shift Vp_x(T_pts(m)) (the distance from the left edge of the full stream picture), the bottom position Vp_y(T_pts(m)) (the distance from the bottom edge of the full stream picture), the FOV view height Vp_h(T_pts(m)) (the height of the FOV picture marked in fig. 2, i.e. the distance between the bottom edge and the top edge of the first FOV picture) and the FOV view width Vp_w(T_pts(m)) (the width of the first FOV picture marked in fig. 2, i.e. the distance between the left edge and the right edge of the first FOV picture). It also records the playing frame m at that moment, and constructs the correspondence between ID (the identifier of the second terminal device), S (the identifier of the second target video stream) and the FOV coordinate information for the PTS of each image frame, obtaining the first FOV track information:
Qp(ID, S, n) = { (T_pts(m), Vp_x(T_pts(m)), Vp_y(T_pts(m)), Vp_h(T_pts(m)), Vp_w(T_pts(m))) | m = 1, 2, ..., n }
The second terminal device uploads this to the back-end server as a byte stream. After a long-lived connection is established between the second terminal device and the back-end server, S and SID are pushed first, and then the Qp-structured byte stream is pushed continuously in real time, sending the first FOV track information to the server. The server parses the S and SID user attributes, then binds the relation between the subsequent spliced stream and S and SID on the basis of the long-lived connection, and builds a new local index, whose relation is as follows:
the data structure is stored under the file name S_SID, and
S_SID -> { (T_pts(m), Vp_x(T_pts(m)), Vp_y(T_pts(m)), Vp_h(T_pts(m)), Vp_w(T_pts(m))) | m = 1, 2, ..., n }
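As an illustrative sketch only (not part of the patent text), the Qp record above can be modeled as follows; the Python class and field names are assumptions chosen for illustration:

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class FovSample:
    """FOV coordinates of one image frame, keyed by its PTS T_pts(m)."""
    vp_x: float  # left position: distance from the left edge of the full picture
    vp_y: float  # bottom position: distance from the bottom edge of the full picture
    vp_h: float  # FOV view height
    vp_w: float  # FOV view width

@dataclass
class FovTrack:
    """Qp(ID, S, n): the first FOV track information."""
    sid: str     # ID: identifier of the second terminal device
    stream: str  # S: identifier of the second target video stream
    samples: Dict[int, FovSample] = field(default_factory=dict)  # T_pts(m) -> coordinates

    def record(self, t_pts: int, x: float, y: float, h: float, w: float) -> None:
        """Called once per rendered frame m while the VR player plays the stream."""
        self.samples[t_pts] = FovSample(x, y, h, w)
```

A player calling record(...) once per rendered frame produces exactly the (T_pts(m), Vp_x, Vp_y, Vp_h, Vp_w) tuples of the Qp set.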
optionally, in this embodiment of the present invention, before receiving the first request sent by the first terminal device, the method further includes:
acquiring a second target video stream;
and acquiring each group of pictures (GOP) in the second target video stream, the correspondence between each GOP and the display time stamps (PTS) of its image frames, and the correspondence between the PTS and the recording time Tr of each image frame in the second target video stream.
In this embodiment of the application, the network device (in the cloud) collects and records the VR live stream in real time. Specifically, taking the real-time live stream of the VR live environment as input, the server re-partitions and re-encodes the video, packs it, and synthesizes one ultra-high-definition VR live stream. The VR real-time live stream is time-synchronized with an NTP server. The VR live recording server parses the data set D of each GOP and stores it, and at the same time parses the name S of the live stream, the PTS time T_pts of each frame, and the recording time Tr of each frame. The data set D contains the data of each GOP (a group of I, B and P frames). D maps each frame to the set of all frame data of the GOP that contains it, and this GOP frame set is denoted A, so A = D(T_pts); that is, the group of media data of the GOP at time T_pts(i) is A(T_pts(i)). All media of the GOP's I, B and P frames are recorded in the set A. Because each frame of data in a GOP has a different PTS, the following relation is established:
D(T_pts(x0)) = D(T_pts(x0+1)) = D(T_pts(x0+2)) = ... = D(T_pts(x0+x)) = A(T_pts(x0));
where T_pts(x0) represents the PTS of the first frame of the GOP and T_pts(x0+x) represents the PTS of the last frame of the GOP. That is, from any T_pts the corresponding media data A can be obtained; every image frame in a GOP corresponds to the same group of media data. For example, the media data corresponding to each image frame in GOP0 is A0.
In the embodiment of the invention, the relation between Tr(x) and T_pts(x) is also recorded, and relation II, the VR live recording set, is constructed:
Qr(S, n) = { (Tr(i), T_pts(i)) | i = 1, 2, ..., n }
where: i denotes the picture frame corresponding to the current recording time, counting from 1 at the start of recording; n denotes that there are n recorded picture frames, and n grows as recording proceeds; Qr(S, n) denotes the recorded set of n picture frames of stream S; T_pts(i) is the PTS of the i-th frame, and from T_pts(i) the media data A can be derived according to relation I above.
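A minimal sketch of relations I and II follows, reusing the FovTrack sketch above; the class names (GopMedia, RecordingIndex) and the choice of containers are assumptions for illustration, not structures defined by the patent:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class GopMedia:
    """A(T_pts(x0)): the media data of one GOP (a group of I, B and P frames)."""
    start_pts: int
    frames: Dict[int, bytes]  # T_pts of each frame in the GOP -> encoded frame payload

class RecordingIndex:
    """Relations I and II for one recorded live stream S."""

    def __init__(self, stream: str):
        self.stream = stream                   # S: name of the live stream
        self.d: Dict[int, GopMedia] = {}       # relation I: any frame PTS -> its GOP set A
        self.qr: List[Tuple[float, int]] = []  # relation II Qr(S, n): (Tr(i), T_pts(i))

    def add_gop(self, gop: GopMedia, rec_time: Dict[int, float]) -> None:
        """Store one parsed GOP: every frame PTS in it maps to the same set A."""
        for t_pts in sorted(gop.frames):
            self.d[t_pts] = gop                # D(T_pts(x0)) = ... = A(T_pts(x0))
            self.qr.append((rec_time[t_pts], t_pts))
```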
Optionally, the obtaining, according to first FOV track information of a second target video stream sent by the second terminal device, a first target video stream of the first FOV includes:
determining a starting PTS of the first target video stream of the first FOV;
in the second target video stream, sequentially reading the group-of-pictures data corresponding to each image frame, from the starting PTS up to the ending PTS;
restoring the image data of each image frame in the group-of-pictures data according to the PTS of each image frame in the second target video stream;
and processing the image data of each image frame according to the first FOV track information to obtain the first target video stream of the first FOV.
In the embodiment of the invention, according to the first request, the network device determines the starting PTS of the first target video stream of the first FOV. Then, according to relation I above, it sequentially reads, in the second target video stream, the group-of-pictures data corresponding to each image frame from the starting PTS up to the ending PTS, and restores the image data of each image frame in the group-of-pictures data according to the PTS of each frame. The audio data is extracted from the original file's audio data. The image data is processed according to the first FOV track information, for example by the spherical-texture-to-plane mapping method, and the corresponding FOV data is cropped out. After a new GOP is buffered (by default its size is the same as the size of the corresponding original GOP), the new GOP is compressed again with H264/H265, and an RTMP stream is output in real time at PTS intervals and accelerated to the first terminal device through a CDN network.
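The pipeline just described can be sketched as follows, reusing RecordingIndex and FovTrack from the sketches above. The helper callables (decode_frame, crop_fov, encode_gop, push_rtmp) stand in for the media framework, and the way the next GOP's starting PTS is derived is likewise an assumption:

```python
from typing import Any, Callable, List

def build_fov_stream(index: "RecordingIndex", track: "FovTrack",
                     start_pts: int, end_pts: int,
                     decode_frame: Callable[["GopMedia", int], Any],
                     crop_fov: Callable[[Any, "FovSample"], Any],
                     encode_gop: Callable[[List[Any]], bytes],
                     push_rtmp: Callable[[bytes], None]) -> None:
    """Read GOP data from start_pts up to end_pts, restore each frame,
    crop the FOV region per the track info, re-encode and emit the new GOPs."""
    pts = start_pts
    while pts <= end_pts and pts in index.d:
        gop = index.d[pts]                    # the GOP set A containing this PTS
        clipped: List[Any] = []
        for t_pts in sorted(gop.frames):
            if not start_pts <= t_pts <= end_pts:
                continue
            image = decode_frame(gop, t_pts)  # restore the frame image from A by its PTS
            clipped.append(crop_fov(image, track.samples[t_pts]))
        # the new GOP keeps the size of the corresponding original GOP by default;
        # it is re-compressed (e.g. H264/H265) and emitted at PTS intervals
        push_rtmp(encode_gop(clipped))
        pts = max(gop.frames) + 1             # assumed: the next GOP starts at the next PTS
```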
Optionally, when the first request is used to request to obtain a first target video stream of a first FOV in real time, the starting PTS is a starting PTS of a latest GOP, the ending PTS is a last PTS in the second target video stream, and the latest GOP is a GOP corresponding to a latest PTS in first FOV track information uploaded by the second terminal device;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV which is played from a first time, the starting PTS is a PTS corresponding to the first time in the second target video stream, and the ending PTS is the last PTS in the second target video stream;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV between a second time and a third time, the starting PTS is a PTS corresponding to the second time in the second target video stream, and the ending PTS is a PTS corresponding to the third time in the second target video stream.
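The three cases above can be sketched as follows. The text leaves it ambiguous whether the requested times live on the recording timeline or the PTS timeline; this sketch assumes the PTS timeline, matching the inequalities given for scenes 2 and 3 below, and all names are illustrative:

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

def pts_for_time(qr: List[Tuple[float, int]], t: float, end: bool = False) -> int:
    """Take the left value T_pts(i) where T_pts(i) <= t < T_pts(i+1);
    for an end time, take the right value T_pts(i+1)."""
    pts_list = [p for _, p in qr]              # T_pts(i) in recording order
    i = max(bisect_right(pts_list, t) - 1, 0)
    if end and i + 1 < len(pts_list):
        return pts_list[i + 1]
    return pts_list[i]

def select_pts_range(kind: str, index: "RecordingIndex", track: "FovTrack",
                     t1: Optional[float] = None, t2: Optional[float] = None,
                     t3: Optional[float] = None) -> Tuple[int, int]:
    last_pts = index.qr[-1][1]                 # last PTS in the second target video stream
    if kind == "realtime":                     # case 1: latest uploaded GOP onward
        latest_gop = index.d[max(track.samples)]
        return latest_gop.start_pts, last_pts
    if kind == "from_time":                    # case 2: from the first time onward
        return pts_for_time(index.qr, t1), last_pts
    if kind == "interval":                     # case 3: second time to third time
        return pts_for_time(index.qr, t2), pts_for_time(index.qr, t3, end=True)
    raise ValueError(kind)
```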
Optionally, in this embodiment of the application, the sending the first target video stream of the first FOV to the first terminal device includes:
and sending the first target video stream of the first FOV to the first terminal device through a Content Delivery Network (CDN).
Here, the CDN may distribute the first target video stream of the first FOV to the plurality of terminal apparatuses, thereby providing a capability of distributing the synchronized video stream to the plurality of terminal apparatuses.
The following describes a video processing method according to an embodiment of the present invention with reference to a specific application scenario.
As shown in fig. 3, assume that a user K (ID 123) sends an original request for a video stream whose name S is live1. According to the request, the network device constructs the media data A of each GOP of stream S and the PTS of each frame. User K acquires the VR live stream S through HLS. User K renders the view corresponding to the first FOV on his own VR device and uploads live1 and 123, the PTS information of each frame, and the first FOV coordinate information; the network device keeps recording the FOV coordinate information and PTS information of the S stream requested by user K. When other users (e.g. users K1, K2, ..., Kn) initiate a request, the network device provides the live1 stream of the first FOV to them based on the coordinate information of the first FOV and so on. After CDN acceleration, the other users obtain the stream information corresponding to S and SID of user K and render, in real time and according to the PTS, the motion track exhibited by user SID.
Scene 1: the K1 user requests a stream of view trajectories for the K user FOVs.
From the first request, the server parses the stream name S (live1 in this example) and the second terminal device ID, i.e. the SID (123 in this example). According to the user view track set Qp(n) recorded by the VR recording server, the server continuously reads stream S and the SID through the live1_123 index. The server side obtains the starting PTS time of the latest GOP, looks up the GOP information and the view coordinate position information corresponding to that PTS in the live1_123 index file, and reads and parses the A media data of each frame, frame by frame. From the A media data it restores the media data of the corresponding frame according to the PTS, extracts the audio data from the original file's audio data, and crops out the data corresponding to the FOV according to the spherical-texture-to-plane mapping method and the like. After a new GOP1 is buffered (by default the size of GOP1 is the same as the original GOP), it is compressed again with H264/H265, and an RTMP stream is output in real time at PTS intervals and accelerated to user K1 through the CDN.
Scene 2: the K2 user requests the K user to specify a time to begin viewing the user's FOV data.
According to the first request, the server parses the stream name S (live1 in this example), the ID of the second terminal device, i.e. the SID (123 in this example), and the play start time T1. It acquires the corresponding PTS from the Qr(S, n) data according to live1 and T1: if T_pts(i) <= T1 < T_pts(i+1), the left value T_pts(i) is taken.
According to T_pts(i) and the live1_123 index file, the recording server looks up the GOP information and view coordinate position information corresponding to the PTS, starts reading the A media data frame by frame and parses it, restores the media data of the corresponding frame according to the PTS, extracts the audio data from the original file's audio data, and crops the data corresponding to the FOV according to the spherical-texture-to-plane mapping method and the like. After the new GOP1 is buffered (by default the same size as the original GOP), it is compressed again with H264/H265, and the RTMP stream is output to user K2 in real time at PTS intervals through the CDN network.
Scene 3: the K3 user requests the K user to view the FOV data for the user for a specified period of time.
According to the first request, the streaming media server parses the stream name S (live1 in this example), the SID (123 in this example), the start playing time t1 and the end playing time t2. It first obtains the corresponding PTS1 and PTS2 from the Qr(S, n) data according to live1, t1 and t2:
if T_pts(i) <= t1 < T_pts(i+1), the left value T_pts(i) is taken;
if T_pts(j) <= t2 < T_pts(j+1), the right value T_pts(j+1) is taken.
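A toy check of the left-value/right-value rule above, reusing the pts_for_time sketch given earlier (all numbers invented):

```python
# Qr(S, n) as (Tr(i), T_pts(i)) pairs; both columns are invented for illustration.
qr = [(10.00, 1000), (10.04, 1040), (10.08, 1080), (10.12, 1120)]
assert pts_for_time(qr, 1050) == 1040            # start time t1: left value T_pts(i)
assert pts_for_time(qr, 1050, end=True) == 1080  # end time t2: right value T_pts(j+1)
```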
According to T_pts(i), T_pts(j+1) and the live1_123 index file, the recording server locates the starting GOP information, the ending GOP information and the view coordinate position information corresponding to the PTS, starts reading the A media data frame by frame and parses it, restores the media data of the corresponding frame according to the PTS, extracts the audio data from the original file's audio data, and crops the data of the corresponding FOV according to the spherical-texture-to-plane mapping method and the like. After the new GOP1 is buffered (by default the same size as the original GOP), it is compressed again with H264/H265, and the RTMP stream is output to user K3 in real time at PTS intervals through the CDN network, until the ending GOP information has been read, at which point the output ends.
In the embodiment of the application, the capability of the server back end is fully utilized for real-time FOV live sharing, which reduces the upload bandwidth of the second terminal device and the consumption caused by its secondary encoding computation. Handling this uniformly in the cloud in real time reduces the amount of computation on the terminal device side.
According to the video processing method provided by the embodiment of the invention, the first target video stream of the first FOV can be obtained according to the first FOV track information of the second target video stream sent by the second terminal device, and the first target video stream is at least part of the second target video stream, so that the first terminal device can obtain the video stream of the field angle of the second terminal device.
As shown in fig. 4, an embodiment of the present application further provides a video processing method, which is applied to a terminal device, where the terminal device may specifically be the second terminal device, and the method includes:
step 401: a second target video stream is obtained.
Here, the second target video stream is the VR live stream collected and recorded by the network device in real time. The network device sends the collected and recorded VR live stream to the second terminal device.
Step 402: and analyzing the second target video stream to obtain first FOV track information of the second target video stream.
The VR player of the second terminal device obtains the playing address of the second target video stream (for example, a target URL that returns a ts list, after which the player keeps requesting ts segment data), thereby obtaining the back-end VR live real-time data stream (the second target video stream), and parses the PTS of each frame (I frame, B frame and P frame), T_pts(m). At the same time it restores the FOV view information from the viewing angle of the spherical-texture-to-plane mapping method; after decoding it obtains, for that time, the left position of the image shift Vp_x(T_pts(m)), the bottom position Vp_y(T_pts(m)), the FOV view height Vp_h(T_pts(m)) and the FOV view width Vp_w(T_pts(m)), records the playing frame m at that moment, and constructs the correspondence between ID (the identifier of the second terminal device), S (the identifier of the second target video stream) and the FOV coordinate information for the PTS of each image frame, obtaining the first FOV track information:
Qp(ID, S, n) = { (T_pts(m), Vp_x(T_pts(m)), Vp_y(T_pts(m)), Vp_h(T_pts(m)), Vp_w(T_pts(m))) | m = 1, 2, ..., n }
In this step, when the second terminal device plays the video stream at the first field angle, the user ID's request for the S stream address generates the Qp(ID, S, n) set in real time, so as to form the first FOV track information.
Step 403: and uploading the first FOV track information to a network device.
The network device may be a cloud server. Here, the first FOV track information may be uploaded to the network device in a byte stream.
According to the video processing method, the first FOV track information of the second target video stream is uploaded to the network equipment, and the network equipment can obtain the first target video stream of the first FOV according to the first FOV track information, so that the first terminal equipment can obtain the video stream of the field angle of the second terminal equipment.
Optionally, the analyzing the second target video stream to obtain the first FOV track information of the second target video stream includes:
analyzing the second target video stream to obtain the PTS of each image frame;
acquiring FOV coordinate information corresponding to the PTS of each image frame;
and obtaining first FOV track information of the second target video stream according to the FOV coordinate information corresponding to the PTS of each image frame.
In the embodiment of the invention, the second terminal device plays the second target video stream through the VR player, renders a corresponding picture in an FOV mode, and simultaneously constructs a corresponding relation among the identifier of the second target video stream, the identifier of the second terminal device and FOV coordinate information corresponding to the display time stamps PTS of the N image frames to obtain the first FOV track information.
Specifically, the VR player of the second terminal device obtains the playing address of the second target video stream (for example, a target URL that returns a ts list, after which the player keeps requesting ts segment data), thereby obtaining the back-end VR live real-time data stream (the second target video stream), and parses the PTS of each frame (I frame, B frame and P frame), T_pts(m). At the same time it restores the FOV view information from the viewing angle of the spherical-texture-to-plane mapping method. As shown in fig. 2, after decoding it obtains, for time T_pts(m), the left position of the image shift Vp_x(T_pts(m)) (the distance from the left edge of the full stream picture), the bottom position Vp_y(T_pts(m)) (the distance from the bottom edge of the full stream picture), the FOV view height Vp_h(T_pts(m)) (the height of the FOV picture marked in fig. 2, i.e. the distance between the bottom edge and the top edge of the first FOV picture) and the FOV view width Vp_w(T_pts(m)) (the width of the first FOV picture marked in fig. 2, i.e. the distance between the left edge and the right edge of the first FOV picture). It also records the playing frame m at that moment, and constructs the correspondence between ID (the identifier of the second terminal device), S (the identifier of the second target video stream) and the FOV coordinate information for the PTS of each image frame, obtaining the first FOV track information:
Qp(ID, S, n) = { (T_pts(m), Vp_x(T_pts(m)), Vp_y(T_pts(m)), Vp_h(T_pts(m)), Vp_w(T_pts(m))) | m = 1, 2, ..., n }
optionally, the first FOV track information includes: and the identification of the second target video stream, the identification of the second terminal equipment and FOV coordinate information corresponding to the PTS of the N image frames, wherein N is the total number of the image frames in the second target video stream.
The first FOV track information is described in detail in the embodiment on the network device side, and is not described herein again.
Optionally, the uploading the first FOV track information to a network device includes:
uploading the first FOV track information to network equipment according to a preset byte stream structure;
wherein the preset byte stream structure comprises data header information and a set of data body information, the data header information comprises the identifier of the second target video stream and the identifier of the second terminal device, and each piece of data body information comprises the PTS of an image frame and the FOV coordinate information corresponding to that PTS.
As shown in fig. 5, the data header information includes: an FOV structure flag bit, an extension flag bit, the total length, the SID, and the stream name (S). As shown in fig. 6, the data body information includes: FOV structure flag bits, extension flag bits, the current packet length, the timestamp Tp, the frame PTS, Vx, Vy, the view height (i.e. Vp_h(T_pts(m))) and the view width (Vp_w(T_pts(m))). Fig. 7 shows the combination of the header information and a set of body information; the header information appears once every 1 second.
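A sketch of this byte-stream layout follows. Figs. 5 to 7 name the fields but not their widths, so the field sizes, flag values and the length prefixes on the SID and stream name used here are assumptions for illustration only:

```python
import struct

def pack_header(sid: str, stream: str) -> bytes:
    """Data header: FOV structure flag, extension flag, total length, SID, stream name S."""
    sid_b, s_b = sid.encode("utf-8"), stream.encode("utf-8")
    body = struct.pack("!B", len(sid_b)) + sid_b + struct.pack("!B", len(s_b)) + s_b
    # flag values (0x01/0x00) and the 4-byte total length are assumed widths
    return struct.pack("!BBI", 0x01, 0x00, 6 + len(body)) + body

def pack_body(tp: float, pts: int, vx: float, vy: float,
              vp_h: float, vp_w: float) -> bytes:
    """Data body: flags, current packet length, timestamp Tp, frame PTS,
    Vx, Vy, view height Vp_h and view width Vp_w."""
    payload = struct.pack("!dQffff", tp, pts, vx, vy, vp_h, vp_w)
    return struct.pack("!BBI", 0x02, 0x00, 6 + len(payload)) + payload

# One header followed by one body record (values invented):
chunk = pack_header("123", "live1") + pack_body(10.0, 1000, 0.1, 0.2, 720.0, 1280.0)
```

In this sketch a header would be re-emitted roughly once per second on the long-lived connection, followed by the body records produced since the previous header, as fig. 7 describes.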
The video processing method of the embodiment of the application constructs a specific index format based on the user's view coordinates, the PTS of the I, B and P frames in the view stream, and so on, which is derived from the real-time stream and recorded in real time by the back-end server. After a VR display terminal receives a VR panoramic video display instruction, the server can provide the FOV service recorded in real time according to the characteristics of the request. This gives a low-latency solution in which a specific user's view, collected and recorded in the cloud, is distributed to the audio and video acquisition, transmission and distribution layer for multiple users; it differs from the original transmission mode and relieves the pressure on the communication network.
As shown in fig. 8, an embodiment of the present application further provides a video processing apparatus, applied to a network device, including:
a first receiving module 801, configured to receive a first request sent by a first terminal device, where the first request is used to request to acquire a first target video stream of a first field angle FOV, where the first field angle FOV is a field angle corresponding to a second terminal device;
a second receiving module 802, configured to receive first FOV track information of a second target video stream sent by the second terminal device, where the first target video stream is at least a partial video stream of the second target video stream;
a first obtaining module 803, configured to obtain a first target video stream of the first FOV according to the first FOV track information;
a first sending module 804, configured to send the first target video stream of the first FOV to the first terminal device.
In the video processing apparatus according to the embodiment of the present application, the first FOV track information includes: an identifier of the second target video stream, an identifier of the second terminal device, and FOV coordinate information corresponding to the display time stamps (PTS) of N image frames, where N is the total number of image frames in the second target video stream.
The video processing apparatus according to the embodiment of the present application further includes:
the second obtaining module is used for obtaining a second target video stream before the first receiving module receives the first request sent by the first terminal equipment;
and the third acquisition module is used for acquiring each group of pictures (GOP) in the second target video stream, the correspondence between each GOP and the display time stamps (PTS) of its image frames, and the correspondence between the PTS and the recording time Tr of each image frame in the second target video stream.
In the video processing apparatus according to the embodiment of the present application, the first obtaining module 803 includes:
a determining submodule for determining a starting PTS of a first target video stream of the first FOV;
the reading sub-module is used for, in the second target video stream, sequentially reading the group-of-pictures data corresponding to each image frame, from the starting PTS up to the ending PTS;
the restoring submodule is used for restoring and obtaining image data of each image frame in the image group data according to the PTS of each image frame in the second target video stream;
and the first acquisition submodule is used for processing the image data of each image frame according to the first FOV track information to obtain a first target video stream of the first FOV.
In the video processing apparatus according to the embodiment of the application, when the first request is used to request to acquire a first target video stream of a first FOV in real time, the starting PTS is a starting PTS of a latest GOP, the ending PTS is a last PTS in the second target video stream, and the latest GOP is a GOP corresponding to a latest PTS in first FOV track information uploaded by a second terminal device;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV which is played from a first time, the starting PTS is a PTS corresponding to the first time in the second target video stream, and the ending PTS is the last PTS in the second target video stream;
or, in a case that the first request is for requesting to acquire a target video stream of a first field angle FOV between a second time and a third time, the starting PTS is a PTS corresponding to the second time in the second target video stream, and the ending PTS is a PTS corresponding to the third time in the second target video stream.
In the video processing apparatus in the embodiment of the present application, the first sending module is configured to send the first target video stream of the first FOV to the first terminal device through a content delivery network CDN.
It should be noted that the apparatus is an apparatus corresponding to the video processing method applied to the network device side, and all implementation manners in the method embodiments are applicable to the embodiment of the apparatus, and the same technical effect can be achieved.
According to the video processing device in the embodiment of the application, the first target video stream of the first FOV can be obtained according to the first FOV track information of the second target video stream sent by the second terminal device, and the first target video stream is at least part of the second target video stream, so that the first terminal device can obtain the video stream of the field angle of the second terminal device.
As shown in fig. 9, an embodiment of the present application further provides a network device, optionally, the network device is a cloud server, and the network device includes: a transceiver 903, a processor 901, a memory 902 and a computer program stored on the memory 902 and executable on the processor 901, the processor 901 implementing the steps of the video processing method described above when executing the computer program. Specifically, the transceiver 903 is configured to receive a first request sent by a first terminal device, where the first request is used to request to acquire a first target video stream of a first field angle FOV, where the first field angle FOV is a field angle corresponding to a second terminal device; receiving first FOV track information of a second target video stream sent by the second terminal equipment, wherein the first target video stream is at least part of the second target video stream; the processor 901 is configured to obtain a first target video stream of the first FOV according to the first FOV track information; the transceiver 903 is configured to transmit the first target video stream of the first FOV to the first terminal device.
Optionally, the first FOV track information includes: and the identification of the second target video stream, the identification of the second terminal equipment and FOV coordinate information corresponding to display time stamps PTS of N image frames, wherein N is the total number of the image frames in the second target video stream.
Optionally, the processor 901 is further configured to: acquire a second target video stream; and acquire each group of pictures (GOP) in the second target video stream, the correspondence between each GOP and the display time stamps (PTS) of its image frames, and the correspondence between the PTS and the recording time Tr of each image frame in the second target video stream.
Optionally, the processor 901 is further configured to: determine a starting PTS of the first target video stream of the first FOV; in the second target video stream, sequentially read the group-of-pictures data corresponding to each image frame, from the starting PTS up to the ending PTS; restore the image data of each image frame in the group-of-pictures data according to the PTS of each image frame in the second target video stream; and process the image data of each image frame according to the first FOV track information to obtain the first target video stream of the first FOV.
Optionally, in a case that the first request is used to request a first target video stream of the first FOV in real time, the starting PTS is the starting PTS of the latest GOP, the ending PTS is the last PTS in the second target video stream, and the latest GOP is the GOP corresponding to the latest PTS in the first FOV track information uploaded by the second terminal device;
or, in a case that the first request is used to request a target video stream of the first field angle FOV played from a first time, the starting PTS is the PTS corresponding to the first time in the second target video stream, and the ending PTS is the last PTS in the second target video stream;
or, in a case that the first request is used to request a target video stream of the first field angle FOV between a second time and a third time, the starting PTS is the PTS corresponding to the second time in the second target video stream, and the ending PTS is the PTS corresponding to the third time in the second target video stream.
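The three alternatives above can be sketched as follows; request.kind, pts_at (mapping a time to a PTS via the recording-time correspondence), and start_of_gop are assumptions introduced for the example.

```python
def select_pts_range(request, track, pts_list, pts_at, start_of_gop):
    """Choose the starting and ending PTS for the three request types (sketch)."""
    if request.kind == "real_time":
        latest_pts = track.entries[-1][0]  # latest PTS in the uploaded track info
        return start_of_gop(latest_pts), pts_list[-1]
    if request.kind == "from_time":        # play back from a first time onwards
        return pts_at(request.t1), pts_list[-1]
    # otherwise: between a second time and a third time
    return pts_at(request.t2), pts_at(request.t3)
```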
Optionally, the processor 901 is further configured to send the first target video stream of the first FOV to the first terminal device through a content delivery network (CDN).
The bus architecture may include any number of interconnected buses and bridges, with one or more processors, represented by the processor 901, and various circuits of memory, represented by the memory 902, linked together. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides an interface. The transceiver 903 may be a number of elements, including a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 901 is responsible for managing the bus architecture and general processing, and the memory 902 may store data used by the processor 901 in performing operations.
Those skilled in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a computer program instructing the associated hardware, where the computer program includes instructions for performing some or all of the steps of the above methods; the computer program may be stored in a readable storage medium, which may be any form of storage medium.
As shown in fig. 10, an embodiment of the present application further provides a video processing apparatus, which is applied to a terminal device, and includes:
a fourth obtaining module 1001, configured to obtain a second target video stream;
a fifth obtaining module 1002, configured to analyze the second target video stream to obtain first FOV track information of the second target video stream;
an uploading module 1003, configured to upload the first FOV track information to a network device.
In the video processing apparatus according to the embodiment of the present application, the fifth obtaining module includes:
the analysis submodule is used for analyzing the second target video stream to obtain the PTS of each image frame;
the second acquisition submodule is used for acquiring FOV coordinate information corresponding to the PTS of each image frame;
and the third acquisition submodule is used for acquiring first FOV track information of the second target video stream according to the FOV coordinate information corresponding to the PTS of each image frame.
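A terminal-side sketch of these three submodules is given below, reusing the assumed FovTrackInfo structure from the earlier sketch; frames and fov_of are hypothetical stand-ins for the parsing and FOV lookup steps.

```python
def build_fov_track(stream_id, device_id, frames, fov_of):
    """Parse the PTS of each image frame and attach its FOV coordinates (sketch).

    frames is assumed to yield parsed frame metadata with a .pts field, and
    fov_of(pts) is an assumed lookup of the FOV coordinates at that PTS.
    """
    entries = [(frame.pts, fov_of(frame.pts)) for frame in frames]
    return FovTrackInfo(stream_id, device_id, entries)
```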
In the video processing apparatus according to the embodiment of the present application, the first FOV track information includes: an identification of the second target video stream, an identification of the second terminal device, and FOV coordinate information corresponding to the PTSs of N image frames, where N is the total number of image frames in the second target video stream.
In the video processing apparatus according to the embodiment of the application, the uploading module is configured to upload the first FOV track information to a network device according to a preset byte stream structure;
the preset byte stream structure includes data header information and a set of data body information, where the data header information includes an identification of the second target video stream and an identification of the second terminal device, and each piece of data body information includes the PTS of one image frame and the FOV coordinate information corresponding to that PTS.
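One possible serialization of this header-plus-body byte stream is sketched below. The field widths, delimiters, and byte order are illustrative assumptions; the patent specifies only that the header carries the two identifications and that each body entry carries one frame's PTS plus its FOV coordinate information.

```python
import struct


def pack_fov_track(track):
    """Serialize the first FOV track information into a byte stream (sketch)."""
    # Header: the two identifications, NUL-terminated (delimiter is assumed).
    header = (track.stream_id.encode("utf-8") + b"\x00"
              + track.device_id.encode("utf-8") + b"\x00")
    # Body: one entry per image frame, an 8-byte PTS plus four 32-bit floats
    # of FOV coordinate information (widths and byte order are assumed).
    body = bytearray()
    for pts, (x, y, w, h) in track.entries:
        body += struct.pack(">Q4f", pts, x, y, w, h)
    return header + bytes(body)
```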
It should be noted that this apparatus corresponds to the video processing method applied to the terminal device; all implementations described in the method embodiments are applicable to this apparatus embodiment and can achieve the same technical effects.
According to the video processing apparatus of this embodiment, the first FOV track information of the second target video stream is uploaded to the network device, and the network device can obtain the first target video stream of the first FOV according to the first FOV track information, so that the first terminal device can obtain the video stream of the field angle of the second terminal device. In this method, the second terminal device only needs to transmit the first FOV track information to the network device, without computing and transmitting the live video data stream itself, which effectively reduces the computing load on the VR terminal device.
As shown in fig. 11, an embodiment of the present application further provides a terminal device, where the terminal device is a second terminal device, and the terminal device includes: a transceiver 1104, a processor 1101, a memory 1103, and a computer program stored on the memory 1103 and executable on the processor 1101, where the processor 1101 implements the steps of the video processing method described above when executing the computer program. In particular, the transceiver 1104 is configured to obtain a second target video stream; the processor 1101 is configured to parse the second target video stream to obtain first FOV track information of the second target video stream; and the transceiver 1104 is configured to upload the first FOV track information to a network device.
Optionally, the processor 1101 is further configured to:
analyzing the second target video stream to obtain the PTS of each image frame;
acquiring FOV coordinate information corresponding to the PTS of each image frame;
and obtaining first FOV track information of the second target video stream according to the FOV coordinate information corresponding to the PTS of each image frame.
Optionally, the first FOV track information includes: an identification of the second target video stream, an identification of the second terminal device, and FOV coordinate information corresponding to the PTSs of N image frames, where N is the total number of image frames in the second target video stream.
Optionally, the transceiver 1104 is configured to upload the first FOV track information to a network device according to a preset byte stream structure;
the preset byte stream structure includes data header information and a set of data body information, where the data header information includes an identification of the second target video stream and an identification of the second terminal device, and each piece of data body information includes the PTS of one image frame and the FOV coordinate information corresponding to that PTS.
It is noted that, in FIG. 11, the bus architecture may include any number of interconnected buses and bridges, with one or more processors, represented by the processor 1101, and various circuits of memory, represented by the memory 1103, linked together. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface 1102 provides an interface. The transceiver 1104 may be a number of elements, including a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. For different terminals, the user interface 1105 may also be an interface capable of connecting the desired devices, including but not limited to a keypad, a display, a speaker, a microphone, a joystick, and the like. The processor 1101 is responsible for managing the bus architecture and general processing, and the memory 1103 may store data used by the processor 1101 in performing operations.
Those skilled in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a computer program instructing the associated hardware, where the computer program includes instructions for performing some or all of the steps of the above methods; the computer program may be stored in a readable storage medium, which may be any form of storage medium.
In addition, a computer-readable storage medium is provided in a specific embodiment of the present invention, and a computer program is stored thereon, and when the computer program is executed by a processor, the steps in the video processing method are implemented, and the same technical effects can be achieved.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be physically included alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute some of the steps of the methods according to various embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
While the foregoing is directed to the preferred embodiment of the present invention, it will be appreciated by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A video processing method, applied to a network device, characterized by comprising the following steps:
receiving a first request sent by a first terminal device, wherein the first request is used for requesting to acquire a first target video stream of a first field angle FOV, and the first FOV is a field angle corresponding to a second terminal device;
receiving first FOV track information of a second target video stream sent by the second terminal device, wherein the first target video stream is at least part of the second target video stream, and the first FOV track information includes: an identification of the second target video stream, an identification of the second terminal device, and FOV coordinate information corresponding to display time stamps (PTS) of N image frames, where N is the total number of image frames in the second target video stream; and obtaining a first target video stream of the first FOV according to the first FOV track information;
transmitting a first target video stream of the first FOV to the first terminal device;
wherein the obtaining a first target video stream of the first FOV according to the first FOV track information includes:
determining a starting PTS of a first target video stream of the first FOV;
in the second target video stream, sequentially reading image group data corresponding to each image frame from the starting PTS until the ending PTS;
restoring the image data of each image frame in the image group data according to the PTS of each image frame in the second target video stream;
and processing the image data of each image frame according to the first FOV track information to obtain a first target video stream of the first FOV.
2. The video processing method according to claim 1, wherein before receiving the first request sent by the first terminal device, the method further comprises:
acquiring a second target video stream;
and acquiring each group of pictures (GOP) data in the second target video stream, a correspondence between each GOP data and the display time stamps (PTS) of the image frames in the second target video stream, and a correspondence between the PTS and the recording time Tr of each image frame in the second target video stream.
3. The video processing method according to claim 1,
in a case that the first request is used to request a first target video stream of the first FOV in real time, the starting PTS is the starting PTS of the latest GOP, the ending PTS is the last PTS in the second target video stream, and the latest GOP is the GOP corresponding to the latest PTS in the first FOV track information uploaded by the second terminal device;
or, in a case that the first request is used to request a target video stream of the first field angle FOV played from a first time, the starting PTS is the PTS corresponding to the first time in the second target video stream, and the ending PTS is the last PTS in the second target video stream;
or, in a case that the first request is used to request a target video stream of the first field angle FOV between a second time and a third time, the starting PTS is the PTS corresponding to the second time in the second target video stream, and the ending PTS is the PTS corresponding to the third time in the second target video stream.
4. The video processing method according to claim 1, wherein said transmitting the first target video stream of the first FOV to the first terminal device comprises:
and sending the first target video stream of the first FOV to the first terminal device through a Content Delivery Network (CDN).
5. A video processing method, applied to a terminal device, characterized by comprising the following steps:
acquiring a second target video stream;
analyzing the second target video stream to obtain first FOV track information of the second target video stream, wherein the first FOV track information includes: an identification of the second target video stream, an identification of a second terminal device, and FOV coordinate information corresponding to the display time stamps (PTS) of N image frames, where N is the total number of image frames in the second target video stream;
uploading the first FOV track information to a network device;
wherein the analyzing the second target video stream to obtain the first FOV track information of the second target video stream includes:
analyzing the second target video stream to obtain the PTS of each image frame;
acquiring FOV coordinate information corresponding to the PTS of each image frame;
and obtaining first FOV track information of the second target video stream according to the FOV coordinate information corresponding to the PTS of each image frame.
6. The video processing method of claim 5, wherein uploading the first FOV trajectory information to a network device comprises:
uploading the first FOV track information to network equipment according to a preset byte stream structure;
the preset byte stream structure includes data header information and a set of data body information, where the data header information includes an identification of the second target video stream and an identification of the second terminal device, and each piece of data body information includes the PTS of one image frame and the FOV coordinate information corresponding to that PTS.
7. A network device, comprising: processor, memory and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the video processing method according to any one of claims 1 to 4.
8. A terminal device, comprising: processor, memory and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the video processing method according to any one of claims 5 to 6.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the video processing method according to any one of claims 1 to 4 or the steps of the video processing method according to any one of claims 5 to 6.
CN202011002978.3A 2020-09-22 2020-09-22 Video processing method, communication device and readable storage medium Active CN112153401B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011002978.3A CN112153401B (en) 2020-09-22 2020-09-22 Video processing method, communication device and readable storage medium

Publications (2)

Publication Number Publication Date
CN112153401A CN112153401A (en) 2020-12-29
CN112153401B (en) 2022-09-06

Family

ID=73896176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011002978.3A Active CN112153401B (en) 2020-09-22 2020-09-22 Video processing method, communication device and readable storage medium

Country Status (1)

Country Link
CN (1) CN112153401B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278193A (en) * 2021-04-30 2022-11-01 中国移动通信集团河北有限公司 Panoramic video distribution method, device, equipment and computer storage medium
CN113992976B (en) * 2021-10-19 2023-10-20 咪咕视讯科技有限公司 Video playing method, device, equipment and computer storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106572359A (en) * 2016-10-27 2017-04-19 乐视控股(北京)有限公司 Method and device for synchronously playing panoramic video on multiple terminals
CN110149542A (en) * 2018-02-13 2019-08-20 华为技术有限公司 Transfer control method
CN110163943A (en) * 2018-11-21 2019-08-23 深圳市腾讯信息技术有限公司 The rendering method and device of image, storage medium, electronic device
WO2020043104A1 (en) * 2018-08-30 2020-03-05 华为技术有限公司 Video screen projection method, device, computer equipment and storage medium
CN111432223A (en) * 2020-04-21 2020-07-17 烽火通信科技股份有限公司 Method, terminal and system for realizing multi-view video transmission and playing

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3535644B1 (en) * 2016-11-04 2023-02-22 Koninklijke KPN N.V. Streaming virtual reality video
CN110784740A (en) * 2019-11-25 2020-02-11 北京三体云时代科技有限公司 Video processing method, device, server and readable storage medium


Also Published As

Publication number Publication date
CN112153401A (en) 2020-12-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant