CN113194326A - Panoramic live broadcast method and device, computer equipment and computer readable storage medium - Google Patents

Panoramic live broadcast method and device, computer equipment and computer readable storage medium

Info

Publication number: CN113194326A
Application number: CN202110468970.4A
Authority: CN (China)
Prior art keywords: controlling, video data, live broadcast, playing end, sphere model
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 邹志鹏
Current Assignee: Ping An International Smart City Technology Co Ltd
Original Assignee: Ping An International Smart City Technology Co Ltd
Application filed by Ping An International Smart City Technology Co Ltd
Priority to: CN202110468970.4A
Publication of: CN113194326A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21: Server components or server architectures
    • H04N 21/218: Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187: Live feed
    • H04N 21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/233: Processing of audio elementary streams
    • H04N 21/2335: Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H04N 21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23406: Processing of video elementary streams involving management of server-side video buffer
    • H04N 21/23412: Processing of video elementary streams for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • H04N 21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/2343: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234309: Reformatting operations by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N 21/234318: Reformatting operations by decomposing into objects, e.g. MPEG-4 objects
    • H04N 21/242: Synchronization processes, e.g. processing of PCR [Program Clock References]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/439: Processing of audio elementary streams
    • H04N 21/4392: Processing of audio elementary streams involving audio buffer management
    • H04N 21/4394: Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N 21/4398: Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44004: Processing of video elementary streams involving video buffer management, e.g. video decoder buffer or video display buffer
    • H04N 21/44012: Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440218: Reformatting operations by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N 21/442: Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N 21/44213: Monitoring of end-user related data
    • H04N 21/44218: Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81: Monomedia components thereof
    • H04N 21/8146: Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Graphics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention relates to the technical field of image processing, and provides a panoramic live broadcast method and device, computer equipment and a computer readable storage medium. The panoramic live broadcast method comprises the following steps: controlling a live broadcast end to collect video data; controlling a live broadcast end to transmit the video data to a streaming media server in real time; controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence; controlling the playing end to create a sphere model; controlling the playing end to render the image sequence as texture frame by frame to the sphere model; controlling the playing end to display a target image in the sphere model based on the selection of a user; and controlling the playing end to synchronously play the audio data synchronous with the video data. The invention carries out panoramic live broadcast according to video data.

Description

Panoramic live broadcast method and device, computer equipment and computer readable storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a panoramic live broadcast method and device, computer equipment and a computer readable storage medium.
Background
In general, live broadcasting is mainly based on flat images in a two-dimensional plane. Video live broadcast evolved from video on demand and is carried out over the internet using streaming media technology, so that video content can be transmitted in real time. For example, a user shoots a beautiful scene during a tour with a mobile phone and transmits the video data to a server through the mobile network; the server processes the video data and shares it with the terminals of other users in a live broadcast manner.
However, when a user watches a live video through a terminal, the user can only see the partial area captured by the camera of the broadcasting user, and the viewed video picture can only change as the camera moves, so a good visual experience cannot be provided to the user. Live broadcast based on a two-dimensional plane lacks a sense of reality, a sense of substitution and an immersive impression. That is, the drawback of live broadcast based on a two-dimensional plane is that the viewer's sense of visual participation is insufficient and the viewer has no immersive impression.
Disclosure of Invention
In view of the above, there is a need for a panoramic live broadcast method, apparatus, computer device and computer readable storage medium, which can perform panoramic live broadcast according to video data.
A first aspect of the present application provides a panoramic live broadcast method, where the panoramic live broadcast method includes:
controlling a live broadcast end to collect video data;
controlling a live broadcast end to transmit the video data to a streaming media server in real time;
controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence;
controlling the playing end to create a sphere model;
controlling the playing end to render the image sequence as texture frame by frame to the sphere model;
controlling the playing end to display a target image in the sphere model based on the selection of a user;
and controlling the playing end to synchronously play the audio data synchronous with the video data.
In another possible implementation manner, the controlling the playing end to create the sphere model includes:
acquiring a preset radius and a preset angle increment;
iteratively calculating a plurality of first angle values according to the preset angle increment, wherein the value range of each first angle value is 0-180 degrees;
iteratively calculating a plurality of second angle values according to the preset angle increment, wherein the value range of each second angle value is 0-360 degrees;
calculating a vertex coordinate according to the preset radius, each first angle value and each second angle value to obtain a plurality of vertex coordinates;
and determining a plurality of triangles according to 3 adjacent coordinates in the vertex coordinates, wherein the plurality of triangles form a spherical surface of the spherical model.
In another possible implementation manner, the controlling the playing end to render the image sequence as a texture frame by frame to the sphere model includes:
reading image data in the image sequence frame by frame;
determining a plurality of corresponding coordinates of the image data according to the number of vertex coordinates of the spherical model;
rendering the image data to the sphere model according to a plurality of vertex coordinates of the sphere model and a plurality of corresponding coordinates of the image data.
In another possible implementation manner, the controlling the playing end to display the target image in the sphere model based on the selection of the user includes:
judging the display mode of the playing end;
when the display mode of the playing end is a full-screen mode, displaying the image in the sphere model in a full screen mode at a default view angle;
receiving a gesture instruction of a user;
correcting a visual angle according to the gesture instruction, and displaying a target image in the sphere model through the corrected visual angle.
In another possible implementation manner, when the display mode of the playing end is a split-screen mode, the image in the sphere model is displayed in a split-screen mode.
In another possible implementation manner, the displaying the image in the sphere model in a split screen includes:
creating two sphere sub-models according to the sphere model;
controlling the playing end to respectively render the image sequences to the two sphere sub-models;
and adjusting the camera angles of the two sphere sub-models so that the angle between the two cameras of the two sphere sub-models is 15 degrees.
In another possible implementation manner, the controlling the live broadcast end to collect the video data includes:
controlling the live broadcast end to perform projection transformation;
controlling the live broadcast end to carry out view conversion;
and controlling the live broadcast end to collect video data.
A second aspect of the present application provides a panoramic live broadcast apparatus, where the panoramic live broadcast apparatus includes:
the acquisition module is used for controlling the live broadcast end to acquire video data;
the transmission module is used for controlling the live broadcast end to transmit the video data to the streaming media server in real time;
the decoding module is used for controlling a playing end to pull the video data from the streaming media server and decoding the video data to obtain an image sequence;
the creating module is used for controlling the playing end to create a sphere model;
the rendering module is used for controlling the playing end to render the image sequence as texture frame by frame to the sphere model;
the display module is used for controlling the playing end to display the target image in the sphere model based on the selection of a user;
and the playing module is used for controlling the playing end to synchronously play the audio data synchronous with the video data.
A third aspect of the application provides a computer device comprising a processor for implementing the panoramic live method when executing computer readable instructions stored in a memory.
A fourth aspect of the present application provides a computer readable storage medium having computer readable instructions stored thereon, which when executed by a processor, implement the panoramic live method.
The method comprises the steps of controlling a playing end to create a sphere model, controlling the playing end to render the image sequence as a texture frame by frame onto the sphere model, and controlling the playing end to display the target image in the sphere model based on the selection of the user. Live content is displayed in a three-dimensional manner through the sphere model, so the method transfers more easily between scenes; live content can also be displayed in a variety of ways, increasing adaptability to different scenes. Because the playing end is controlled to render the image sequence as a texture frame by frame onto the sphere model, live broadcast content can be displayed three-dimensionally through the sphere model, enhancing the sense of reality and the completeness of the live content, while the playing end can also be controlled to display target images in the sphere model at different angles and in different modes based on the selection of the user; in addition, the playing end is controlled to synchronously play the audio data synchronized with the video data. The method and the device can therefore perform panoramic live broadcast from the video data and avoid the situation where the viewed video picture can only change with the movement of the camera, thereby improving the audio-visual experience of the user.
Drawings
Fig. 1 is a flowchart of a panoramic live broadcast method according to an embodiment of the present invention.
Fig. 2 is a structural diagram of a panoramic live broadcast apparatus according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention, and the described embodiments are merely a subset of the embodiments of the present invention, rather than a complete embodiment.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Preferably, the panoramic live broadcast method of the invention is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
Example one
Fig. 1 is a flowchart of a panoramic live broadcast method according to an embodiment of the present invention. The panoramic live broadcast method is applied to computer equipment, and panoramic live broadcast can be carried out according to video data.
As shown in fig. 1, the panoramic live broadcasting method includes:
and 101, controlling a live broadcast end to collect video data.
A professional panoramic camera can be used to collect pictures and perform 360-degree panoramic capture. A camera of a mobile device may also be used to capture 360-degree video data.
In a specific embodiment, the controlling the live broadcast end to collect the video data includes:
controlling the live broadcast end to perform projection transformation;
controlling the live broadcast end to carry out view conversion;
and controlling the live broadcast end to collect video data.
Specifically, the projection transformation includes: selecting a view volume to be displayed, and mapping the three-dimensional view volume into a two-dimensional image by some projection manner. The purpose of the projection transformation is to define a scene so that the extra portion outside the scene is clipped away and only the relevant portion inside the scene finally enters the image. That is, the projection transformation determines how an imaged scene in 3D space is projected onto a 2D plane to form a 2D image, which is then rendered to the screen via the viewport transformation. Here, the view volume refers to the region of space that contains the imaged scene.
Projection includes perspective projection and orthographic projection. The mapped two-dimensional image differs according to the projection mode: perspective projection produces a near-large, far-small perspective effect, whereas orthographic projection does not affect the relative size of objects, i.e. an edge of the same length appears the same size whether viewed from far away or close up.
Perspective projection functions in OpenGL include glFrustum(left, right, bottom, top, zNear, zFar), where left, right, bottom and top define the size of the near clipping plane, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes respectively (both distances may be positive values). These six parameters define a frustum bounded by six clipping planes; this frustum is the view frustum or view volume. Only the imaged scene inside this frustum is visible; an imaged scene outside the frustum is no longer in the line of sight and is therefore clipped, and OpenGL will not render those objects.
For perspective projection, let l = left, r = right, b = bottom, t = top, n = zNear, f = zFar; the projection matrix of glFrustum is:

    [ 2n/(r-l)    0           (r+l)/(r-l)    0           ]
    [ 0           2n/(t-b)    (t+b)/(t-b)    0           ]
    [ 0           0           -(f+n)/(f-n)   -2fn/(f-n)  ]
    [ 0           0           -1             0           ]
Alternatively, for perspective projection, the six parameters left, right, top, bottom, zNear and zFar mentioned above can be obtained as follows. The viewing angle of the Camera in the y direction is defined by fovy (between 0 and 180 degrees), aspect defines the aspect ratio of the near clipping plane, i.e. aspect = w/h, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes (both distances may be positive values). These four parameters likewise define a view frustum. In OpenGL ES 2.0, h can be calculated from the trigonometric relation tan(fovy/2) = (h/2)/zNear, and then w is calculated as w = h*aspect, so that the six parameters left, right, top, bottom, zNear and zFar are obtained.
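As an illustration of the derivation just described, the following C++ sketch computes glFrustum-style parameters from fovy, aspect, zNear and zFar; the struct, function name and example values are assumptions for illustration, not part of the patent.

```cpp
#include <cmath>

// Sketch only: derive glFrustum-style parameters from a vertical field of view,
// using the relations above: tan(fovy/2) = (h/2)/zNear and w = h*aspect.
struct Frustum { float left, right, bottom, top, zNear, zFar; };

Frustum frustumFromFov(float fovyDegrees, float aspect, float zNear, float zFar) {
    const float fovyRadians = fovyDegrees * 3.14159265f / 180.0f;
    const float h = 2.0f * zNear * std::tan(fovyRadians / 2.0f); // near-plane height
    const float w = h * aspect;                                  // near-plane width
    return { -w / 2.0f, w / 2.0f, -h / 2.0f, h / 2.0f, zNear, zFar };
}

// Usage (assumed values): a 90-degree vertical FOV on a 16:9 surface.
// Frustum f = frustumFromFov(90.0f, 16.0f / 9.0f, 0.1f, 100.0f);
// In OpenGL ES 1.x this could feed glFrustumf(f.left, f.right, f.bottom, f.top,
// f.zNear, f.zFar); in ES 2.0 the projection matrix above is built directly.
```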
Orthographic projection is implemented by glOrtho in OpenGL and OpenGL ES 1.0. Orthographic projection can be regarded as a special form of perspective projection in which, whatever an object's Z position between the near and far clipping planes, the object always keeps the same size, independent of distance. The orthographic projection function is glOrtho(left, right, bottom, top, zNear, zFar); left, right, bottom and top define the size of the near clipping plane, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes respectively (both distances may be positive values). Setting x_max = right, x_min = left, y_max = top, y_min = bottom, z_max = far, z_min = near, the orthographic projection matrix is:
    [ 2/(x_max - x_min)    0                     0                      -(x_max + x_min)/(x_max - x_min) ]
    [ 0                    2/(y_max - y_min)     0                      -(y_max + y_min)/(y_max - y_min) ]
    [ 0                    0                     -2/(z_max - z_min)     -(z_max + z_min)/(z_max - z_min) ]
    [ 0                    0                     0                      1                                ]
For the view transformation, the purpose is to let the user observe the scene from a certain angle (from the viewer's perspective), or equivalently to convert the imaged scene from world coordinates into the view space in which the camera's line of sight lies (from a 3D-object perspective). This can be achieved by setting the position and orientation of the viewer, or by a 3D transformation of the object. World coordinates can be represented by the xyz coordinate axes, and the view space is represented by a view volume; the view transformation converts the imaged scene from world space into the coordinate system of the view space, after which projection and normalization are performed and the imaged scene is rendered to the screen after the viewport transformation. In OpenGL, the view transformation can be implemented by the gluLookAt function: gluLookAt(eyeX, eyeY, eyeZ, centerX, centerY, centerZ, upX, upY, upZ), where eye represents the position of the camera/viewer; center represents the point the camera or eye focuses on (together with eye it determines the orientation of the eye); and up represents the upward direction of the eye; up represents only a direction and is independent of magnitude. By calling gluLookAt to set the observed scene, the imaged scene within it is processed by OpenGL. By default in OpenGL, eye is at the origin, pointing in the negative direction of the Z axis, and up is in the positive direction of the Y axis.
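The following C++ sketch shows the standard gluLookAt-style view matrix construction described above; the small vector helpers are illustrative and not part of OpenGL or of the patent.

```cpp
#include <cmath>

// Minimal sketch of a gluLookAt-style view matrix (column-major, OpenGL layout).
struct Vec3 { float x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
static float dot(Vec3 a, Vec3 b)  { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3 normalize(Vec3 v)     { float n = std::sqrt(dot(v, v)); return { v.x/n, v.y/n, v.z/n }; }

// out is a 16-element column-major matrix, as expected by glUniformMatrix4fv.
void lookAt(Vec3 eye, Vec3 center, Vec3 up, float out[16]) {
    Vec3 f = normalize(sub(center, eye));   // viewing direction (eye towards center)
    Vec3 s = normalize(cross(f, up));       // camera right
    Vec3 u = cross(s, f);                   // corrected camera up
    float m[16] = {
        s.x,  u.x, -f.x, 0.0f,
        s.y,  u.y, -f.y, 0.0f,
        s.z,  u.z, -f.z, 0.0f,
        -dot(s, eye), -dot(u, eye), dot(f, eye), 1.0f };
    for (int i = 0; i < 16; ++i) out[i] = m[i];
}
```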
And 102, controlling the live broadcast end to transmit the video data to a streaming media server in real time.
In a specific embodiment, the controlling the live broadcast end to transmit the video data to the streaming media server in real time includes:
compressing the video data;
and transmitting the compressed video data to a streaming media server in real time through an RTMP protocol.
RTMP is an acronym for Real Time Messaging Protocol. The compressed video data can also be transmitted to the streaming media server in real time by using the open source library ffmpeg, where ffmpeg is a set of open source computer programs that can be used to record and convert digital audio and video, and to turn them into streams and push the streams.
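A hedged C++ sketch (not the patent's own implementation) of pushing an already-encoded stream to an RTMP server with FFmpeg's libavformat, remuxing without re-encoding; the input path and URL are placeholders and error handling is minimal:

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}

int push_rtmp(const char* in_path, const char* rtmp_url) {
    avformat_network_init();
    AVFormatContext* in_ctx = nullptr;
    if (avformat_open_input(&in_ctx, in_path, nullptr, nullptr) < 0) return -1;
    if (avformat_find_stream_info(in_ctx, nullptr) < 0) return -1;

    AVFormatContext* out_ctx = nullptr;
    avformat_alloc_output_context2(&out_ctx, nullptr, "flv", rtmp_url); // RTMP carries FLV
    for (unsigned i = 0; i < in_ctx->nb_streams; ++i) {
        AVStream* out_stream = avformat_new_stream(out_ctx, nullptr);
        avcodec_parameters_copy(out_stream->codecpar, in_ctx->streams[i]->codecpar);
        out_stream->codecpar->codec_tag = 0;   // let the FLV muxer pick its own tag
    }
    if (avio_open(&out_ctx->pb, rtmp_url, AVIO_FLAG_WRITE) < 0) return -1;
    if (avformat_write_header(out_ctx, nullptr) < 0) return -1;

    AVPacket pkt;
    while (av_read_frame(in_ctx, &pkt) >= 0) {
        AVStream* in_s  = in_ctx->streams[pkt.stream_index];
        AVStream* out_s = out_ctx->streams[pkt.stream_index];
        // Rescale timestamps from the input time base to the output time base.
        av_packet_rescale_ts(&pkt, in_s->time_base, out_s->time_base);
        av_interleaved_write_frame(out_ctx, &pkt);
        av_packet_unref(&pkt);
    }
    av_write_trailer(out_ctx);
    avformat_close_input(&in_ctx);
    return 0;
}
```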
In a specific embodiment, the compressing the video data includes:
performing DCT transformation on the video data to eliminate redundant data; and/or
Performing intra-frame compression on the video data; and/or
And performing interframe compression on the video data.
Intra-frame compression compresses the content within a single frame of an image by means of intra-frame prediction and recording of residual values; inter-frame compression is performed within a GOP, using forward reference frames and bidirectional (forward and backward) reference frames together with motion estimation and motion compensation. A GOP groups the video into a plurality of groups according to the principle of grouping strongly correlated frames together.
The video data may be compressed using H264 or H265 compression techniques. H264 is a highly compressed digital video codec standard proposed by the Joint Video Team (JVT) formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). H265 is a newer video coding standard established by ITU-T VCEG after H264. The H.265 standard builds on the existing video coding standard H.264, retains some of its original technologies, and achieves better compression by improving the trade-off among bit rate, coding quality, delay and algorithm complexity.
In another embodiment, the video data is encoded into video data in YUV format.
Video data in YUV format requires less storage space than data in RGB format. Encoding the video data into the YUV format therefore facilitates transmission of the video data and reduces the amount of data transmitted.
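A small worked example of this storage claim (the frame size is chosen for illustration only):

```cpp
#include <cstdio>

// For one 1920x1080 frame, RGB24 uses 3 bytes per pixel, while YUV420
// (e.g. YUV420p/NV21) averages 1.5 bytes per pixel, i.e. half the data.
int main() {
    const long long w = 1920, h = 1080;
    const long long rgb24  = w * h * 3;       // 6,220,800 bytes
    const long long yuv420 = w * h * 3 / 2;   // 3,110,400 bytes
    std::printf("RGB24: %lld bytes, YUV420: %lld bytes\n", rgb24, yuv420);
    return 0;
}
```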
And 103, controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence.
The streaming media server needs to receive the video stream pushed from the live broadcast end and then push it to the playing end. The streaming media server holds the live broadcast content, which can be pulled through a specified address. The playing end is controlled to pull the video stream on the streaming media server using the specified address, i.e. the playing end establishes a connection with the server according to the protocol type (such as RTMP, RTP, RTSP, HTTP, etc.) and receives data.
Analyzing binary video data to find out video stream information; demultiplexing (demux) according to different encapsulation formats (e.g. FLV, TS); obtaining coded H.264 video data; decompressing the video data using hard decoding (API of the corresponding system) or soft decoding (FFMpeg); and original video data (YUV) is obtained after decoding.
The video data may be parsed by a player using an x264 decoder to obtain image data in the image sequence.
The playing end can be controlled to pull the video data from the streaming media server, and demultiplex the video data into video to be decoded in a video buffer (videobuffer) and audio to be decoded in an audio buffer (audiobuffer); performing video decoding on a video to be decoded in a video buffer to obtain an image sequence in an image buffer (picturebuffer), and performing audio decoding on an audio to be decoded in an audio buffer to obtain an audio sequence in a buffer (samplebuffer) corresponding to the decoded audio; the player can be controlled to synchronously control the playing of the audio data in the audio sequence and the playing of the image data in the image sequence, so that the image data and the audio data are synchronously played.
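A hedged libavformat/libavcodec sketch of this pull, demultiplex and decode flow (the patent does not prescribe this API; the URL is a placeholder and error handling is omitted):

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}

int decode_video(const char* url) {
    AVFormatContext* fmt = nullptr;
    if (avformat_open_input(&fmt, url, nullptr, nullptr) < 0) return -1;
    if (avformat_find_stream_info(fmt, nullptr) < 0) return -1;

    int vstream = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    const AVCodec* codec = avcodec_find_decoder(fmt->streams[vstream]->codecpar->codec_id);
    AVCodecContext* dec = avcodec_alloc_context3(codec);
    avcodec_parameters_to_context(dec, fmt->streams[vstream]->codecpar);
    avcodec_open2(dec, codec, nullptr);

    AVPacket* pkt = av_packet_alloc();
    AVFrame* frame = av_frame_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {           // demultiplex
        if (pkt->stream_index == vstream) {
            avcodec_send_packet(dec, pkt);            // feed the decoder
            while (avcodec_receive_frame(dec, frame) == 0) {
                // frame->data now holds one decoded YUV image; in the player it
                // would be pushed into the picture queue for rendering.
            }
        }
        av_packet_unref(pkt);
    }
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&dec);
    avformat_close_input(&fmt);
    return 0;
}
```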
Wherein the video buffer may include a video queue (videoqueue), the audio buffer may include an audio queue (audioqueue), the picture buffer may include a picture queue (picturequeue), and the corresponding buffer of decoded audio may include a decoded audio queue (samplequeue).
After receiving video data through a network (or a streaming media server), demultiplexing is carried out to split the video data and the audio data, and corresponding sub-thread decoding is respectively established for carrying out frame-by-frame decoding. Where the video data is decoded using an x264 decoder and the audio data is decoded using aac. And respectively putting the decoded frame data into respective buffer, and then carrying out audio and video synchronization. When the audio and video are synchronous, the audio is preferentially selected as a main clock source, and the video data synchronizes the audio clock.
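The following self-contained C++ sketch illustrates the synchronization rule just described, with audio as the master clock; the simulated audio clock and picture queue are assumptions for illustration and do not come from the patent:

```cpp
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

// Each decoded video frame is displayed once its presentation timestamp (pts)
// catches up with the audio clock; late frames are shown immediately.
struct VideoFrame { double pts; };  // presentation time in seconds

int main() {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    // Simulated audio clock: seconds elapsed since playback started.
    auto audio_clock = [&] {
        return std::chrono::duration<double>(clock::now() - start).count();
    };

    // Simulated picture queue: one frame every 40 ms (25 fps).
    std::vector<VideoFrame> picture_queue;
    for (int i = 0; i < 25; ++i) picture_queue.push_back({ i * 0.040 });

    for (const VideoFrame& frame : picture_queue) {
        double diff = frame.pts - audio_clock();
        if (diff > 0.0)  // video ahead of audio: wait for the audio clock
            std::this_thread::sleep_for(std::chrono::duration<double>(diff));
        // If the frame is already late it is shown at once (a real player may drop it).
        std::printf("display frame pts=%.3f at audio clock %.3f\n", frame.pts, audio_clock());
    }
    return 0;
}
```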
And 104, controlling the playing end to create a sphere model.
In a specific embodiment, the controlling the playing end to create the sphere model includes:
acquiring a preset radius and a preset angle increment;
iteratively calculating a plurality of first angle values according to the preset angle increment, wherein the value range of each first angle value is 0-180 degrees;
iteratively calculating a plurality of second angle values according to the preset angle increment, wherein the value range of each second angle value is 0-360 degrees;
calculating a vertex coordinate according to the preset radius, each first angle value and each second angle value to obtain a plurality of vertex coordinates;
and determining a plurality of triangles according to 3 adjacent coordinates in the vertex coordinates, wherein the plurality of triangles form a spherical surface of the spherical model.
For example, the preset radius is r and the preset angle increment is d. Using the preset angle increment as the common difference, an arithmetic sequence starting from 0 is calculated iteratively to obtain a plurality of first angle values a1, …, an, …, aN (1 ≤ n ≤ N), where a1 equals 0, aN equals 180, and the difference between two adjacent first angle values is d. Likewise, using the preset angle increment as the common difference, an arithmetic sequence starting from 0 is calculated iteratively to obtain a plurality of second angle values b1, …, bm, …, bM (1 ≤ m ≤ M), where b1 equals 0, bM equals 360, and the difference between two adjacent second angle values is d. According to the preset radius r, each first angle value an and each second angle value bm, a vertex coordinate (x, y, z) is calculated, where x = r·sin(an)·cos(bm); y = r·sin(an)·sin(bm); z = r·cos(an). Adjacent coordinates can be determined from the indices of the angle values (i.e. the order of the angle values). For example, the coordinates O(a1, b1), O(a2, b1) and O(a1, b2) are three adjacent coordinates.
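A minimal C++ sketch of this vertex generation, assuming the increment d divides 180 and 360 evenly; the struct and function names are illustrative:

```cpp
#include <cmath>
#include <vector>

// Iterate a polar angle a over [0, 180] degrees and an azimuth b over [0, 360]
// degrees in steps of d, and map each (a, b) pair to a point on a sphere of radius r.
struct Vertex { float x, y, z; };

std::vector<Vertex> buildSphereVertices(float r, float dDegrees) {
    const float deg2rad = 3.14159265f / 180.0f;
    std::vector<Vertex> vertices;
    for (float a = 0.0f; a <= 180.0f; a += dDegrees) {        // first angle: 0-180
        for (float b = 0.0f; b <= 360.0f; b += dDegrees) {    // second angle: 0-360
            const float ar = a * deg2rad, br = b * deg2rad;
            vertices.push_back({
                r * std::sin(ar) * std::cos(br),
                r * std::sin(ar) * std::sin(br),
                r * std::cos(ar) });
        }
    }
    // Adjacent vertices on this (a, b) grid are then grouped three at a time
    // into the triangles that make up the spherical surface.
    return vertices;
}
```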
105, controlling the playing end to render the image sequence as a texture frame by frame to the sphere model.
In a specific embodiment, the controlling the playing end to render the image sequence as a texture to the sphere model frame by frame includes:
reading image data in the image sequence frame by frame;
determining a plurality of corresponding coordinates of the image data according to the number of vertex coordinates of the spherical model;
rendering the image data to the sphere model according to a plurality of vertex coordinates of the sphere model and a plurality of corresponding coordinates of the image data.
For example, suppose the number of vertex coordinates of the sphere model is 66; then 66 corresponding coordinates are determined in the image data such that the 66 corresponding coordinates are uniformly distributed over a rectangle (i.e. 6 rows and 11 columns, with 11 coordinate points in each row and 6 coordinate points in each column). The 66 coordinate points divide the image data into 50 sub-images, each of which is a rectangle and can be divided into two triangles, each triangle being determined by 3 corresponding coordinates. A target triangle is determined from the sphere model in turn, the corresponding triangle of the target triangle is determined from the image data according to the vertex coordinates of the target triangle, and the data of the corresponding triangle is rendered onto the target triangle.
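A hedged C++ sketch of this grid-to-texture mapping; the index layout and names are illustrative assumptions, not the patent's code:

```cpp
#include <vector>

// For a sphere grid of rows x cols vertices (6 x 11 = 66 in the example), assign
// each vertex a texture coordinate spread uniformly over the rectangular video
// frame, then emit two triangles per grid cell.
struct TexCoord { float u, v; };

void buildTextureMapping(int rows, int cols,
                         std::vector<TexCoord>& texCoords,
                         std::vector<unsigned>& indices) {
    texCoords.clear();
    indices.clear();
    for (int row = 0; row < rows; ++row)
        for (int col = 0; col < cols; ++col)
            texCoords.push_back({ float(col) / (cols - 1),    // u across the frame width
                                  float(row) / (rows - 1) }); // v down the frame height

    // Each rectangular cell (row, col)..(row+1, col+1) becomes two triangles.
    for (int row = 0; row + 1 < rows; ++row) {
        for (int col = 0; col + 1 < cols; ++col) {
            unsigned tl = row * cols + col, tr = tl + 1;
            unsigned bl = tl + cols,        br = bl + 1;
            indices.insert(indices.end(), { tl, bl, tr,   tr, bl, br });
        }
    }
    // The index buffer pairs each image triangle with the sphere triangle that
    // shares the same grid position, which is how each frame is rendered as a
    // texture onto the sphere model.
}
```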
The playing end may be controlled to start a data reading thread (readThread), a video thread (videoThread), an audio thread (audioThread) and a video display thread (videoRefreshThread). The data reading thread stores the video to be decoded in the video queue and the audio to be decoded in the audio queue. The video thread decodes the video to be decoded to obtain an image sequence in the image queue; the audio thread decodes the audio to be decoded to obtain an audio sequence in the decoded audio queue; and the video display thread displays the image sequence.
In another embodiment, the panorama live method further comprises: rendering a virtual object in the sphere model. Controlling the playing end to create a virtual object or acquiring the virtual object from the live broadcast end; acquiring preset coordinates of the virtual object; and rendering the virtual object to the sphere model according to the preset coordinates. Wherein, the virtual object can be a user image or cartoon image, etc.
And 106, controlling the playing end to display the target image in the sphere model based on the selection of the user.
In a specific embodiment, the controlling the playing end to display the target image in the sphere model based on the user's selection includes:
and displaying the image in the sphere model according to the display mode of the playing end.
In another embodiment, the displaying the image in the sphere model according to the display mode of the playing end includes:
judging the display mode of the playing end;
when the display mode of the playing end is a full-screen mode, displaying the image in the sphere model in a full screen mode at a default view angle;
receiving a gesture instruction of a user;
correcting a visual angle according to the gesture instruction, and displaying a target image in the sphere model through the corrected visual angle.
And when the display mode of the playing end is a split screen mode, displaying the image in the sphere model in a split screen mode.
By default, the view of the playing end places the center point of the image at the center point of the screen, and the viewpoint supports two moving modes. One is to drag the picture sphere through gestures to switch the central picture; the other is to wear VR equipment that distinguishes the left and right eyes and displays a different picture to each, achieving a three-dimensional visual effect.
In another embodiment, the split-screen display of the image in the sphere model comprises:
creating two sphere sub-models according to the sphere model;
controlling the playing end to respectively render the image sequences to the two sphere sub-models;
and adjusting the camera angles of the two sphere sub-models so that the angle between the two cameras of the two sphere sub-models is 15 degrees.
The full-screen mode supports the viewer sliding the picture by gesture, so that every angle and detail of the live broadcast can be viewed freely and stereoscopically from all directions; the split-screen mode supports the viewer watching on a split screen, with the left and right screens at different angles, which can bring a more vivid effect.
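A hedged sketch of the split-screen path under the assumption that the sphere is simply drawn twice with a small yaw offset between the two cameras; drawSphere and makeViewMatrix stand in for the real rendering code and are not patent or OpenGL APIs:

```cpp
#include <GLES2/gl2.h>

void drawSphere(const float viewMatrix[16]);            // assumed renderer entry point
void makeViewMatrix(float yawDegrees, float out[16]);   // assumed camera helper

void renderSplitScreen(int screenWidth, int screenHeight) {
    const float separation = 15.0f;  // angle between the two cameras (from the description above)
    float leftView[16], rightView[16];
    makeViewMatrix(-separation / 2.0f, leftView);
    makeViewMatrix(+separation / 2.0f, rightView);

    glViewport(0, 0, screenWidth / 2, screenHeight);                 // left eye
    drawSphere(leftView);
    glViewport(screenWidth / 2, 0, screenWidth / 2, screenHeight);   // right eye
    drawSphere(rightView);
}
```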
Specifically, a screen gesture instruction is captured through the mobile phone device API, it is judged whether the gesture instruction is a swipe gesture or a pinch gesture, and corresponding processing is performed. Swipe gestures include four directions: left-to-right, right-to-left, top-to-bottom and bottom-to-top; after the corresponding sliding distance is calculated, the sphere is rolled in the same direction and by a corresponding distance. Pinch gestures include both zoom-in and zoom-out; the gesture distance is multiplied by a preset coefficient to obtain a scaling ratio, and the sphere is scaled according to the scaling ratio.
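A self-contained C++ sketch of this gesture mapping; the sensitivity coefficients and clamp range are assumed values, not taken from the patent:

```cpp
#include <cstdio>

// A swipe rotates the sphere by a distance-proportional angle in the same
// direction; a pinch scales it by the gesture distance times a preset coefficient.
struct SphereView { float yawDeg = 0.0f, pitchDeg = 0.0f, scale = 1.0f; };

void onSwipe(SphereView& v, float dxPixels, float dyPixels) {
    const float degreesPerPixel = 0.2f;          // assumed sensitivity
    v.yawDeg   += dxPixels * degreesPerPixel;    // left/right swipe -> horizontal roll
    v.pitchDeg += dyPixels * degreesPerPixel;    // up/down swipe   -> vertical roll
}

void onPinch(SphereView& v, float distanceChangePixels) {
    const float scalePerPixel = 0.005f;          // assumed preset coefficient
    v.scale += distanceChangePixels * scalePerPixel;
    if (v.scale < 0.5f) v.scale = 0.5f;          // clamp zoom range
    if (v.scale > 3.0f) v.scale = 3.0f;
}

int main() {
    SphereView view;
    onSwipe(view, 120.0f, -40.0f);   // swipe right and slightly up
    onPinch(view, 80.0f);            // spread fingers apart to zoom in
    std::printf("yaw=%.1f pitch=%.1f scale=%.2f\n", view.yawDeg, view.pitchDeg, view.scale);
    return 0;
}
```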
Optionally, the target image may include the virtual object.
And 107, controlling the playing end to synchronously play the audio data synchronized with the video data.
In an embodiment, the controlling the playing end to synchronously play the audio data synchronized with the video data includes:
controlling the live broadcast end to collect audio data synchronous with the video data;
controlling the live broadcast end to transmit the audio data to the streaming media server in real time;
controlling the playing end to pull the audio data from the streaming media server;
and controlling the playing end to synchronously play the audio data while displaying the image in the sphere model.
The audio data may also be user recorded voice data or predefined voice data. For example, voice data of a scene, etc.
Audio encoding compression may be performed before the audio data is transmitted. According to acoustic principles and sound characteristics, audio coding compression is mainly classified into lossy compression and lossless compression. In lossy compression, first, because humans can perceive sound only between 20 Hz and 20000 Hz, sound outside this range is eliminated; second, according to frequency-domain and time-domain masking characteristics, weaker sounds are masked by stronger sounds, and the masked sounds can also be discarded. Lossless compression mainly applies entropy coding, such as Huffman coding.
The panoramic live broadcast method of the first embodiment controls the playing end to create a sphere model, controls the playing end to render the image sequence as a texture frame by frame onto the sphere model, and controls the playing end to display the target image in the sphere model based on the selection of the user. Live content is displayed three-dimensionally through the sphere model, so the method transfers more easily between scenes; live content can also be displayed in a variety of ways, increasing adaptability to different scenes. In the first embodiment, because the playing end is controlled to render the image sequence as a texture frame by frame onto the sphere model, live content can be displayed stereoscopically through the sphere model, enhancing the sense of reality and the completeness of the live content, while the playing end can also be controlled to display the target image in the sphere model at different angles and in different modes based on the selection of the user. The first embodiment can therefore perform panoramic live broadcast from the video data and avoids the situation where the viewed video picture can only change with the movement of the camera, thereby improving the audio-visual experience of the user.
Example two
Fig. 2 is a structural diagram of a panoramic live broadcast apparatus according to a second embodiment of the present invention. The panoramic live broadcast apparatus 20 is applied to a computer device. The panoramic live broadcast apparatus 20 is used to realize panoramic live broadcast.
As shown in fig. 2, the panoramic live broadcast apparatus 20 may include an acquisition module 201, a transmission module 202, a decoding module 203, a creation module 204, a rendering module 205 and a display module 206.
And the acquisition module 201 is used for controlling the live broadcast end to acquire video data.
A professional panoramic camera can be used to collect pictures and perform 360-degree panoramic capture. A camera of a mobile device may also be used to capture 360-degree video data.
In a specific embodiment, the controlling the live broadcast end to collect the video data includes:
controlling the live broadcast end to perform projection transformation;
controlling the live broadcast end to carry out view conversion;
and controlling the live broadcast end to collect video data.
Specifically, the projection transformation includes: selecting a view volume to be displayed, and mapping the three-dimensional view volume into a two-dimensional image by some projection manner. The purpose of the projection transformation is to define a scene so that the extra portion outside the scene is clipped away and only the relevant portion inside the scene finally enters the image. That is, the projection transformation determines how an imaged scene in 3D space is projected onto a 2D plane to form a 2D image, which is then rendered to the screen via the viewport transformation. Here, the view volume refers to the region of space that contains the imaged scene.
Projection includes perspective projection and orthographic projection. The mapped two-dimensional image differs according to the projection mode: perspective projection produces a near-large, far-small perspective effect, whereas orthographic projection does not affect the relative size of objects, i.e. an edge of the same length appears the same size whether viewed from far away or close up.
Perspective projection functions in OpenGL include glFrustum(left, right, bottom, top, zNear, zFar), where left, right, bottom and top define the size of the near clipping plane, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes respectively (both distances may be positive values). These six parameters define a frustum bounded by six clipping planes; this frustum is the view frustum or view volume. Only the imaged scene inside this frustum is visible; an imaged scene outside the frustum is no longer in the line of sight and is therefore clipped, and OpenGL will not render those objects.
For perspective projection, let l = left, r = right, b = bottom, t = top, n = zNear, f = zFar; the projection matrix of glFrustum is:

    [ 2n/(r-l)    0           (r+l)/(r-l)    0           ]
    [ 0           2n/(t-b)    (t+b)/(t-b)    0           ]
    [ 0           0           -(f+n)/(f-n)   -2fn/(f-n)  ]
    [ 0           0           -1             0           ]
Alternatively, for perspective projection, the six parameters left, right, top, bottom, zNear and zFar mentioned above can be obtained as follows. The viewing angle of the Camera in the y direction is defined by fovy (between 0 and 180 degrees), aspect defines the aspect ratio of the near clipping plane, i.e. aspect = w/h, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes (both distances may be positive values). These four parameters likewise define a view frustum. In OpenGL ES 2.0, h can be calculated from the trigonometric relation tan(fovy/2) = (h/2)/zNear, and then w is calculated as w = h*aspect, so that the six parameters left, right, top, bottom, zNear and zFar are obtained.
Orthographic projection is implemented by glOrtho in OpenGL and OpenGL ES 1.0. Orthographic projection can be regarded as a special form of perspective projection in which, whatever an object's Z position between the near and far clipping planes, the object always keeps the same size, independent of distance. The orthographic projection function is glOrtho(left, right, bottom, top, zNear, zFar); left, right, bottom and top define the size of the near clipping plane, and zNear and zFar define the distances from the Camera/Viewer to the near and far clipping planes respectively (both distances may be positive values). Setting x_max = right, x_min = left, y_max = top, y_min = bottom, z_max = far, z_min = near, the orthographic projection matrix is:
    [ 2/(x_max - x_min)    0                     0                      -(x_max + x_min)/(x_max - x_min) ]
    [ 0                    2/(y_max - y_min)     0                      -(y_max + y_min)/(y_max - y_min) ]
    [ 0                    0                     -2/(z_max - z_min)     -(z_max + z_min)/(z_max - z_min) ]
    [ 0                    0                     0                      1                                ]
For the view transformation, the purpose is to let the user observe the scene from a certain angle (from the viewer's perspective), or equivalently to convert the imaged scene from world coordinates into the view space in which the camera's line of sight lies (from a 3D-object perspective). This can be achieved by setting the position and orientation of the viewer, or by a 3D transformation of the object. World coordinates can be represented by the xyz coordinate axes, and the view space is represented by a view volume; the view transformation converts the imaged scene from world space into the coordinate system of the view space, after which projection and normalization are performed and the imaged scene is rendered to the screen after the viewport transformation. In OpenGL, the view transformation can be implemented by the gluLookAt function: gluLookAt(eyeX, eyeY, eyeZ, centerX, centerY, centerZ, upX, upY, upZ), where eye represents the position of the camera/viewer; center represents the point the camera or eye focuses on (together with eye it determines the orientation of the eye); and up represents the upward direction of the eye; up represents only a direction and is independent of magnitude. By calling gluLookAt to set the observed scene, the imaged scene within it is processed by OpenGL. By default in OpenGL, eye is at the origin, pointing in the negative direction of the Z axis, and up is in the positive direction of the Y axis.
And the transmission module 202 is configured to control the live broadcast end to transmit the video data to the streaming media server in real time.
In a specific embodiment, the controlling the live broadcast end to transmit the video data to the streaming media server in real time includes:
compressing the video data;
and transmitting the compressed video data to a streaming media server in real time through an RTMP protocol.
RTMP is an acronym for Real Time Messaging Protocol. The compressed video data can also be transmitted to the streaming media server in real time by using the open source library ffmpeg, where ffmpeg is a set of open source computer programs that can be used to record and convert digital audio and video, and to turn them into streams and push the streams.
In a specific embodiment, the compressing the video data includes:
performing DCT transformation on the video data to eliminate redundant data; and/or
Performing intra-frame compression on the video data; and/or
And performing interframe compression on the video data.
Intra-frame compression compresses the content within a single frame of an image by means of intra-frame prediction and recording of residual values; inter-frame compression is performed within a GOP, using forward reference frames and bidirectional (forward and backward) reference frames together with motion estimation and motion compensation. A GOP groups the video into a plurality of groups according to the principle of grouping strongly correlated frames together.
The video data may be compressed using the H264 or H265 compression technique. H264 is a highly compressed digital video codec standard proposed by the Joint Video Team (JVT) formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). H265 is a new video coding standard established by ITU-T VCEG after H264. The H.265 standard builds on the existing video coding standard H.264, retains some of the original techniques, and achieves better compression by improving the relationship among bit stream, coding quality, delay and algorithm complexity.
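A minimal sketch of H264 compression, assuming FFmpeg's libavcodec API; the resolution, bit rate, frame rate and GOP length are illustrative values rather than settings from the disclosure:

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
}
#include <functional>

// Minimal sketch: open an H.264 encoder for YUV420P input.
AVCodecContext* openH264Encoder(int width, int height) {
    const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    AVCodecContext* ctx  = avcodec_alloc_context3(codec);
    ctx->width     = width;
    ctx->height    = height;
    ctx->pix_fmt   = AV_PIX_FMT_YUV420P;   // raw YUV input, as described above
    ctx->time_base = AVRational{1, 25};    // 25 frames per second
    ctx->bit_rate  = 2000000;              // 2 Mbit/s target
    ctx->gop_size  = 50;                   // GOP length: inter-frame compression within a group
    avcodec_open2(ctx, codec, nullptr);
    return ctx;
}

// Compress one raw frame and hand every resulting packet to a caller-supplied sink.
void encodeFrame(AVCodecContext* ctx, AVFrame* yuvFrame,
                 const std::function<void(AVPacket*)>& onPacket) {
    AVPacket* pkt = av_packet_alloc();
    avcodec_send_frame(ctx, yuvFrame);                // submit one raw frame
    while (avcodec_receive_packet(ctx, pkt) == 0) {   // drain the compressed packets
        onPacket(pkt);                                // e.g. pass to the RTMP push step
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
}
```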
In another embodiment, the video data is encoded into video data in YUV format.
Video data in YUV format requires less storage space than data in RGB format. Encoding the video data into the YUV format therefore facilitates the transmission of the video data and reduces the amount of data transmitted.
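A minimal sketch of this size advantage, assuming the standard BT.601 conversion coefficients and YUV420 subsampling (one U and one V sample per 2x2 block of pixels); the struct and function names are illustrative:

```cpp
#include <cstddef>
#include <cstdint>

struct YUV { uint8_t y, u, v; };

// Minimal sketch: BT.601 full-range RGB -> YUV for a single pixel.
YUV rgbToYuv(uint8_t r, uint8_t g, uint8_t b) {
    YUV out;
    out.y = static_cast<uint8_t>( 0.299 * r + 0.587 * g + 0.114 * b);
    out.u = static_cast<uint8_t>(-0.169 * r - 0.331 * g + 0.500 * b + 128);
    out.v = static_cast<uint8_t>( 0.500 * r - 0.419 * g - 0.081 * b + 128);
    return out;
}

// RGB24 stores 3 bytes per pixel; YUV420 stores 1 byte of luma per pixel plus
// one U and one V byte per 2x2 block, i.e. 1.5 bytes per pixel on average.
std::size_t rgb24FrameBytes(std::size_t w, std::size_t h)  { return w * h * 3; }
std::size_t yuv420FrameBytes(std::size_t w, std::size_t h) { return w * h * 3 / 2; }
```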
The decoding module 203 is configured to control the playing end to pull the video data from the streaming media server, and decode the video data to obtain an image sequence.
The streaming media server receives the video stream pushed from the live broadcast end and then pushes it to the playing end. The streaming media server holds the live broadcast content, and the live broadcast content can be pulled through a specified address. The playing end is controlled to pull the video stream from the streaming media server using the specified address, that is, the playing end establishes a connection with the server according to the protocol type (such as RTMP, RTP, RTSP, HTTP and the like) and receives the data.
Decoding may proceed as follows: analyzing the binary video data to find the video stream information; demultiplexing (demux) according to the encapsulation format (e.g. FLV or TS); obtaining the encoded H.264 video data; decompressing the video data using hard decoding (the API of the corresponding system) or soft decoding (FFmpeg); and obtaining the original video data (YUV) after decoding.
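A minimal sketch of the soft-decoding path, assuming FFmpeg's libavformat/libavcodec APIs; error handling and the audio stream are omitted, and the callback type is illustrative:

```cpp
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}

// Minimal sketch: demultiplex the pulled stream, feed the encoded video packets
// to the decoder and collect raw YUV frames for the image sequence.
void decodeVideo(const char* url, void (*onFrame)(AVFrame*)) {
    AVFormatContext* fmt = nullptr;
    avformat_open_input(&fmt, url, nullptr, nullptr);   // open/probe the container (e.g. FLV)
    avformat_find_stream_info(fmt, nullptr);

    int vIdx = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    const AVCodec* dec = avcodec_find_decoder(fmt->streams[vIdx]->codecpar->codec_id);
    AVCodecContext* ctx = avcodec_alloc_context3(dec);
    avcodec_parameters_to_context(ctx, fmt->streams[vIdx]->codecpar);
    avcodec_open2(ctx, dec, nullptr);

    AVPacket* pkt  = av_packet_alloc();
    AVFrame* frame = av_frame_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {               // demux the next packet
        if (pkt->stream_index == vIdx) {
            avcodec_send_packet(ctx, pkt);                // decode
            while (avcodec_receive_frame(ctx, frame) == 0)
                onFrame(frame);                           // raw YUV frame
        }
        av_packet_unref(pkt);
    }
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&ctx);
    avformat_close_input(&fmt);
}
```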
The video data may be parsed by a player using an x264 decoder to obtain image data in the image sequence.
The playing end can be controlled to pull the video data from the streaming media server and demultiplex it into video to be decoded in a video buffer (videobuffer) and audio to be decoded in an audio buffer (audiobuffer); the video to be decoded in the video buffer is decoded to obtain an image sequence in a picture buffer (picturebuffer), and the audio to be decoded in the audio buffer is decoded to obtain an audio sequence in the buffer corresponding to the decoded audio (samplebuffer); the player can then be controlled to synchronize the playing of the audio data in the audio sequence with the playing of the image data in the image sequence, so that the image data and the audio data are played synchronously.
Wherein the video buffer may include a video queue (videoqueue), the audio buffer may include an audio queue (audioqueue), the picture buffer may include a picture queue (picturequeue), and the corresponding buffer of decoded audio may include a decoded audio queue (samplequeue).
After the video data is received through the network (or from the streaming media server), demultiplexing is performed to split the video data and the audio data, and a corresponding decoding sub-thread is created for each of them to decode frame by frame, where the video data is decoded using an x264 decoder and the audio data is decoded using AAC. The decoded frame data are placed into their respective buffers, and audio-video synchronization is then performed. During audio-video synchronization, the audio is preferentially selected as the master clock source, and the video data is synchronized to the audio clock.
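A minimal sketch of the audio-master synchronization rule, with an illustrative structure, field names and a 100 ms late-frame threshold that are not taken from the disclosure:

```cpp
#include <algorithm>

// Minimal sketch: the audio clock advances as samples are played, and each
// decoded video frame is delayed, shown or dropped depending on its timestamp.
struct AVSync {
    double audioClock = 0.0;   // seconds of audio actually played so far (master clock)

    // How long (seconds) to wait before displaying the frame with presentation
    // timestamp framePts; negative values mean the frame is already late.
    double videoDelay(double framePts) const {
        double diff = framePts - audioClock;    // positive: the frame is early
        return std::clamp(diff, -0.1, 0.1);     // bound the correction per frame
    }

    bool shouldDrop(double framePts) const {
        return framePts - audioClock < -0.1;    // more than 100 ms late: drop it
    }
};
```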
And the creating module 204 is configured to control the playing end to create a sphere model.
In a specific embodiment, the controlling the playing end to create the sphere model includes:
acquiring a preset radius and a preset angle increment;
iteratively calculating a plurality of first angle values according to the preset angle increment, wherein the value range of each first angle value is 0-180 degrees;
iteratively calculating a plurality of second angle values according to the preset angle increment, wherein the value range of each second angle value is 0-360 degrees;
calculating a vertex coordinate according to the preset radius, each first angle value and each second angle value to obtain a plurality of vertex coordinates;
and determining a plurality of triangles according to 3 adjacent coordinates in the vertex coordinates, wherein the plurality of triangles form a spherical surface of the spherical model.
For example, the preset radius is r and the preset angle increment is d. Using the preset angle increment as the common difference, an arithmetic sequence with 0 as the initial value is iteratively calculated to obtain a plurality of first angle values a1, …, an, …, aN, 1 ≤ n ≤ N, where a1 = 0, aN = 180, and the difference between two adjacent first angle values is d. Using the preset angle increment as the common difference, an arithmetic sequence with 0 as the initial value is iteratively calculated to obtain a plurality of second angle values b1, …, bm, …, bM, 1 ≤ m ≤ M, where b1 = 0, bM = 360, and the difference between two adjacent second angle values is d. According to the preset radius r, each first angle value an and each second angle value bm, a vertex coordinate (x, y, z) is calculated, where x = r·sin(an)·cos(bm), y = r·sin(an)·sin(bm) and z = r·cos(an). The adjacent coordinates may be determined based on the indexes of the angle values (i.e. the magnitude of the angle values). For example, the coordinates O(a1, b1), O(a2, b1) and O(a1, b2) are three adjacent coordinates.
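A minimal C++ sketch of this sphere construction, assuming the two angle sequences are generated by simple loops (so the end values 180 and 360 are subject to floating-point rounding) and an illustrative Vertex structure:

```cpp
#include <cmath>
#include <vector>

struct Vertex { float x, y, z; };

// Minimal sketch: step the first angle a from 0 to 180 and the second angle b
// from 0 to 360 in increments of d degrees, and compute one vertex per (a, b)
// pair.  Groups of 3 adjacent vertices are later taken as triangles.
std::vector<Vertex> buildSphere(float r, float d) {
    const float DEG = 3.14159265358979f / 180.0f;        // degrees -> radians
    std::vector<Vertex> verts;
    for (float a = 0.0f; a <= 180.0f; a += d) {          // first angle values a1..aN
        for (float b = 0.0f; b <= 360.0f; b += d) {      // second angle values b1..bM
            verts.push_back({
                r * std::sin(a * DEG) * std::cos(b * DEG),   // x = r*sin(a)*cos(b)
                r * std::sin(a * DEG) * std::sin(b * DEG),   // y = r*sin(a)*sin(b)
                r * std::cos(a * DEG)                        // z = r*cos(a)
            });
        }
    }
    return verts;
}
```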
A rendering module 205, configured to control the playing end to render the image sequence as a texture frame by frame to the sphere model.
In a specific embodiment, the controlling the playing end to render the image sequence as a texture to the sphere model frame by frame includes:
reading image data in the image sequence frame by frame;
determining a plurality of corresponding coordinates of the image data according to the number of vertex coordinates of the spherical model;
rendering the image data to the sphere model according to a plurality of vertex coordinates of the sphere model and a plurality of corresponding coordinates of the image data.
For example, if the number of vertex coordinates of the sphere model is 66, 66 corresponding coordinates are determined in the image data such that the 66 corresponding coordinates are uniformly distributed in a rectangular grid (i.e. 6 rows and 11 columns, with 11 coordinate points in each row and 6 coordinate points in each column). The 66 coordinate points divide the image data into 50 rectangular sub-images, each of which can be split into two triangles, and each triangle can be determined by 3 corresponding coordinates. A target triangle is determined from the sphere model in turn, the corresponding triangle in the image data is determined according to the vertex coordinates of the target triangle, and the data of the corresponding triangle is rendered onto the target triangle.
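A minimal sketch of determining the corresponding coordinates, assuming normalized texture coordinates in the range 0 to 1 and an illustrative TexCoord structure; rows = 6 and cols = 11 reproduce the 66-vertex example above:

```cpp
#include <cstddef>
#include <vector>

struct TexCoord { float u, v; };

// Minimal sketch: for a sphere grid of rows x cols vertices, spread the
// corresponding coordinates uniformly over the rectangular image so that the
// decoded frame can be sampled as a texture on the sphere.
std::vector<TexCoord> gridTexCoords(int rows, int cols) {
    std::vector<TexCoord> uv;
    uv.reserve(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            uv.push_back({
                static_cast<float>(j) / (cols - 1),   // u: 0..1 across the image width
                static_cast<float>(i) / (rows - 1)    // v: 0..1 down the image height
            });
        }
    }
    return uv;
}
```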
The playing end can be controlled to start a data reading thread (read thread), a video decoding thread (video thread), an audio decoding thread (audio thread) and a video display thread. The data reading thread stores the video to be decoded in the video queue and the audio to be decoded in the audio queue. The video decoding thread decodes the video to be decoded to obtain the image sequence in the image queue; the audio decoding thread decodes the audio to be decoded to obtain the audio sequence in the decoded audio queue; and the video display thread displays the image sequence.
In another embodiment, the rendering module is further configured to render a virtual object in the sphere model: the playing end is controlled to create a virtual object or to acquire the virtual object from the live broadcast end; preset coordinates of the virtual object are acquired; and the virtual object is rendered onto the sphere model according to the preset coordinates. The virtual object may be a user image, a cartoon image or the like.
A display module 206, configured to control the playing end to display the target image in the sphere model based on a selection of a user.
In a specific embodiment, the controlling the playing end to display the target image in the sphere model based on the user's selection includes:
and displaying the image in the sphere model according to the display mode of the playing end.
In another embodiment, the displaying the image in the sphere model according to the display mode of the playing end includes:
judging the display mode of the playing end;
when the display mode of the playing end is a full-screen mode, displaying the image in the sphere model in a full screen mode at a default view angle;
receiving a gesture instruction of a user;
correcting a visual angle according to the gesture instruction, and displaying a target image in the sphere model through the corrected visual angle.
And when the display mode of the playing end is a split screen mode, displaying the image in the sphere model in a split screen mode.
The default position of the view at the playing end is such that the center point of the image coincides with the center point of the screen, and the viewpoint supports two moving modes. One is to drag the picture sphere through gestures to switch the central picture; the other is to wear a VR device that distinguishes the left and right eyeballs and displays different pictures to each, thereby achieving a stereoscopic visual effect.
In another embodiment, the split-screen display of the image in the sphere model comprises:
creating two sphere sub-models according to the sphere model;
controlling the playing end to respectively render the image sequences to the two sphere sub-models;
and adjusting the camera angles of the two sphere sub-models so that the angle between the two cameras of the two sphere sub-models is equal to 15 degrees.
The full-screen mode supports gesture operation by the viewer to slide the picture, so that every angle and detail in the live broadcast can be viewed freely and stereoscopically in all directions; the split-screen mode supports split-screen viewing by the viewer, in which the left and right screens show different angles, bringing a more vivid effect.
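A minimal sketch of the split-screen rendering, assuming the legacy fixed-function OpenGL pipeline, an illustrative drawSphere callback, and one possible reading of the 15-degree adjustment (each camera yawed 7.5 degrees away from the screen normal, so the angle between the two cameras is 15 degrees):

```cpp
#include <cmath>
#include <GL/glu.h>

// Minimal sketch: split the screen into left and right viewports and render the
// same sphere sub-model in each half with its own camera orientation.
void renderSplitScreen(int screenW, int screenH, void (*drawSphere)()) {
    const double half = 7.5 * 3.14159265358979 / 180.0;   // 7.5 degrees in radians

    // Left half of the screen.
    glViewport(0, 0, screenW / 2, screenH);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0, 0, 0,  -std::sin(half), 0, -std::cos(half),  0, 1, 0);
    drawSphere();

    // Right half of the screen.
    glViewport(screenW / 2, 0, screenW / 2, screenH);
    glLoadIdentity();
    gluLookAt(0, 0, 0,   std::sin(half), 0, -std::cos(half),  0, 1, 0);
    drawSphere();
}
```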
Specifically, a screen gesture instruction is captured through the mobile phone device API, it is judged whether the gesture instruction is a swipe gesture or a pinch gesture, and corresponding processing is performed. The swipe gesture includes four directions: left-to-right, right-to-left, top-to-bottom and bottom-to-top; after the corresponding sliding distance is calculated, the sphere is rolled in the same direction and by the same distance. The pinch gesture includes both zoom-in and zoom-out; the gesture distance is multiplied by a preset coefficient to obtain a scaling ratio, and the sphere is scaled according to the scaling ratio.
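A minimal sketch of this gesture handling, with illustrative structure and field names, sensitivity coefficient and zoom limits that are not taken from the disclosure:

```cpp
// Minimal sketch: a swipe rolls the picture sphere by a distance-proportional
// angle in the swipe direction, and a pinch scales the sphere by the gesture
// distance multiplied by a preset coefficient.
struct SphereView {
    float yawDeg   = 0.0f;   // horizontal rotation of the sphere
    float pitchDeg = 0.0f;   // vertical rotation of the sphere
    float scale    = 1.0f;   // zoom factor applied to the sphere

    void onSwipe(float dxPixels, float dyPixels) {
        const float degPerPixel = 0.2f;          // preset sensitivity (illustrative)
        yawDeg   += dxPixels * degPerPixel;      // left/right swipe rolls horizontally
        pitchDeg += dyPixels * degPerPixel;      // up/down swipe rolls vertically
    }

    void onPinch(float distanceDelta) {
        const float k = 0.01f;                   // preset scaling coefficient (illustrative)
        scale += distanceDelta * k;              // pinch out enlarges, pinch in shrinks
        if (scale < 0.5f) scale = 0.5f;          // keep the zoom within a sensible range
        if (scale > 3.0f) scale = 3.0f;
    }
};
```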
Optionally, the target image may include the virtual object.
The playing module 207 is configured to control the playing end to synchronously play the audio data synchronized with the video data.
In a specific embodiment, the synchronously playing the audio data synchronized with the video data includes: controlling the live broadcast end to collect audio data synchronous with the video data; controlling the live broadcast end to transmit the audio data to the streaming media server in real time; controlling the playing end to pull the audio data from the streaming media server; and controlling the playing end to synchronously play the audio data while displaying the image in the sphere model.
The audio data may also be voice data recorded by the user or predefined voice data, for example voice data describing the scene.
Audio encoding compression may be performed before the audio data is transmitted. According to acoustic principles and the characteristics of sound, audio coding compression is mainly classified into lossy compression and lossless compression. For lossy compression: first, because the sound recognizable by humans lies between 20 Hz and 20000 Hz, sound outside this interval is eliminated; second, according to the frequency-domain masking and time-domain masking characteristics, weaker sounds are masked by stronger sounds, and the masked sounds can also be rejected. For lossless compression, entropy coding, such as Huffman coding, is mainly applied.
The panoramic live broadcast device 20 of the second embodiment controls the playing end to create a sphere model, controls the playing end to render the image sequence as a texture frame by frame onto the sphere model, and controls the playing end to display the target image in the sphere model based on the user's selection. Live broadcast content is displayed stereoscopically through the sphere model, so that the solution is more portable across scenes; the live broadcast content can also be displayed in a variety of ways, increasing scene adaptability. In the second embodiment, the playing end is controlled to render the image sequence as a texture frame by frame onto the sphere model, so the live broadcast content can be displayed stereoscopically through the sphere model, enhancing the sense of realism and the integrity of the live broadcast content; at the same time, the playing end can be controlled to display the target image in the sphere model at different angles and in different modes based on the user's selection. According to this embodiment, panoramic live broadcast can be carried out according to the video data, avoiding the situation in which the viewed video picture can only change with the movement of the camera, thereby improving the audio-visual experience of the user.
EXAMPLE III
The present embodiment provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps in the foregoing panoramic live broadcast method embodiment, for example steps 101 to 107 shown in fig. 1:
101, controlling a live broadcast end to collect video data;
102, controlling a live broadcast end to transmit the video data to a streaming media server in real time;
103, controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence;
104, controlling the playing end to create a sphere model;
105, controlling the playing end to render the image sequence as a texture frame by frame to the sphere model;
106, controlling the playing end to display the target image in the sphere model based on the selection of the user;
and 107, controlling the playing end to synchronously play the audio data synchronized with the video data.
Alternatively, when the computer-readable instructions are executed by the processor, the functions of the modules in the above device embodiment are implemented, for example modules 201 to 207 in fig. 2:
the acquisition module 201 is used for controlling the live broadcast end to acquire video data;
the transmission module 202 is configured to control the live broadcast end to transmit the video data to the streaming media server in real time;
a decoding module 203, configured to control a playing end to pull the video data from the streaming media server, and decode the video data to obtain an image sequence;
a creating module 204, configured to control the playing end to create a sphere model;
a rendering module 205, configured to control the playing end to render the image sequence as a texture to the sphere model frame by frame;
a display module 206, configured to control the playing end to display a target image in the sphere model based on a selection of a user;
and the playing module 207 is configured to control the playing end to synchronously play the audio data synchronized with the video data.
EXAMPLE IV
Fig. 3 is a schematic diagram of a computer device according to the fourth embodiment of the present invention. The computer device 30 includes a memory 301, a processor 302, and computer-readable instructions, such as a panoramic live broadcast program, stored in the memory 301 and executable on the processor 302. When executing the computer-readable instructions, the processor 302 implements the steps in the foregoing panoramic live broadcast method embodiment, for example steps 101 to 107 shown in fig. 1:
101, controlling a live broadcast end to collect video data;
102, controlling a live broadcast end to transmit the video data to a streaming media server in real time;
103, controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence;
104, controlling the playing end to create a sphere model;
105, controlling the playing end to render the image sequence as a texture frame by frame to the sphere model;
106, controlling the playing end to display the target image in the sphere model based on the selection of the user;
and 107, controlling the playing end to synchronously play the audio data synchronized with the video data.
Alternatively, when the computer-readable instructions are executed by the processor, the functions of the modules in the above device embodiment are implemented, for example modules 201 to 207 in fig. 2:
the acquisition module 201 is used for controlling the live broadcast end to acquire video data;
the transmission module 202 is configured to control the live broadcast end to transmit the video data to the streaming media server in real time;
a decoding module 203, configured to control a playing end to pull the video data from the streaming media server, and decode the video data to obtain an image sequence;
a creating module 204, configured to control the playing end to create a sphere model;
a rendering module 205, configured to control the playing end to render the image sequence as a texture to the sphere model frame by frame;
a display module 206, configured to control the playing end to display a target image in the sphere model based on a selection of a user;
and the playing module 207 is configured to control the playing end to synchronously play the audio data synchronized with the video data.
Illustratively, the computer readable instructions may be partitioned into one or more modules that are stored in the memory 301 and executed by the processor 302 to perform the present method. The one or more modules may be a series of computer-readable instructions capable of performing certain functions and describing the execution of the computer-readable instructions in the computer device 30. For example, the computer readable instructions may be divided into an acquisition module 201, a transmission module 202, a decoding module 203, a creation module 204, a rendering module 205, a display module 206, and a playing module 207 in fig. 2, and specific functions of each module are described in embodiment two.
Those skilled in the art will appreciate that Fig. 3 is merely an example of the computer device 30 and does not constitute a limitation of the computer device 30, which may include more or fewer components than shown, or combine certain components, or have different components; for example, the computer device 30 may also include input and output devices, network access devices, buses and the like.
The Processor 302 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor, or the processor 302 may be any conventional processor or the like. The processor 302 is the control center of the computer device 30 and connects the various parts of the whole computer device 30 using various interfaces and lines.
The memory 301 may be used to store the computer-readable instructions, and the processor 302 implements the various functions of the computer device 30 by running or executing the computer-readable instructions or modules stored in the memory 301 and invoking the data stored in the memory 301. The memory 301 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area may store data created according to the use of the computer device 30, and the like. In addition, the memory 301 may include a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Memory Card (Flash Card), at least one magnetic disk storage device, a flash memory device, a Read-Only Memory (ROM), a Random Access Memory (RAM), or another non-volatile/volatile storage device.
If the modules integrated by the computer device 30 are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by instructing the relevant hardware through computer-readable instructions, which may be stored in a computer-readable storage medium; when the computer-readable instructions are executed by a processor, the steps of the method embodiments may be implemented. The computer-readable instructions comprise computer-readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), etc.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware form, and can also be realized in a form of hardware and a software functional module.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the panoramic live broadcast method according to various embodiments of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. Furthermore, it is to be understood that the word "comprising" does not exclude other modules or steps, and the singular does not exclude the plural. A plurality of modules or apparatuses recited in the present invention can also be implemented by one module or apparatus through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A panoramic live broadcast method is characterized by comprising the following steps:
controlling a live broadcast end to collect video data;
controlling the live broadcast end to transmit the video data to a streaming media server in real time;
controlling a playing end to pull the video data from the streaming media server, and decoding the video data to obtain an image sequence;
controlling the playing end to create a sphere model;
controlling the playing end to render the image sequence as texture frame by frame to the sphere model;
controlling the playing end to display a target image in the sphere model based on the selection of a user;
and controlling the playing end to synchronously play the audio data synchronous with the video data.
2. The panoramic live broadcasting method of claim 1, wherein the controlling the playing end to create a sphere model comprises:
acquiring a preset radius and a preset angle increment;
iteratively calculating a plurality of first angle values according to the preset angle increment, wherein the value range of each first angle value is 0-180 degrees;
iteratively calculating a plurality of second angle values according to the preset angle increment, wherein the value range of each second angle value is 0-360 degrees;
calculating a vertex coordinate according to the preset radius, each first angle value and each second angle value to obtain a plurality of vertex coordinates;
and determining a plurality of triangles according to 3 adjacent coordinates in the vertex coordinates, wherein the plurality of triangles form a spherical surface of the spherical model.
3. The panoramic live broadcast method of claim 1, wherein controlling the playback end to render the sequence of images as textures frame by frame to the sphere model comprises:
reading image data in the image sequence frame by frame;
determining a plurality of corresponding coordinates of the image data according to the number of vertex coordinates of the spherical model;
rendering the image data to the sphere model according to a plurality of vertex coordinates of the sphere model and a plurality of corresponding coordinates of the image data.
4. The panoramic live broadcast method of claim 1, wherein the controlling the playing end to display the target image in the sphere model based on the user's selection comprises:
judging the display mode of the playing end;
when the display mode of the playing end is a full-screen mode, displaying the image in the sphere model in a full screen mode at a default view angle;
receiving a gesture instruction of a user;
correcting a visual angle according to the gesture instruction, and displaying a target image in the sphere model through the corrected visual angle.
5. The panoramic live broadcast method of claim 4, wherein when the display mode of the playing end is a split screen mode, the images in the sphere model are displayed in a split screen mode.
6. A panoramic live broadcast method as recited in claim 5 wherein the split screen displaying of images in the spherical model comprises:
creating two sphere sub-models according to the sphere model;
controlling the playing end to respectively render the image sequences to the two sphere sub-models;
and adjusting the camera angles of the two sphere submodels to enable the angles of the two cameras of the two sphere submodels to be equal to 15 degrees.
7. The panoramic live broadcasting method of claim 1, wherein the controlling the live broadcasting end to collect video data comprises:
controlling the live broadcast end to perform projection transformation;
controlling the live broadcast end to carry out view conversion;
and controlling the live broadcast end to collect video data.
8. A panoramic live broadcast device, characterized in that the panoramic live broadcast device comprises:
the acquisition module is used for controlling the live broadcast end to acquire video data;
the transmission module is used for controlling the live broadcast end to transmit the video data to the streaming media server in real time;
the decoding module is used for controlling a playing end to pull the video data from the streaming media server and decoding the video data to obtain an image sequence;
the creating module is used for controlling the playing end to create a sphere model;
the rendering module is used for controlling the playing end to render the image sequence as texture frame by frame to the sphere model;
the display module is used for controlling the playing end to display the target image in the sphere model based on the selection of a user;
and the playing module is used for controlling the playing end to synchronously play the audio data synchronous with the video data.
9. A computer device comprising a processor for executing computer readable instructions stored in a memory to implement the panoramic live method as recited in any one of claims 1 to 7.
10. A computer readable storage medium having computer readable instructions stored thereon, which when executed by a processor implement a panoramic live method as recited in any one of claims 1 to 7.
CN202110468970.4A 2021-04-28 2021-04-28 Panoramic live broadcast method and device, computer equipment and computer readable storage medium Pending CN113194326A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110468970.4A CN113194326A (en) 2021-04-28 2021-04-28 Panoramic live broadcast method and device, computer equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110468970.4A CN113194326A (en) 2021-04-28 2021-04-28 Panoramic live broadcast method and device, computer equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113194326A true CN113194326A (en) 2021-07-30

Family

ID=76980155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110468970.4A Pending CN113194326A (en) 2021-04-28 2021-04-28 Panoramic live broadcast method and device, computer equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113194326A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017084281A1 (en) * 2015-11-18 2017-05-26 乐视控股(北京)有限公司 Method and device for displaying panoramic video
WO2017088491A1 (en) * 2015-11-23 2017-06-01 乐视控股(北京)有限公司 Video playing method and device
CN106101503A (en) * 2016-07-18 2016-11-09 优势拓展(北京)科技有限公司 Real time panoramic Living Network video camera and system and method
CN106101741A (en) * 2016-07-26 2016-11-09 武汉斗鱼网络科技有限公司 Internet video live broadcasting platform is watched the method and system of panoramic video
CN111754614A (en) * 2020-06-30 2020-10-09 平安国际智慧城市科技股份有限公司 Video rendering method and device based on VR (virtual reality), electronic equipment and storage medium
CN112533002A (en) * 2020-11-17 2021-03-19 南京邮电大学 Dynamic image fusion method and system for VR panoramic live broadcast

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114745597A (en) * 2022-02-11 2022-07-12 北京优酷科技有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium
CN114745597B (en) * 2022-02-11 2024-06-07 北京优酷科技有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium

Similar Documents

Publication Publication Date Title
US11087549B2 (en) Methods and apparatuses for dynamic navigable 360 degree environments
CN106789991B (en) Multi-person interactive network live broadcast method and system based on virtual scene
JP6410918B2 (en) System and method for use in playback of panoramic video content
CN112585978B (en) Generating a composite video stream for display in VR
CN106101741B (en) Method and system for watching panoramic video on network video live broadcast platform
US10735826B2 (en) Free dimension format and codec
CA3018600C (en) Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
CN113243112B (en) Streaming volumetric video and non-volumetric video
WO2019162567A1 (en) Encoding and decoding of volumetric video
CN110933461B (en) Image processing method, device, system, network equipment, terminal and storage medium
WO2022022348A1 (en) Video compression method and apparatus, video decompression method and apparatus, electronic device, and storage medium
CN110730340B (en) Virtual audience display method, system and storage medium based on lens transformation
CN113194326A (en) Panoramic live broadcast method and device, computer equipment and computer readable storage medium
US20230328329A1 (en) User-chosen, object guided region of interest (roi) enabled digital video
CN112423108B (en) Method and device for processing code stream, first terminal, second terminal and storage medium
US20210195300A1 (en) Selection of animated viewing angle in an immersive virtual environment
TWI855158B (en) Live broadcasting system for real time three-dimensional image display
TW202428024A (en) 3d imaging streaming method and electronic device and server using the same
TW202213990A (en) Live broadcasting system for real time three-dimensional image display
CN113891099A (en) Transverse and longitudinal control device for three-dimensional live broadcast image
CN113891101A (en) Live broadcast method for real-time three-dimensional image display
CN113676731A (en) Method for compressing VR video data
CN113891100A (en) Live broadcast system for real-time three-dimensional image display
Ziegler MPEG Z/Alpha and high-resolution MPEG

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210730