WO2012100537A1 - Auxiliary video supplemental information carrying method, processing method, apparatus and system - Google Patents

Auxiliary video supplemental information carrying method, processing method, apparatus and system

Info

Publication number
WO2012100537A1
WO2012100537A1 (PCT/CN2011/079233)
Authority
WO
WIPO (PCT)
Prior art keywords
video
auxiliary
auxiliary video
supplemental information
bitstream
Prior art date
Application number
PCT/CN2011/079233
Other languages
English (en)
French (fr)
Inventor
惠宇
张园园
石腾
张楚雄
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP11857363.3A priority Critical patent/EP2661090A4/en
Publication of WO2012100537A1 publication Critical patent/WO2012100537A1/zh
Priority to US13/953,326 priority patent/US20130314498A1/en

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N 21/23614 Multiplexing of additional data and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N 21/2362 Generation or processing of Service Information [SI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N 21/2389 Multiplex stream processing, e.g. multiplex stream encrypting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N 21/2389 Multiplex stream processing, e.g. multiplex stream encrypting
    • H04N 21/23892 Multiplex stream processing, e.g. multiplex stream encrypting involving embedding information at multiplex stream level, e.g. embedding a watermark at packet level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N 21/4343 Extraction or processing of packetized elementary streams [PES]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N 21/4345 Extraction or processing of SI, e.g. extracting service information from an MPEG stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 Monomedia components thereof
    • H04N 21/816 Monomedia components thereof involving special video data, e.g. 3D video

Definitions

  • The present invention relates to the field of video technologies, and in particular to an auxiliary video supplemental information carrying method, processing method, apparatus, and system.
  • Two-dimensional video conveys only planar information about an object: the user perceives only its height, width, color, texture, and so on. Three-dimensional video can additionally express information such as an object's depth, so the user can also perceive its concavity and convexity, its distance, and the like.
  • 3D video can use different data formats; two-dimensional video plus auxiliary video is a common 3D format.
  • The two-dimensional-plus-auxiliary-video format has the advantages of bandwidth saving, backward compatibility, and depth-of-field adjustment; in particular, during transmission the bandwidth increases by only 10-20% relative to a single channel of video, so the format can be widely applied in a variety of bandwidth-limited environments.
  • The data representation includes: the two-dimensional video, its auxiliary video, and auxiliary video supplemental information (AVSI).
  • In the prior art, the two-dimensional video and the auxiliary video are encoded to generate video bitstreams, which are distributed to different transport systems and media media according to the distribution interface of the video bitstream, and the auxiliary video supplemental information is carried by a new descriptor at the TS transport layer. Because carrying the auxiliary video supplemental information requires adding a new bearer structure in the transport layer or the media medium, the specific implementation differs for each transport system and media medium, which increases configuration cost and adaptation difficulty.
  • An embodiment of the present invention provides a carrying method, a processing method, an apparatus, and a system for auxiliary video supplemental information, which provide a general-purpose interface for content distribution of media content that includes an auxiliary video, the primary video corresponding to the auxiliary video, and auxiliary video supplemental information.
  • An embodiment of the present invention provides a method for carrying auxiliary video supplemental information, where the method includes: carrying the auxiliary video supplemental information in a video bitstream; and distributing the video bitstream to a transport network to generate a media stream, or distributing it to a media medium.
  • An embodiment of the present invention further provides a method for processing auxiliary video supplemental information, the method comprising: acquiring a video bitstream, where the video bitstream includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; decoding the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and performing synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
  • the embodiment of the present invention further provides a media content server, where the server includes: a video bitstream generating unit, configured to generate a video bitstream of the media content, where the video bitstream of the media content carries the Auxiliary video supplemental information; a video bitstream distribution unit, configured to distribute the video bitstream generated by the video bitstream generating unit to a transport network to generate a media stream or distribute to a media medium.
  • An embodiment of the present invention further provides a media content display terminal, where the terminal includes: an acquiring unit, configured to acquire a video bitstream, where the video bitstream includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; a decoding unit, configured to decode the video bitstream acquired by the acquiring unit to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and a processing unit, configured to perform synthesis calculation and display on the auxiliary video, the primary video, and the auxiliary video supplemental information obtained by the decoding unit.
  • An embodiment of the present invention further provides a video playing system, where the system includes: a server, configured to generate a video bitstream of media content, carry auxiliary video supplemental information in the video bitstream, and distribute the video bitstream to a transport network to generate a media stream or distribute it to a media medium; and a terminal, configured to acquire the video bitstream generated by the server, where the video bitstream includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information, decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information, and perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
  • When encoding the media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information, the scheme of this embodiment may encode the primary video, the auxiliary video, and the auxiliary video supplemental information together to generate a video bitstream.
  • The video bitstream and the transport physical interface are then used to distribute the media content to different multimedia systems, so the auxiliary video supplemental information can be carried directly in the video bitstream for transmission, without adding a new bearer structure in the operator network or the media medium for the auxiliary video supplemental information, which reduces the cost and adaptation difficulty of content distribution. The scheme also has good network affinity.
  • FIG. 1 is a flowchart of a method for carrying auxiliary video supplemental information according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for processing auxiliary video supplementary information according to an embodiment of the present invention
  • FIG. 2a is a schematic diagram of a connection relationship of a system according to an embodiment of the present invention.
  • FIG. 3 is a functional block diagram of a server 10 according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a video bitstream generating unit 301 of the server 10 according to an embodiment of the present invention
  • FIG. 5 is a second functional block diagram of the video bitstream generating unit 301 of the server 10 according to the embodiment of the present invention.
  • FIG. 6 is a functional block diagram of a terminal 20 according to an embodiment of the present invention.
  • FIG. 7 is a detailed functional block diagram of the decoding unit 602 of the terminal 20 according to the embodiment of the present invention
  • FIG. 8 is a second functional block diagram of the decoding unit 602 of the terminal 20 according to the embodiment of the present invention.
  • Figure 1 is a flow chart of the method. As shown in Figure 1, the method includes:
  • The auxiliary video supplemental information in this embodiment is information used for performing synthesis calculation with the auxiliary video, including but not limited to the following.
  • Auxiliary video type: different auxiliary video types correspond to their respective supplemental information types.
  • For example, when the auxiliary video is a depth map, the corresponding supplemental information type is 1;
  • Spatial correspondence between the auxiliary video and the primary video corresponding to the auxiliary video: this describes the correspondence between the sampling points of the two videos when their sampling frequencies differ.
  • By default, one pixel in the primary video corresponds to one pixel of the auxiliary video; to compress the auxiliary video appropriately and satisfy low-bit-rate transmission, the auxiliary video may use subsampling.
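  • As an illustration (not from the patent itself), the kind of mapping such spatial-correspondence parameters describe can be sketched as follows, assuming a simple integer subsampling factor in each direction; the function name and the factor values are hypothetical.

```python
def aux_to_main(x_aux: int, y_aux: int, hor_factor: int = 2, ver_factor: int = 2):
    """Map an auxiliary-video sample (e.g. from a subsampled depth map)
    to the top-left pixel of the primary-video block it describes.
    The factors stand in for the spatial-correspondence parameters
    that the supplemental information would carry."""
    return x_aux * hor_factor, y_aux * ver_factor
```

Under this sketch, a 2x2-subsampled auxiliary sample at (3, 5) describes the primary-video block whose top-left pixel is (6, 10).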
  • There are many types of auxiliary video, with different types serving different functions.
  • When the auxiliary video is a depth map or a disparity map, it can be applied to three-dimensional content display; the auxiliary video may also carry, for example, transparency information describing the primary video.
  • the definition of auxiliary video supplemental information also varies with the type of auxiliary video.
  • S101 may include: performing video coding on the auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream, and performing video coding on the primary video corresponding to the auxiliary video to generate a primary video bitstream.
  • When the auxiliary video and the auxiliary video supplemental information are video-encoded using H.264, a NAL (Network Abstraction Layer) unit in the auxiliary video bitstream may be used to carry the auxiliary video supplemental information.
  • An SEI message in an SEI (Supplemental Enhancement Information) NAL unit of the auxiliary video bitstream may also be used to carry the auxiliary video supplemental information.
  • The auxiliary video supplemental information may also be carried in a user data structure in the auxiliary video bitstream.
  • Alternatively, S101 may include: performing joint video coding on the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video to generate a single video bitstream.
  • In this case too, a NAL unit may be used to carry the auxiliary video supplemental information, or the SEI message in an SEI NAL unit may carry the auxiliary video supplemental information.
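  • To make the SEI option concrete, here is a minimal sketch (not the patent's implementation) of assembling an H.264 SEI NAL unit wrapping one SEI message. The payload-type value 46 follows the reserved-value example given later in this document; the ff_byte escaping of payloadType/payloadSize follows the H.264 SEI syntax, and emulation-prevention bytes are omitted for brevity.

```python
def sei_nal_unit(payload_type: int, payload: bytes) -> bytes:
    """Build a bare H.264 SEI NAL unit containing one SEI message.
    NAL header: forbidden_zero_bit=0, nal_ref_idc=0, nal_unit_type=6 (SEI).
    Note: emulation-prevention (0x03) insertion is omitted in this sketch."""
    out = bytearray([0x06])  # NAL header byte for an SEI NAL unit
    # payloadType and payloadSize are coded as runs of 0xFF plus a final byte
    for val in (payload_type, len(payload)):
        while val >= 255:
            out.append(0xFF)
            val -= 255
        out.append(val)
    out += payload
    out.append(0x80)  # rbsp_trailing_bits: stop bit plus alignment
    return bytes(out)
```

For a hypothetical AVSI payload with payload type 46, the unit starts with the SEI NAL header byte, then 46, then the payload length.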
  • The method of this embodiment carries the auxiliary video supplemental information directly in the video bitstream for transmission: the media content, comprising the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information, is encoded to generate a video bitstream, and the video bitstream and the transport physical interface are used to distribute the media content directly to different multimedia systems. This provides a common interface for content distribution: the same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure on the operator network or the media medium for the auxiliary video supplemental information, which reduces the cost and difficulty of content distribution.
  • The solution has good network affinity and adapts to transmission over various transport networks and to storage on a media medium.
  • This embodiment provides a specific auxiliary video supplementary information carrying method.
  • This embodiment uses H.264 to carry the media content.
  • The NAL unit of H.264 specifies the format of the video data and serves as a general interface between the video bitstream and the transport network or the media medium. In this embodiment, a NAL unit is added to carry the auxiliary video supplemental information in the video bitstream.
  • The method in this embodiment includes: performing video coding on the auxiliary video included in the media content, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information to generate a video bitstream, where the video bitstream includes a newly added NAL unit for carrying the auxiliary video supplemental information; and distributing the video bitstream to a transport network or a media medium.
  • The terminal acquires the video bitstream through the transport network or the media medium, obtains the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information from the video bitstream, and performs synthesis calculation and display.
  • the embodiment can be further subdivided into the following two cases.
  • In the first case, the auxiliary video supplemental information is carried in the auxiliary video bitstream.
  • the video bitstream output by the h.264 encoder includes a series of Nal units that provide a common interface between the codec and the transport network or media medium.
  • H.264 defines various types of Nal units, which can be used to carry video frames, and can also carry information related to video frame encoding/decoding/display.
  • Table 1 shows some of the Nal units contained in an h.264 video bitstream and their ordering.
  • Access unit delimiter NAL unit | SPS NAL unit | SEI NAL unit | PPS NAL unit | Slice NAL units (primary coded picture) | Slice NAL units (redundant coded picture)
  • the contents of the newly added Nal unit in this embodiment are as shown in Table 2.
  • The "MPEG C Part-3" standard defines auxiliary video supplemental information, with the defined structure "si_rbsp"; that supplemental information structure is used here as the implementation example.
  • The video frame is carried by NAL units as a primary coded picture.
  • Auxiliary video supplemental information is transmitted with at least one IDR (Instantaneous Decoding Refresh) image or RAP (Random Access Point).
  • In the first case, a NAL unit is added to carry the auxiliary video supplemental information in the auxiliary video bitstream. After receiving the auxiliary video bitstream containing the auxiliary video and the auxiliary video supplemental information, the terminal performs synthesis calculation between the auxiliary video supplemental information and the primary coded picture in the auxiliary video bitstream.
  • In the second case, a NAL unit is added to carry the auxiliary video supplemental information in the jointly coded video bitstream, and the receiving terminal synthesizes the auxiliary video supplemental information with the auxiliary coded picture in the video bitstream.
  • the format definition of the newly added Nal unit is shown in Table 4.
  • The specific "nal_unit_type" can use a reserved value according to the definition of the H.264 specification.
  • In one case the auxiliary video supplemental information is to be synthesized with the primary coded picture in the video stream; in the other it is to be synthesized with the auxiliary coded picture, and the terminal must distinguish the two cases. If the nal_unit_type values differ between the two cases, the terminal can decide from the value of the nal_unit_type that carries the auxiliary video supplemental information; if the nal_unit_type takes the same value in both, the terminal can decide according to whether the video stream carries auxiliary coded pictures.
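  • The terminal-side distinction can be sketched as follows (illustrative, not the patent's implementation): the one-byte H.264 NAL unit header is parsed into its three fields, and a reserved nal_unit_type value, hypothetically 16 here, is treated as the carrier of the auxiliary video supplemental information.

```python
AVSI_NAL_TYPE = 16  # hypothetical: one of the values reserved by H.264

def parse_nal_header(first_byte: int):
    """Split the one-byte H.264 NAL unit header into its three fields:
    forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits)."""
    forbidden_zero_bit = first_byte >> 7
    nal_ref_idc = (first_byte >> 5) & 0x03
    nal_unit_type = first_byte & 0x1F
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type

def carries_avsi(first_byte: int) -> bool:
    """True if this NAL unit uses the (hypothetical) AVSI-carrying type."""
    return parse_nal_header(first_byte)[2] == AVSI_NAL_TYPE
```

For instance, the standard SPS header byte 0x67 parses to nal_unit_type 7, while a header byte whose low five bits equal 16 would be routed to the AVSI path.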
  • The method of this embodiment carries the auxiliary video supplemental information in a newly added NAL unit of the video bitstream, and thereby provides a common interface for content distribution of the media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure on the operator network or the media medium for the auxiliary video supplemental information, which reduces the cost and difficulty of content distribution.
  • The scheme has good network affinity and adapts to transmission over various transport networks and to storage on a media medium.
  • This embodiment still uses H.264 to carry the media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information.
  • The method of this embodiment defines a new supplemental enhancement information (SEI) message to carry the auxiliary video supplemental information.
  • the SEI message plays a supporting role in decoding, display or other processes.
  • an SEI Nal unit may include one or more SEI messages. Each SEI message is distinguished by a different payload type, and the SEI message is encapsulated in the Nal unit for transmission as part of the video bitstream.
  • In the first case, the primary video and the auxiliary video are encoded as two separate H.264 video bitstreams, and the auxiliary video supplemental information is carried in the auxiliary video bitstream.
  • This embodiment defines a new SEI message carrying auxiliary video supplemental information.
  • Table 5 is a type of SEI message used to carry auxiliary video supplemental information defined in this embodiment, where the payload type may take a type value reserved by the SEI message, such as 46.
  • Table 6 is a specific definition of the new SEI message structure in this embodiment.
  • Here the auxiliary video supplemental information is defined taking the auxiliary video as a depth map or a disparity map as an example, but the auxiliary video may be of various types and is not limited thereto.
  • the “generic_params” in Table 6 describes the spatial correspondence between the auxiliary video sampling points and the main video sampling points, as defined in Table 7.
  • the "depth_params” in Table 6 is used to synthesize with the depth map to calculate the parallax, which is defined as shown in Table 8.
  • the "Parallax_params” in Table 6 is used to convert the disparity map (reference parallax at the time of recording production), and calculate the true parallax at the time of viewing, as defined in Table 9.
  • the "reserved-si-message” reservation in Table 6 extends the definition of additional types of auxiliary video supplemental information.
  • is_avsi: u(1), default FALSE; flag indicating auxiliary video supplemental information
  • generic_params(): the spatial correspondence between the auxiliary video sample points and the primary video sample points when the auxiliary video is subsampled
  • depth_params(): parameters used in synthesis calculation with a depth map
  • parallax_params(): parameters used in calculations with a disparity map
  • the individual fields correspond to the sampling points of the entire main video frame.
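  • As a hedged sketch of the kind of synthesis calculation that depth_params enable (illustrative only; the exact field semantics are those of Table 8, which is not reproduced here): an 8-bit depth sample is mapped linearly to a screen parallax, with knear and kfar giving the depth range in front of and behind the screen as percentages of the screen width, following the general convention of MPEG-C Part 3.

```python
def depth_to_parallax(v: float, knear: float, kfar: float, screen_width_px: float) -> float:
    """Map an 8-bit depth sample v (0..255) to a screen parallax in pixels.
    knear/kfar: depth range in front of / behind the screen, as percentages
    of the screen width (illustrative use of the MPEG-C Part 3 convention).
    v = 255 is nearest (+knear), v = 0 is farthest (-kfar)."""
    frac = v / 255.0
    # linear interpolation from -kfar (farthest) to +knear (nearest)
    return (frac * (knear + kfar) - kfar) * screen_width_px / 100.0
```

With knear = kfar = 2% on a 1000-pixel-wide display, the nearest depth value yields +20 px of parallax, the farthest -20 px, and the mid value sits on the screen plane.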
  • the primary video, the auxiliary video, and the auxiliary video supplemental information are jointly video encoded to generate a video bitstream.
  • The SEI message in the SEI NAL unit of the video bitstream is used to carry the auxiliary video supplemental information; primary coded picture units carry the primary video frames, and auxiliary coded picture units carry the auxiliary video frames. The SEI message may be the same as in the first case, although the value of the payload type may also differ from the first case.
  • As before, the auxiliary video supplemental information is to be synthesized with the primary coded picture in the video stream in one case, and with the auxiliary coded picture in the other, and the terminal must distinguish them: it can determine, according to the payload type value of the SEI message, the type of video frame with which the auxiliary video supplemental information is to be synthesized. The method of this embodiment carries the auxiliary video supplemental information in an SEI message added in the SEI NAL unit, so that it is carried in the video bitstream, and thereby provides a common interface for content distribution of the media content including the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure on the operator network or the media medium for the auxiliary video supplemental information, which reduces the cost and difficulty of content distribution.
  • The scheme has good network affinity.
  • This embodiment uses MPEG-2 to carry the media content including the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information.
  • The specific method is: the auxiliary video and the primary video corresponding to the auxiliary video are encoded to generate two MPEG-2 video bitstreams, that is, a primary video bitstream and an auxiliary video bitstream; correspondingly, the auxiliary video bitstream carries the auxiliary video supplemental information.
  • the auxiliary video supplemental information can be carried by extending the user data structure.
  • The MPEG-2 video bitstream is divided into six levels: a video sequence layer, a group of pictures (GOP) layer, a picture layer, a slice layer, a macroblock layer, and a block layer. It starts from a sequence header, optionally followed by a group-of-pictures header, followed by one or more encoded frames.
  • User data e.g., user_data
  • auxiliary display carrying information such as subtitles, display parameters, and may be located at different levels of the video bitstream.
  • the difference in i in extension_and_user_data(i) means that user_data is located at a different location in the video bitstream. If i corresponding to extension_and_user_data after the video sequence layer is 0, i corresponding to extension_and_user_data after the image layer is 2, as shown in Table 10. Table 10:
  • This embodiment carries the supplemental information by extending the user data structure.
  • The user_data structure is as shown in Table 11, where user_data_identifier is a global identifier used to distinguish different user_structure definitions; for example, ATSC registered "0x47413934" to identify ATSC_user_data, thereby extending user_data for multiple uses. To avoid conflict with user data extended by other systems, user_data_identifier can use the MPEG registered value "0x4D504547".
  • Table 12 defines an example of user_structure, in which user_data_type_code is used to distinguish the different extensions of user_data under the mpeg system.
  • Table 13 defines different user_data_type_code values to distinguish the extended user data types; when user_data_type_code indicates the supplemental-information type, the corresponding extended user data is the supplemental information.
  • Table 14 specifically defines the structure of the auxiliary video supplemental information; this embodiment takes the supplemental information structure "Si_rbsp" defined in "MPEG C Part 3" as an example structure of the supplemental information.
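The identifier/type checks of Tables 11-13 can be sketched as follows. This is an illustrative parse only: the MPEG identifier "0x4D504547" and the ATSC identifier "0x47413934" are the values cited in the description, but the user_data_type_code value chosen for the supplemental-information type is an assumption invented for this sketch.

```python
MPEG_USER_DATA_IDENTIFIER = 0x4D504547  # ASCII "MPEG", registered value cited above
SI_TYPE_CODE = 0x03  # hypothetical user_data_type_code for supplemental information

def parse_user_data(payload: bytes):
    """Return the supplemental-information bytes (si_rbsp) from a user_data()
    payload structured per Tables 11-13, or None if it belongs to another
    system or another user_data extension."""
    if len(payload) < 5:
        return None
    identifier = int.from_bytes(payload[:4], "big")
    if identifier != MPEG_USER_DATA_IDENTIFIER:
        return None  # user data registered by another system (e.g. ATSC "0x47413934")
    if payload[4] != SI_TYPE_CODE:
        return None  # some other extension of user_data under the mpeg system
    return payload[5:]  # the supplemental information structure, e.g. Si_rbsp
```

A decoder would call this on every user_data() payload it encounters at the relevant bitstream level and hand any non-None result to the synthesis step.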
  • Carrying the auxiliary video supplemental information in this way provides a general-purpose content-distribution interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information.
  • The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
  • This embodiment provides an auxiliary video supplemental information processing method corresponding to the bearer methods of the foregoing embodiments. FIG. 2 is a flow chart of the method. As shown in FIG. 2, the method includes:
  • S201: Acquire a video bitstream, the video bitstream including an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information;
  • S202: Decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information;
  • S203: Perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
  • Optionally, the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream; in this case S202 may include: decoding the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information, and decoding the primary video bitstream to obtain the primary video.
  • Optionally, the acquired video bitstream is a single video bitstream; in this case S202 may include: decoding that one video bitstream to obtain the primary video, the auxiliary video, and the auxiliary video supplemental information.
  • When h.264 is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may specifically include: parsing the auxiliary video supplemental information from the Nal unit that carries it in the auxiliary video bitstream, and parsing the auxiliary video from the Nal units that carry it in the auxiliary video bitstream. S203 may then include: synthesizing the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
  • When h.264 is used for video decoding and the acquired video bitstream is a single video bitstream, S202 may specifically include: parsing the auxiliary video supplemental information from the Nal unit that carries it in the one video bitstream; the auxiliary video may likewise be parsed from the Nal units carrying the auxiliary video, and the primary video from the Nal units carrying the primary video, in that bitstream. S203 may then include: synthesizing the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
  • When h.264 is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may alternatively include: decoding the primary video bitstream to obtain the primary video; parsing the auxiliary video from the Nal units that carry it in the auxiliary video bitstream; and parsing the auxiliary video supplemental information from the SEI message of the SEI Nal unit that carries it in the auxiliary video bitstream. S203 may then include: synthesizing the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
  • When h.264 is used for video decoding and the acquired video bitstream is a single video bitstream, S202 may alternatively include: parsing the auxiliary video supplemental information from the SEI message of the SEI Nal unit that carries it in the one video bitstream; the auxiliary video and the primary video may likewise be parsed from the Nal units carrying them in that bitstream. S203 may then include: synthesizing the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
  • When the mpeg2 standard is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may include: decoding the primary video bitstream to obtain the primary video, and decoding the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information, where the auxiliary video supplemental information may be parsed from the user data structure that carries it in the auxiliary video bitstream. S203 may then include: synthesizing the auxiliary video supplemental information with the video frames in the auxiliary video bitstream.
  • The method of this embodiment provides a general-purpose content-acquisition interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information, and has good network affinity.
  • Embodiment 6:
  • This embodiment provides a video playing system to implement the bearer and processing methods for the auxiliary video supplemental information described in the foregoing embodiments.
  • FIG. 2a is a connection diagram of the system.
  • As shown in FIG. 2a, the system includes: a server 10, configured to generate a video bitstream of media content, carry auxiliary video supplemental information in the video bitstream, and distribute the video bitstream to a transport network to generate a media stream or distribute it onto a storage medium; and a terminal 20, configured to acquire the video bitstream generated by the server 10, the video bitstream including an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
  • The auxiliary video supplemental information of this embodiment is information used for synthesis calculation with the auxiliary video, and includes but is not limited to one or more of the following types of information: the auxiliary video type; the spatial correspondence between the auxiliary video and the video corresponding to the auxiliary video; and the specific calculation parameters corresponding to different types of auxiliary video.
  • FIG. 3 is a functional block diagram of the server 10.
  • As shown in FIG. 3, the server 10 includes: a video bitstream generating unit 301, configured to generate a video bitstream of media content, the video bitstream of the media content carrying the auxiliary video supplemental information; and a video bitstream distributing unit 302, configured to distribute the video bitstream generated by the video bitstream generating unit 301 to a transport network to generate a media stream, or distribute it onto a storage medium.
  • FIG. 4 is a first detailed functional block diagram of the video bitstream generating unit 301.
  • As shown in FIG. 4, the video bitstream generating unit 301 further includes: a first encoding unit 401, configured to video-encode the auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream; and a second encoding unit 402, configured to video-encode the primary video corresponding to the auxiliary video to generate a primary video bitstream.
  • FIG. 5 is a second detailed functional block diagram of the video bitstream generating unit 301.
  • As shown in FIG. 5, the video bitstream generating unit 301 may instead include: a third encoding unit 501, configured to jointly video-encode the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video, to generate a single video bitstream.
  • The first encoding unit 401 may specifically be configured to perform the video encoding using h.264, and to use a network abstraction layer Nal unit to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.
  • The third encoding unit 501 may specifically be configured to perform the video encoding using h.264, and to use a Nal unit to carry the auxiliary video supplemental information when jointly video-encoding the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video.
  • The first encoding unit 401 may specifically be configured to perform the video encoding using h.264, and to use an SEI message in a supplemental enhancement information SEI Nal unit to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.
  • The third encoding unit 501 may specifically be configured to perform the video encoding using h.264, and to use an SEI message in an SEI Nal unit to carry the auxiliary video supplemental information when jointly video-encoding the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video.
  • The first encoding unit 401 may specifically be configured to perform the video encoding using the mpeg2 standard, and to use a user data structure to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.
  • FIG. 6 is a functional block diagram of the terminal 20.
  • As shown in FIG. 6, the terminal 20 includes: an obtaining unit 601, configured to acquire a video bitstream that includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; a decoding unit 602, configured to decode the video bitstream acquired by the obtaining unit 601 to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and a processing unit 603, configured to perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information obtained through decoding by the decoding unit 602.
  • FIG. 7 is a first detailed functional block diagram of the decoding unit 602.
  • As shown in FIG. 7, when the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, the decoding unit 602 of this embodiment includes: a first decoding unit 701, configured to decode the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information; and a second decoding unit 702, configured to decode the primary video bitstream to obtain the primary video.
  • FIG. 8 is a second detailed functional block diagram of the decoding unit 602.
  • As shown in FIG. 8, when the acquired video bitstream is a single video bitstream, the decoding unit 602 of this embodiment may instead include: a third decoding unit 801, configured to decode the one video bitstream to obtain the primary video, the auxiliary video, and the auxiliary video supplemental information.
  • Specifically, when the server 10 performs video encoding using h.264 and the primary video and the auxiliary video are video-encoded independently, the terminal 20 also uses h.264 for video decoding. In this case, the first decoding unit 701 is configured to parse the auxiliary video supplemental information from the Nal unit that carries it in the auxiliary video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
  • Specifically, when the server 10 uses h.264 for video encoding and the primary video and the auxiliary video are jointly video-encoded to generate one video bitstream, the terminal 20 also uses h.264 for video decoding. In this case, the third decoding unit 801 is configured to parse the auxiliary video supplemental information from the Nal unit that carries it in the one video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
  • Specifically, when the server 10 performs video encoding using h.264 and the primary video and the auxiliary video are video-encoded independently, the terminal 20 also performs video decoding using h.264. In this case, the first decoding unit 701 may further be configured to parse the auxiliary video supplemental information from the SEI message of the SEI Nal unit that carries it in the auxiliary video bitstream, and the processing unit 603 is further configured to synthesize the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
  • Specifically, when the server 10 uses h.264 for video encoding and the primary video and the auxiliary video are jointly video-encoded to generate one video bitstream, the terminal 20 also uses h.264 for video decoding. In this case, the third decoding unit 801 is configured to decode the one video bitstream to obtain the primary video, the auxiliary video, and the auxiliary video supplemental information; specifically, the auxiliary video supplemental information may be parsed from the SEI message of the SEI Nal unit that carries it in the one video bitstream. The processing unit 603 is configured to synthesize the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
  • Specifically, when the server 10 performs video encoding using the mpeg2 standard, the terminal 20 also performs video decoding using the mpeg2 standard. In this case, the first decoding unit 701 is configured to parse the auxiliary video supplemental information from the user data structure that carries it in the auxiliary video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the video frames in the auxiliary video bitstream.
  • (1) The server side produces the three-dimensional data content.
  • The data representation of three-dimensional content in the 2D-plus-auxiliary-video format comprises the two-dimensional video, its auxiliary video, and the auxiliary video supplemental information. For example, a depth map can be regarded as an auxiliary video of the two-dimensional video: one pixel in the depth map represents one depth value, each depth value describes the depth of one pixel of the two-dimensional video, and it is represented by an N-bit value, with N usually taken as 8, so the depth map can be processed as a monochrome video.
  • In a three-dimensional system, because parallax is inversely proportional to depth, a parallax map is also a kind of auxiliary video of the two-dimensional video.
  • There are many types of auxiliary video serving different purposes; for example, an auxiliary video can describe the transparency information of the primary video for two-dimensional display, so the auxiliary video is not limited to the depth map, parallax map, and transparency map mentioned here. The definition of the auxiliary video supplemental information varies with the type of auxiliary video.
  • (2) The terminal acquires the three-dimensional content represented in the 2D-plus-auxiliary-video format from the received media stream or from the storage medium.
  • To synthesize the three-dimensional content based on 2D plus auxiliary video, the terminal needs to calculate left-eye and right-eye video frames with parallax from the two-dimensional video and the auxiliary video.
  • First, the actual display parallax is calculated from the auxiliary video and the auxiliary video supplemental information (for example, when the auxiliary video is a depth map, the actual display parallax of each pixel is calculated from its depth value). The parallax directly reflects the user's perception of depth: with positive parallax the perceived depth is behind the screen, with negative parallax it is in front of the screen, and with zero parallax it is on the screen.
  • Next, the left-eye and right-eye video frames with parallax are calculated from the two-dimensional video and the actual display parallax of each pixel.
  • When the terminal displays, the left and right views are shown on the screen alternately or separately; special 3D glasses or a special display system let the left eye see only the left view and the right eye see only the right view, giving the user depth perception of the video content.
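The two calculation steps above can be sketched as follows. This is only an illustration under assumed parameters: the patent does not fix a particular depth-to-parallax formula, so a simple linear mapping of 8-bit depth values to an assumed parallax range is used, and view synthesis is reduced to a naive horizontal pixel shift.

```python
def depth_to_parallax(depth, parallax_near=-8, parallax_far=8):
    """Linearly map an 8-bit depth value (255 = nearest) to a display parallax
    in pixels. Negative parallax is perceived in front of the screen, positive
    behind it, zero on the screen. The pixel range here is an assumption."""
    t = depth / 255.0
    return parallax_far + t * (parallax_near - parallax_far)

def render_views(row, depth_row):
    """Shift each 2D-video pixel of one scanline by +/- half its parallax to
    obtain left- and right-eye scanlines (a naive synthesis; real renderers
    also handle occlusions and hole filling)."""
    width = len(row)
    left, right = [0] * width, [0] * width
    for x, (pix, d) in enumerate(zip(row, depth_row)):
        shift = int(round(depth_to_parallax(d) / 2))
        if 0 <= x + shift < width:
            left[x + shift] = pix
        if 0 <= x - shift < width:
            right[x - shift] = pix
    return left, right
```

A mid-range depth value maps to near-zero parallax, so such pixels appear on the screen plane; extreme depth values push pixels in front of or behind it.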
  • The system of this embodiment carries the auxiliary video supplemental information directly in the video bitstream for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through a common interface, without adding a new bearer structure for the auxiliary video supplemental information on the operating network or the storage medium, which reduces the cost and difficulty of content distribution.
  • The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
  • The storage medium mentioned above may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Description

Bearer Method, Processing Method, Apparatus and System for Auxiliary Video Supplemental Information

This application claims priority to Chinese Patent Application No. 201110031704.1, filed with the Chinese Patent Office on January 28, 2011 and entitled "Bearer method, processing method, apparatus and system for auxiliary video supplemental information", which is incorporated herein by reference in its entirety.
Technical Field

The present invention relates to the field of video technologies, and in particular to a bearer method, a processing method, an apparatus and a system for auxiliary video supplemental information.
Background

Two-dimensional video can convey only planar information about objects; the user perceives only their height, width, color, texture, and so on, whereas three-dimensional video can also express information such as the depth of objects, so the user can perceive their relief, relative distance, and the like. 3D video can use different data formats; 2D plus auxiliary video (2d plus auxiliary video) is a common 3D format.

The 2D-plus-auxiliary-video format saves bandwidth, is backward compatible, and supports depth adjustment; in particular, compared with one video stream, transmission increases bandwidth by only 10-20%, so it is widely applicable in many bandwidth-constrained environments. Its data representation comprises the two-dimensional video, its auxiliary video, and auxiliary video supplemental information (AVSI). When a three-dimensional display terminal obtains three-dimensional content represented in the 2D-plus-auxiliary-video format, it needs to obtain the two-dimensional video, the auxiliary video, and the auxiliary video supplemental information. The prior art encodes the two-dimensional video and the auxiliary video into video bitstreams, distributes them to different transport systems and storage media through the distribution interfaces of the video bitstream, and carries the auxiliary video supplemental information in a newly added descriptor at the TS transport layer. Because carrying the auxiliary video supplemental information requires adding a new bearer structure in the transport layer or on the storage medium, and the concrete implementation differs for each transport system and storage medium, configuration cost and adaptation difficulty increase.
Summary

The embodiments of the present invention provide a bearer method, a processing method, an apparatus and a system for auxiliary video supplemental information, which provide a general-purpose content-distribution interface for media content that includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information.

In one aspect, an embodiment of the present invention provides a bearer method for auxiliary video supplemental information, the method including: carrying auxiliary video supplemental information in a video bitstream; and distributing the video bitstream to a transport network to generate a media stream, or distributing it onto a storage medium. In another aspect, an embodiment of the present invention further provides a processing method for auxiliary video supplemental information, the method including: acquiring a video bitstream that includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; decoding the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and performing synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.

In yet another aspect, an embodiment of the present invention further provides a media content server, the server including: a video bitstream generating unit, configured to generate a video bitstream of media content, the video bitstream of the media content carrying the auxiliary video supplemental information; and a video bitstream distributing unit, configured to distribute the video bitstream generated by the video bitstream generating unit to a transport network to generate a media stream, or distribute it onto a storage medium.

In still another aspect, an embodiment of the present invention further provides a media content display terminal, the terminal including: an obtaining unit, configured to acquire a video bitstream that includes an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; a decoding unit, configured to decode the video bitstream acquired by the obtaining unit to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and a processing unit, configured to perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information obtained through decoding by the decoding unit.

In a further aspect, an embodiment of the present invention further provides a video playing system, the system including: a server, configured to generate a video bitstream of media content, carry auxiliary video supplemental information in the video bitstream, and distribute the video bitstream to a transport network to generate a media stream or distribute it onto a storage medium; and a terminal, configured to acquire the video bitstream generated by the server, the video bitstream including an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
When encoding media content that includes an auxiliary video, the primary video corresponding to the auxiliary video, and auxiliary video supplemental information, the solutions of these embodiments can encode the primary video, the auxiliary video, and the auxiliary video supplemental information into a video bitstream, and then distribute the media content to different multimedia systems through the interface between the video bitstream and the transport physical layer. The auxiliary video supplemental information is thus carried directly in the video bitstream for transmission, without adding a new bearer structure for it on the operating network or the storage medium, which reduces the cost of content distribution and the difficulty of adaptation. The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
Brief Description of the Drawings

FIG. 1 is a flow chart of a bearer method for auxiliary video supplemental information according to an embodiment of the present invention;

FIG. 2 is a flow chart of a processing method for auxiliary video supplemental information according to an embodiment of the present invention;

FIG. 2a is a schematic diagram of the connections of a system according to an embodiment of the present invention;

FIG. 3 is a functional block diagram of a server 10 according to an embodiment of the present invention;

FIG. 4 is a first detailed functional block diagram of a video bitstream generating unit 301 of the server 10 according to an embodiment of the present invention;

FIG. 5 is a second detailed functional block diagram of the video bitstream generating unit 301 of the server 10 according to an embodiment of the present invention;

FIG. 6 is a functional block diagram of a terminal 20 according to an embodiment of the present invention;

FIG. 7 is a first detailed functional block diagram of a decoding unit 602 of the terminal 20 according to an embodiment of the present invention; and FIG. 8 is a second detailed functional block diagram of the decoding unit 602 of the terminal 20 according to an embodiment of the present invention.
Detailed Description

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings of the embodiments. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1:

This embodiment provides a bearer method for auxiliary video supplemental information. FIG. 1 is a flow chart of the method. As shown in FIG. 1, the method includes:

S101: Carry auxiliary video supplemental information in a video bitstream;

S102: Distribute the video bitstream to a transport network to generate a media stream, or distribute it onto a storage medium.

Optionally, the auxiliary video supplemental information of this embodiment is information used for synthesis calculation with the auxiliary video, and includes but is not limited to one or more combinations of the following information:

1. The auxiliary video type; different auxiliary video types correspond to their own supplemental-information types. For example, when the auxiliary video is a depth map, the corresponding supplemental-information type is 1.

2. The spatial correspondence between the auxiliary video and the primary video corresponding to the auxiliary video. When the primary video and the auxiliary video have different sampling frequencies, this describes the spatial correspondence of the sampling points of the two videos. Normally one pixel of the primary video corresponds to one pixel of the auxiliary video; to compress the auxiliary video appropriately and satisfy low-bit-rate transmission, sub-sampling may be applied to the auxiliary video.

3. The calculation parameters corresponding to different types of auxiliary video.

There are many auxiliary video types, and different types serve different purposes. When the auxiliary video type is a depth map or a parallax map, it can be applied to three-dimensional content display; the auxiliary video type may also describe the transparency information of the primary video, and so on. The definition of the auxiliary video supplemental information also varies with the auxiliary video type.
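The three kinds of information listed above can be pictured as one record carried alongside the auxiliary video. This is only an illustrative grouping; the field names below are invented for the sketch and are not taken from the patent or from any standard.

```python
from dataclasses import dataclass, field

@dataclass
class AuxiliaryVideoSupplementalInfo:
    """Illustrative grouping of the three information kinds listed above."""
    aux_video_type: int            # e.g. 0 = depth map, 1 = parallax map (example codes)
    # spatial correspondence when the auxiliary video is sub-sampled
    subsample_factor_h: int = 1
    subsample_factor_v: int = 1
    position_offset_h: int = 0
    position_offset_v: int = 0
    # type-specific calculation parameters, e.g. a depth or parallax range
    calc_params: dict = field(default_factory=dict)

# a depth-map AVSI record with hypothetical near/far range parameters
avsi = AuxiliaryVideoSupplementalInfo(aux_video_type=0,
                                      calc_params={"z_near": 0.5, "z_far": 10.0})
```

The record travels with the auxiliary video in whatever bearer structure (Nal unit, SEI message, or user data) the later embodiments define.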
Optionally, S101 may include: performing video encoding on the auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream; and performing video encoding on the primary video corresponding to the auxiliary video to generate a primary video bitstream.

Specifically, when h.264 is used to video-encode the auxiliary video and the auxiliary video supplemental information, a Nal (Network Abstraction Layer) unit in the auxiliary video bitstream can be used to carry the auxiliary video supplemental information.

Specifically, when h.264 is used to video-encode the auxiliary video and the auxiliary video supplemental information, an SEI message in an SEI (Supplemental Enhancement Information) Nal unit of the auxiliary video bitstream can also be used to carry the auxiliary video supplemental information.

Specifically, when the mpeg2 (Motion Picture Expert Group) standard is used to video-encode the auxiliary video and the auxiliary video supplemental information, a user data structure in the auxiliary video bitstream can carry the auxiliary video supplemental information. Optionally, S101 may also include: jointly video-encoding the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video to generate a single video bitstream.

Specifically, when h.264 is used for the video encoding and the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video are jointly video-encoded, a Nal unit can carry the auxiliary video supplemental information.

Specifically, when h.264 is used for the video encoding and the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video are jointly video-encoded, an SEI message in an SEI Nal unit can also carry the auxiliary video supplemental information.

These specific implementations are described in detail in the following embodiments. The method of this embodiment carries the auxiliary video supplemental information directly in the video bitstream for transmission: the media content comprising the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information is encoded into a video bitstream, and through the interface between the video bitstream and the transport physical layer the media content can be distributed directly to different multimedia systems, thereby providing a general-purpose content-distribution interface for the media content. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure for the auxiliary video supplemental information on the operating network or the storage medium, which reduces the cost and difficulty of content distribution. The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
Embodiment 2:

This embodiment provides a specific bearer method for auxiliary video supplemental information. This embodiment uses h.264 to carry media content that includes an auxiliary video, the primary video corresponding to the auxiliary video, and auxiliary video supplemental information. The h.264 Nal unit standardizes the format of video data and is the general interface between the video bitstream and the transport network or storage medium. This embodiment adds a new type of Nal unit, which is used to carry the auxiliary video supplemental information in the video bitstream.

Specifically, the method of this embodiment includes: video-encoding the auxiliary video contained in the media content, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information to generate a video bitstream, where the video bitstream contains the newly added Nal unit used to carry the auxiliary video supplemental information; and distributing the video bitstream onto a transport network or a storage medium. In this way, when the terminal obtains the video bitstream through the transport network or the storage medium, it can obtain the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information from the video bitstream, perform synthesis calculation, and display the result. Depending on the encoding modes adopted for the primary video and the auxiliary video, this embodiment can be subdivided into the following two cases.

(1) First case: the auxiliary video and the primary video corresponding to the auxiliary video are video-encoded independently to obtain two h.264 video bitstreams, namely a primary video bitstream and an auxiliary video bitstream, with the auxiliary video supplemental information carried in the auxiliary video bitstream.
The video bitstream output by an h.264 encoder consists of a series of Nal units, providing the general interface between the codec and the transport network or storage medium. h.264 defines multiple types of Nal units, which can carry video frames or carry information related to the encoding/decoding and display of video frames. Table 1 shows some of the Nal units contained in an h.264 video bitstream and their order.

Table 1: Access unit delimiter Nal unit | SPS (sequence parameter set) Nal unit | SEI (supplemental enhancement information) Nal unit | PPS (picture parameter set) Nal unit | Slice Nal unit (primary coded picture) | Slice Nal unit (redundant coded picture)
The content of the Nal unit newly added in this embodiment is shown in Table 2. The "MPEG C Part-3" standard defines the auxiliary video supplemental information with the structure "SI_rbsp"; this embodiment takes the supplemental information structure "Si_rbsp" defined in "MPEG C Part 3" as an example of the supplemental information. In the auxiliary video bitstream, the video frames are carried as primary coded pictures by Nal units. The auxiliary video supplemental information is transmitted at least with each IDR (Instantaneous Decoding Refresh) picture or RAP (Random Access Point). The specific "nal_unit_type" can use a reserved value according to the definition of the h.264 specification.

(Table 2, defining the new Nal unit, appears as an image in the original and is not reproduced here.)

This embodiment adds a new kind of Nal unit for carrying the auxiliary video supplemental information in the auxiliary video bitstream. After receiving the auxiliary video bitstream containing the auxiliary video and the auxiliary video supplemental information, the terminal needs to perform synthesis calculation between the auxiliary video supplemental information and the primary coded pictures in the auxiliary video bitstream.
(2) Second case: using the h.264 "Auxiliary Picture" syntax, the auxiliary video and the primary video corresponding to the auxiliary video are video-encoded to generate one h.264 video bitstream. Table 3 shows some of the Nal units contained in an h.264 video bitstream carrying an Auxiliary Picture and their order. As shown in Table 3, the primary video frames are carried as primary coded pictures by Nal units, and the auxiliary video frames are carried as auxiliary coded pictures by Nal units with "nal_unit_type" equal to 19; according to the definition of h.264, the auxiliary video and the primary video have the same size.

Table 3: Access unit delimiter Nal unit | SPS Nal unit | SEI Nal unit | PPS Nal unit | Slice Nal unit (primary coded picture) | Slice Nal unit (redundant coded picture) | Slice Nal unit (auxiliary coded picture)

This embodiment adds a new kind of Nal unit for carrying the auxiliary video supplemental information in the video bitstream; the receiving terminal needs to synthesize the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream. The format definition of the new Nal unit is shown in Table 4; the specific "nal_unit_type" can use a reserved value according to the definition of the h.264 specification.

(Table 4 appears as an image in the original and is not reproduced here.)
In the first case, the auxiliary video supplemental information needs to be synthesized with the primary coded pictures in the video stream; in the second case, it needs to be synthesized with the auxiliary coded pictures. The terminal can distinguish the two cases in various ways: for example, if the nal_unit_type values differ between the two cases, the terminal can decide according to the nal_unit_type value carrying the auxiliary video supplemental information; if the nal_unit_type values are the same, the terminal can decide according to whether the video stream carries auxiliary coded pictures.

The method of this embodiment carries the auxiliary video supplemental information through the newly added Nal unit, so that the auxiliary video supplemental information is carried in the video bitstream. The method provides a general-purpose content-distribution interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure for the auxiliary video supplemental information on the operating network or storage medium, which reduces the cost and difficulty of content distribution. The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
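The Nal-unit framing that a terminal would undo to locate the new unit can be sketched as follows. This is a generic illustration (Annex B start-code splitting plus a filter on nal_unit_type); the AVSI type value 16 is chosen arbitrarily here from h.264's reserved values and is not defined by the patent.

```python
def split_nal_units(stream: bytes):
    """Split an h.264 Annex B byte stream on 00 00 01 start codes; the extra
    leading zero of a 4-byte start code is stripped as a trailing zero of the
    previous unit (adequate for this sketch)."""
    units = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        j = stream.find(b"\x00\x00\x01", i + 3)
        end = len(stream) if j == -1 else j
        unit = stream[i + 3:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        i = j
    return units

AVSI_NAL_TYPE = 16  # arbitrary reserved nal_unit_type for the new unit in this sketch

def find_avsi_units(stream: bytes):
    """Return payloads of the (hypothetical) AVSI Nal units; nal_unit_type is
    the low 5 bits of the first Nal header byte."""
    return [u[1:] for u in split_nal_units(stream) if u[0] & 0x1F == AVSI_NAL_TYPE]
```

The terminal would then hand each returned payload to the supplemental-information parser before synthesizing it with the primary or auxiliary coded pictures, as the two cases above require.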
Embodiment 3:

This embodiment still uses h.264 to carry media content that includes an auxiliary video, the primary video corresponding to the auxiliary video, and auxiliary video supplemental information. The difference is that the method of this embodiment defines new supplemental enhancement information SEI to carry the auxiliary video supplemental information. SEI messages assist decoding, display, and other processes; as shown in Table 1, one SEI Nal unit may include one or more SEI messages. Each SEI message is distinguished by its payload type (payloadType), and SEI messages are encapsulated in Nal units and transmitted as part of the video bitstream.

In this embodiment, the first case is that the primary video and the auxiliary video form two h.264 video bitstreams, with the auxiliary video supplemental information carried in the auxiliary video bitstream. This embodiment defines a new SEI message to carry the auxiliary video supplemental information.

Table 5 shows an SEI message type defined in this embodiment for carrying the auxiliary video supplemental information, where payloadType can take a type value reserved for SEI messages, such as 46. Table 6 gives a specific definition of the new SEI message structure; here the auxiliary video supplemental information is defined taking a depth map or a parallax map as the example auxiliary video, but the auxiliary video can be of many types, including but not limited to these. "generic_params" in Table 6 describes the spatial correspondence between auxiliary-video sampling points and primary-video sampling points, and is defined in Table 7. "depth_params" in Table 6 is used for synthesis with a depth map to calculate parallax, and is defined in Table 8. "Parallax_params" in Table 6 is used to convert a parallax map (which records the reference parallax at production time) to calculate the real parallax at viewing time, and is defined in Table 9. "reserved_si_message" in Table 6 is reserved for extending the definition of other kinds of auxiliary video supplemental information.
(Table 5 appears as an image in the original and is not reproduced here.)

Table 6:

aux_pic_si( payloadSize ) {  —  descriptor  —  description
    is_avsi = FALSE  —  u(1)  —  flag bit of the auxiliary video supplemental information
    auxpicType  —  u(8)  —  type of the auxiliary video
    if( auxpicType == 0 || auxpicType == 1 ) {
        is_avsi = TRUE
        generic_params( )  —  spatial correspondence between auxiliary-video and primary-video sampling points when the auxiliary video is sub-sampled
    }
    if( auxpicType == 0 )
        depth_params( )  —  contains the parameters for calculation with a depth map
    else if( auxpicType == 1 )
        parallax_params( )  —  contains the parameters for calculation with a parallax map
    else
        reserved_si_message( payloadSize )  —  reserved for defining other types of auxiliary video supplemental information
}
Table 7 (partially reproduced; part of it appears as an image in the original):

generic_params( ) {  —  descriptor  —  description
    ...
    else
        aux_is_interlaced  —  u(1)  —  whether the auxiliary-video sampling points correspond separately to the two fields of the primary video or to the sampling points of the whole primary video frame
    position_offset_h  —  u(8)  —  horizontal position offset between auxiliary-video and primary-video sampling points when the auxiliary video is sub-sampled
    position_offset_v  —  u(8)  —  vertical position offset between auxiliary-video and primary-video sampling points when the auxiliary video is sub-sampled
}

(Tables 8 and 9 appear as images in the original and are not reproduced here.)
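Table 7's offsets feed the mapping from an auxiliary-video sample back to the primary-video sample it describes. The following sketch assumes uniform horizontal/vertical sub-sampling factors and treats the offsets as whole-sample units; this is an illustrative simplification, not the table's normative semantics.

```python
def aux_to_main(x_aux, y_aux, sub_h, sub_v, offset_h=0, offset_v=0):
    """Map auxiliary-video sample (x_aux, y_aux) to primary-video coordinates,
    given sub-sampling factors sub_h, sub_v and Table 7 style position offsets
    (offset units simplified to whole samples for this sketch)."""
    return x_aux * sub_h + offset_h, y_aux * sub_v + offset_v
```

For example, with 2:1 sub-sampling in both directions and no offset, auxiliary sample (3, 4) describes primary-video sample (6, 8).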
The second case of this embodiment is to jointly video-encode the primary video, the auxiliary video, and the auxiliary video supplemental information to generate one video bitstream. In the video bitstream, the SEI message in the supplemental enhancement information SEI Nal unit carries the auxiliary video supplemental information, the primary coded picture units carry the primary video frames, and the auxiliary coded picture units carry the auxiliary video frames. The specific definition of the SEI message can be the same as in the first case, although the payloadType value may differ from the first case.

It should be noted that in the first case the auxiliary video supplemental information needs to be synthesized with the primary coded pictures in the video stream, while in the second case it needs to be synthesized with the auxiliary coded pictures; the terminal can distinguish the two cases in various ways, for example by the payloadType value of the SEI message, to decide which kind of video frame the auxiliary video supplemental information is to be synthesized with. The method of this embodiment carries the auxiliary video supplemental information through the SEI message added in the SEI Nal unit, so that the auxiliary video supplemental information is carried in the auxiliary video bitstream. The method provides a general-purpose content-distribution interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure for the auxiliary video supplemental information on the operating network or storage medium, which reduces the cost and difficulty of content distribution. The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
Embodiment 4:

This embodiment uses mpeg2 to carry media content that includes an auxiliary video, the primary video corresponding to the auxiliary video, and auxiliary video supplemental information. The specific method is: the auxiliary video and the primary video corresponding to the auxiliary video are encoded to generate two mpeg2 video bitstreams, namely a primary video bitstream and an auxiliary video bitstream; correspondingly, the auxiliary video supplemental information is carried in the auxiliary video bitstream. Specifically, the auxiliary video supplemental information can be carried by extending the user data structure.

The mpeg2 video bitstream is divided into six levels: the video sequence layer (Sequence), the group of pictures layer (Group of Picture, GOP), the picture layer (Picture), the slice layer (Slice), the macroblock layer (Macro Block), and the block layer (Block). It starts from a sequence header, optionally followed by a group-of-pictures header, followed by one or more coded frames.

The user data (e.g., user_data) structure is commonly extended for auxiliary display, carrying information such as subtitles and display parameters, and can be located at different levels of the video bitstream. Different values of i in extension_and_user_data(i) indicate that user_data is located at different positions in the video bitstream; for example, i is 0 for the extension_and_user_data after the video sequence layer and 2 for the extension_and_user_data after the picture layer, as specifically defined in Table 10.

Table 10:
extension_and_user_data(i) {
    while( ( nextbits() == extension_start_code ) ||
           ( nextbits() == user_data_start_code ) ) {
        if( nextbits() == extension_start_code )
            extension_data(i)
        if( nextbits() == user_data_start_code )
            user_data()
    }
}
This embodiment carries the supplemental information by extending the user data structure. The user_data structure is shown in Table 11, where user_data_identifier is a global identifier used to distinguish different user_structure definitions; for example, ATSC registered "0x47413934" to identify ATSC_user_data, extending user_data for multiple uses. To avoid conflicts with user data extended by other systems, user_data_identifier can use the MPEG registered value "0x4D504547".

Table 11: (appears as an image in the original and is not reproduced here.)
Table 12 defines an example of user_structure, in which user_data_type_code is used to distinguish the different extensions of user_data under the mpeg system.

Table 12:

user_structure( ) {  —  bits  —  format
    user_data_type_code  —  8  —  unsigned
    if( user_data_type_code == xx ) {
        User_data_type_structure( )  —  8  —  unsigned
    }
}
Table 13 defines different user_data_type_code values to distinguish the extended user data types. When user_data_type_code indicates the supplemental-information type, the corresponding extended user data is the supplemental information.

Table 13: (appears as an image in the original and is not reproduced here.)

Table 14 specifically defines the structure of the auxiliary video supplemental information; this embodiment takes the supplemental information structure "Si_rbsp" defined in "MPEG C Part 3" as an example structure of the supplemental information.

Table 14: (appears as an image in the original and is not reproduced here.)
The method of this embodiment carries the auxiliary video supplemental information by extending the user data structure, and provides a general-purpose content-distribution interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through the common interface, without adding a new bearer structure for the auxiliary video supplemental information on the operating network or storage medium, which reduces the cost and difficulty of content distribution. The solution has good network affinity and is suited to transmission over various transport networks and to storage on media.
Embodiment 5:

This embodiment provides a processing method for auxiliary video supplemental information, corresponding to the bearer methods of Embodiments 1-4. FIG. 2 is a flow chart of the method of this embodiment. As shown in FIG. 2, the method includes:

S201: Acquire a video bitstream, the video bitstream including an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information;

S202: Decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information;

S203: Perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.
Optionally, the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream; in this case S202 may include: decoding the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information, and decoding the primary video bitstream to obtain the primary video.

Optionally, the acquired video bitstream is a single video bitstream; in this case S202 may include: decoding the one video bitstream to obtain the primary video, the auxiliary video, and the auxiliary video supplemental information.

When h.264 is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may specifically include: parsing the auxiliary video supplemental information from the Nal unit carrying it in the auxiliary video bitstream, and parsing the auxiliary video from the Nal units carrying it in the auxiliary video bitstream. S203 may specifically include: synthesizing the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.

When h.264 is used for video decoding and the acquired video bitstream is a single video bitstream, S202 may further specifically include: parsing the auxiliary video supplemental information from the Nal unit carrying it in the one video bitstream; the auxiliary video may likewise be parsed from the Nal units carrying the auxiliary video, and the primary video from the Nal units carrying the primary video, in that bitstream. S203 may specifically include: synthesizing the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.

When h.264 is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may further specifically include: decoding the primary video bitstream to obtain the primary video, parsing the auxiliary video from the Nal units carrying it in the auxiliary video bitstream, and parsing the auxiliary video supplemental information from the SEI message of the SEI Nal unit carrying it in the auxiliary video bitstream. S203 may further specifically include: synthesizing the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.

When h.264 is used for video decoding and the acquired video bitstream is a single video bitstream, S202 may further specifically include: parsing the auxiliary video supplemental information from the SEI message of the SEI Nal unit carrying it in the one video bitstream; the auxiliary video may likewise be parsed from the Nal units carrying the auxiliary video, and the primary video from the Nal units carrying the primary video, in that bitstream. S203 may specifically include: synthesizing the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.

When the mpeg2 standard is used for video decoding and the acquired video bitstream includes a primary video bitstream and an auxiliary video bitstream, S202 may further specifically include: decoding the primary video bitstream to obtain the primary video, and decoding the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information, where the auxiliary video supplemental information may specifically be parsed from the user data structure carrying it in the auxiliary video bitstream. S203 may specifically include: synthesizing the auxiliary video supplemental information with the video frames in the auxiliary video bitstream.
The method of this embodiment provides a general-purpose content-acquisition interface for media content that includes the auxiliary video, the primary video corresponding to the auxiliary video, and the auxiliary video supplemental information, and has good network affinity.

Embodiment 6:
This embodiment provides a video playing system to implement the bearer and processing methods for auxiliary video supplemental information described in the foregoing embodiments. FIG. 2a is the connection diagram of the system. As shown in FIG. 2a, the system includes: a server 10, configured to generate a video bitstream of media content, carry auxiliary video supplemental information in the video bitstream, and distribute the video bitstream to a transport network to generate a media stream or distribute it onto a storage medium; and a terminal 20, configured to acquire the video bitstream generated by the server 10, the video bitstream including an auxiliary video, a primary video corresponding to the auxiliary video, and auxiliary video supplemental information; decode the video bitstream to obtain the auxiliary video, the primary video, and the auxiliary video supplemental information; and perform synthesis calculation and display according to the auxiliary video, the primary video, and the auxiliary video supplemental information.

The auxiliary video supplemental information of this embodiment is information used for synthesis calculation with the auxiliary video, and includes but is not limited to one or more of the following kinds of information: the auxiliary video type; the spatial correspondence between the auxiliary video and the video corresponding to the auxiliary video; and the specific calculation parameters corresponding to different types of auxiliary video.

FIG. 3 is a functional block diagram of the server 10. As shown in FIG. 3, the server 10 includes: a video bitstream generating unit 301, configured to generate a video bitstream of media content, the video bitstream of the media content carrying the auxiliary video supplemental information; and a video bitstream distributing unit 302, configured to distribute the video bitstream generated by the video bitstream generating unit 301 to a transport network to generate a media stream or distribute it onto a storage medium.

FIG. 4 is a first detailed functional block diagram of the video bitstream generating unit 301. As shown in FIG. 4, the video bitstream generating unit 301 further includes: a first encoding unit 401, configured to video-encode the auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream; and a second encoding unit 402, configured to video-encode the primary video corresponding to the auxiliary video to generate a primary video bitstream.

FIG. 5 is a second detailed functional block diagram of the video bitstream generating unit 301. As shown in FIG. 5, the video bitstream generating unit 301 may instead include: a third encoding unit 501, configured to jointly video-encode the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video, to generate a single video bitstream.
The first encoding unit 401 is specifically configured to perform the video encoding using h.264, and to use a network abstraction layer Nal unit to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.

The third encoding unit 501 is specifically configured to perform the video encoding using h.264, and to use a Nal unit to carry the auxiliary video supplemental information when jointly video-encoding the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video.

The first encoding unit 401 is specifically configured to perform the video encoding using h.264, and to use an SEI message in a supplemental enhancement information SEI Nal unit to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.

The third encoding unit 501 is specifically configured to perform the video encoding using h.264, and to use an SEI message in an SEI Nal unit to carry the auxiliary video supplemental information when jointly video-encoding the auxiliary video, the auxiliary video supplemental information, and the primary video corresponding to the auxiliary video.

The first encoding unit 401 is specifically configured to perform the video encoding using the mpeg2 standard, and to use a user data structure to carry the auxiliary video supplemental information when video-encoding the auxiliary video and the auxiliary video supplemental information.
图 6为终端 20的功能框图。 如图 6所示, 终端 20包括: 获取单元 601 , 用于 获取视频比特流, 所述视频比特流包括辅助视频、 与所述辅助视频对应的主视 频、 以及辅助视频补充信息; 解码单元 602, 用于解码所述获取单元 601获取的 视频比特流, 获得所述辅助视频、 所述主视频、 以及所述辅助视频补充信息; 处理单元 603 , 用于根据所述解码单元 602解码获得的所述辅助视频、所述主视 频、 以及所述辅助视频补充信息进行合成计算并显示。
图 7为解码单元 602的细化功能框图之一。 如图 7所示, 当获取的视频比特 流包括主视频比特流和辅助视频比特流时, 本实施例的解码单元 602包括: 第 一解码单元 701 , 用于解码所述辅助视频比特流, 获得辅助视频和辅助视频补 充信息; 第二解码单元 702, 用于解码所述主视频比特流, 获得主视频。
FIG. 8 is a second detailed functional block diagram of the decoding unit 602. As shown in FIG. 8, when the acquired video bitstream is a single video bitstream, the decoding unit 602 of this embodiment may alternatively include: a third decoding unit 801, configured to decode the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information.
Specifically, when the server 10 performs video encoding using H.264 and encodes the main video and the auxiliary video independently, the terminal 20 also performs video decoding using H.264. In this case, the first decoding unit 701 is configured to parse the auxiliary video supplemental information out of the NAL unit that carries it in the auxiliary video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
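The parsing performed by the first decoding unit can be sketched as a start-code scan over an Annex B byte stream: split on the 0x000001 prefix and select NAL units whose nal_unit_type (the low five bits of the first header byte) equals 6, i.e. SEI. Removal of emulation-prevention bytes and interpretation of the payload are omitted, so this is only an illustrative sketch of the selection step.

```python
import re

def find_sei_nals(annexb: bytes) -> list:
    """Split an H.264 Annex B stream on start codes and return the SEI NAL units.

    nal_unit_type is the low five bits of the first byte of each NAL unit;
    type 6 is SEI. Emulation-prevention bytes are not removed in this sketch.
    """
    chunks = re.split(b"\x00\x00\x01", annexb)
    # Drop the leading zero of 4-byte start codes and empty fragments
    nals = [c.lstrip(b"\x00") for c in chunks if c.lstrip(b"\x00")]
    return [n for n in nals if n[0] & 0x1F == 6]

# Hypothetical stream: one IDR slice NAL followed by one SEI NAL
stream = b"\x00\x00\x00\x01\x65data" + b"\x00\x00\x01\x06\x05\x03abc\x80"
print(find_sei_nals(stream))  # → [b'\x06\x05\x03abc\x80']
```

Each returned unit would then be decoded into SEI messages and the supplemental-information payload extracted.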
Specifically, when the server 10 performs video encoding using H.264 and jointly encodes the main video and the auxiliary video into a single video bitstream, the terminal 20 also performs video decoding using H.264. In this case, the third decoding unit 801 is configured to parse the auxiliary video supplemental information out of the NAL unit that carries it in the single video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
Specifically, when the server 10 performs video encoding using H.264 and encodes the main video and the auxiliary video independently, the terminal 20 also performs video decoding using H.264. In this case, the first decoding unit 701 is further configured to parse the auxiliary video supplemental information out of the SEI message of the SEI NAL unit that carries it in the auxiliary video bitstream, and the processing unit 603 is further configured to synthesize the auxiliary video supplemental information with the primary coded pictures in the auxiliary video bitstream.
Specifically, when the server 10 performs video encoding using H.264 and jointly encodes the main video and the auxiliary video into a single video bitstream, the terminal 20 also performs video decoding using H.264. In this case, the third decoding unit 801 is configured to decode the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information, where the auxiliary video supplemental information may specifically be parsed out of the SEI message of the SEI NAL unit that carries it in the single video bitstream; and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the auxiliary coded pictures in the video bitstream.
Specifically, when the server 10 performs video encoding using the MPEG-2 standard, the terminal 20 also performs video decoding using the MPEG-2 standard. In this case, the first decoding unit 701 is configured to parse the auxiliary video supplemental information out of the user data structure that carries it in the auxiliary video bitstream, and the processing unit 603 is configured to synthesize the auxiliary video supplemental information with the video frames in the auxiliary video bitstream.
The working principle of the system of this embodiment is described below by taking a three-dimensional television system as an example; however, the following description does not limit the protection scope of the embodiments of the present invention. Video playback systems other than three-dimensional television systems all fall within the protection scope of the claims, as long as they can implement the functions of the embodiments of the present invention.
The video playback process of the embodiment of the present invention, implemented with a three-dimensional television system, is as follows:
(1) The server side produces three-dimensional data content.
The data representation of three-dimensional content in the 2D-plus-auxiliary-video format contains a two-dimensional video, its auxiliary video, and the auxiliary video supplemental information. For example, a depth map can be regarded as one kind of auxiliary video of the two-dimensional video. Each pixel of the depth map represents a depth value describing the depth of one pixel of the two-dimensional video, expressed as an N-bit value, where N is usually 8; the depth map can thus be processed as a single monochrome video. In a three-dimensional system, since disparity is inversely proportional to depth, a parallax map (disparity map) is also a kind of auxiliary video of the two-dimensional video.
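The mapping from an 8-bit depth sample to a display disparity can be sketched as follows. The actual mapping depends on display-specific parameters carried in the supplemental information; the linear interpolation over the quantized range, the convention that 255 is the nearest depth, and the endpoint disparities used below are illustrative assumptions, not values defined by this application.

```python
def depth_to_disparity(d: int, disp_near: float = -5.0, disp_far: float = 10.0) -> float:
    """Map an 8-bit depth sample to a display disparity in pixels.

    d = 255 (nearest, perceived in front of the screen) maps to disp_near,
    d = 0 (farthest, perceived behind the screen) maps to disp_far.
    The endpoints and the linear interpolation are illustrative assumptions.
    """
    if not 0 <= d <= 255:
        raise ValueError("8-bit depth sample expected")
    return disp_far + (disp_near - disp_far) * d / 255.0
```

With this sign convention, negative disparity (near objects) is perceived in front of the screen and positive disparity (far objects) behind it, matching the description of disparity perception given below.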
The three-dimensional video content is encoded and transmitted using existing video coding standards.
There are many kinds of auxiliary video, and different kinds serve different purposes; for example, an auxiliary video may describe the transparency information of the main video for two-dimensional display. The auxiliary video is therefore not limited to the depth map, parallax map, and transparency map mentioned here, and the definition of the auxiliary video supplemental information varies with the auxiliary video type.
(2) The terminal acquires, from the received media stream or from a media medium, the three-dimensional content represented in the 2D-plus-auxiliary-video format.
To synthesize three-dimensional content based on 2D-plus-auxiliary-video, the terminal needs to compute left-eye and right-eye video frames with disparity from the two-dimensional video and the auxiliary video. First, the actual display disparity is computed from the auxiliary video and the auxiliary video supplemental information (for example, when the auxiliary video is a depth map, the actual display disparity of each pixel is computed from its depth value); the disparity directly reflects the user's perception of depth. With positive disparity the perceived depth is behind the screen, with negative disparity it is in front of the screen, and with zero disparity it lies on the screen. Second, left-eye and right-eye video frames with disparity are computed from the two-dimensional video and the actual display disparity of each pixel.
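The second step can be sketched as a disparity-based pixel shift over one scan line, a minimal form of depth-image-based rendering. Splitting the integer disparity symmetrically between the two eyes and leaving disoccluded positions as holes are illustrative simplifications; a real renderer also resolves overlaps by depth and fills holes.

```python
def synthesize_views(row, disparity):
    """Shift one scan line of 2D pixels into left-eye and right-eye lines.

    Each pixel moves by +/- disparity/2; vacated positions stay None
    (holes that a real renderer would inpaint). Later pixels overwrite
    earlier ones on collision in this sketch.
    """
    width = len(row)
    left = [None] * width
    right = [None] * width
    for x, pixel in enumerate(row):
        d = disparity[x]          # integer display disparity of this pixel
        xl = x + d // 2
        xr = x - d // 2
        if 0 <= xl < width:
            left[xl] = pixel
        if 0 <= xr < width:
            right[xr] = pixel
    return left, right
```

With zero disparity everywhere, both views reproduce the input line; nonzero disparity shifts pixels in opposite directions for the two eyes, producing the stereoscopic offset described above.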
When the terminal displays the content, the left view and the right view are shown alternately or separately on the screen, and special three-dimensional glasses or a special display system let the left eye see only the left view and the right eye see only the right view, giving the user a perception of depth in the video content.
The system of this embodiment carries the auxiliary video supplemental information directly in the video bitstream of media content that contains the auxiliary video, the main video corresponding to the auxiliary video, and the auxiliary video supplemental information. The same media content can be distributed directly to different multimedia systems through a universal interface, without adding a new bearer structure for the auxiliary video supplemental information on the operator network or the media medium, which lowers the cost and difficulty of content distribution. The solution has good network affinity and is suitable for transmission over various transport networks and for storage on media carriers.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when executed may include the processes of the embodiments of the methods above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Although the embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for bearing auxiliary video supplemental information, wherein the method comprises: carrying auxiliary video supplemental information in a video bitstream; and
distributing the video bitstream to a transport network to generate a media stream, or distributing it onto a media medium.
2. The method according to claim 1, wherein the carrying of auxiliary video supplemental information in a video bitstream comprises:
performing video encoding on the auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream; and performing video encoding on a main video corresponding to the auxiliary video to generate a main video bitstream.
3. The method according to claim 1, wherein the carrying of auxiliary video supplemental information in a video bitstream comprises:
jointly performing video encoding on the auxiliary video, the auxiliary video supplemental information, and the main video corresponding to the auxiliary video to generate a single video bitstream.
4. The method according to claim 2, wherein H.264 is used for the video encoding, and when the auxiliary video and the auxiliary video supplemental information are video encoded, a network abstraction layer (NAL) unit is used to carry the auxiliary video supplemental information.
5. The method according to claim 3, wherein H.264 is used for the video encoding, and when the auxiliary video, the auxiliary video supplemental information, and the main video corresponding to the auxiliary video are jointly video encoded, a NAL unit is used to carry the auxiliary video supplemental information.
6. The method according to claim 2, wherein H.264 is used for the video encoding, and when the auxiliary video and the auxiliary video supplemental information are video encoded, an SEI message in a supplemental enhancement information (SEI) NAL unit is used to carry the auxiliary video supplemental information.
7. The method according to claim 3, wherein H.264 is used for the video encoding, and when the auxiliary video, the auxiliary video supplemental information, and the main video corresponding to the auxiliary video are jointly video encoded, an SEI message in an SEI NAL unit is used to carry the auxiliary video supplemental information.
8. The method according to claim 2, wherein the MPEG-2 standard is used for the video encoding, and when the auxiliary video and the auxiliary video supplemental information are video encoded, a user data structure is used to carry the auxiliary video supplemental information.
9. The method according to any one of claims 1 to 8, wherein the auxiliary video supplemental information comprises one or a combination of more of the following:
the auxiliary video type;
the spatial correspondence between the auxiliary video and the main video corresponding to the auxiliary video; and
the computation parameters corresponding to different types of auxiliary video.
10. A method for processing auxiliary video supplemental information, wherein the method comprises: acquiring a video bitstream, wherein the video bitstream comprises an auxiliary video, a main video corresponding to the auxiliary video, and auxiliary video supplemental information;
decoding the video bitstream to obtain the auxiliary video, the main video, and the auxiliary video supplemental information; and
performing synthesis computation according to the auxiliary video, the main video, and the auxiliary video supplemental information, and displaying the result.
11. The method according to claim 10, wherein the acquired video bitstream comprises a main video bitstream and an auxiliary video bitstream; and
the decoding of the video bitstream to obtain the auxiliary video, the main video, and the auxiliary video supplemental information comprises:
decoding the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information; and decoding the main video bitstream to obtain the main video.
12. The method according to claim 10, wherein the acquired video bitstream is a single video bitstream; and
the decoding of the video bitstream to obtain the auxiliary video, the main video, and the auxiliary video supplemental information comprises:
decoding the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information.
13. The method according to claim 11, wherein H.264 is used for the video decoding; and
the decoding of the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information comprises: parsing the auxiliary video supplemental information out of the network abstraction layer (NAL) unit that carries it in the auxiliary video bitstream.
14. The method according to claim 12, wherein H.264 is used for the video decoding; and
the decoding of the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information comprises:
parsing the auxiliary video supplemental information out of the NAL unit that carries it in the single video bitstream.
15. The method according to claim 11, wherein H.264 is used for the video decoding; and
the decoding of the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information comprises: parsing the auxiliary video supplemental information out of the SEI message of the supplemental enhancement information (SEI) NAL unit that carries it in the auxiliary video bitstream.
16. The method according to claim 12, wherein H.264 is used for the video decoding; and
the decoding of the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information comprises:
parsing the auxiliary video supplemental information out of the SEI message of the SEI NAL unit that carries it in the single video bitstream.
17. The method according to claim 11, wherein the MPEG-2 standard is used for the video decoding; and
the decoding of the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information comprises: parsing the auxiliary video supplemental information out of the user data structure that carries it in the auxiliary video bitstream.
18. A media content server, wherein the server comprises:
a video bitstream generating unit, configured to generate a video bitstream of media content, wherein the video bitstream of the media content carries auxiliary video supplemental information; and
a video bitstream distributing unit, configured to distribute the video bitstream generated by the video bitstream generating unit to a transport network to generate a media stream, or distribute it onto a media medium.
19. The media content server according to claim 18, wherein the video bitstream generating unit comprises:
a first encoding unit, configured to perform video encoding on an auxiliary video and the auxiliary video supplemental information to generate an auxiliary video bitstream; and
a second encoding unit, configured to perform video encoding on a main video corresponding to the auxiliary video to generate a main video bitstream.
20. The media content server according to claim 18, wherein the video bitstream generating unit comprises:
a third encoding unit, configured to jointly perform video encoding on an auxiliary video, the auxiliary video supplemental information, and a main video corresponding to the auxiliary video to generate a single video bitstream.
21. The media content server according to claim 19, wherein
the first encoding unit is specifically configured to perform the video encoding using H.264 and, when performing video encoding on the auxiliary video and the auxiliary video supplemental information, use a network abstraction layer (NAL) unit to carry the auxiliary video supplemental information.
22. The media content server according to claim 20, wherein
the third encoding unit is specifically configured to perform the video encoding using H.264 and, when jointly performing video encoding on the auxiliary video, the auxiliary video supplemental information, and the main video corresponding to the auxiliary video, use a NAL unit to carry the auxiliary video supplemental information.
23. The media content server according to claim 19, wherein
the first encoding unit is specifically configured to perform the video encoding using H.264 and, when performing video encoding on the auxiliary video and the auxiliary video supplemental information, use an SEI message in a supplemental enhancement information (SEI) NAL unit to carry the auxiliary video supplemental information.
24. The media content server according to claim 20, wherein
the third encoding unit is specifically configured to perform the video encoding using H.264 and, when jointly performing video encoding on the auxiliary video, the auxiliary video supplemental information, and the main video corresponding to the auxiliary video, use an SEI message in an SEI NAL unit to carry the auxiliary video supplemental information.
25. The media content server according to claim 19, wherein
the first encoding unit is specifically configured to perform the video encoding using the MPEG-2 standard and, when performing video encoding on the auxiliary video and the auxiliary video supplemental information, use a user data structure to carry the auxiliary video supplemental information.
26. A media content display terminal, wherein the terminal comprises:
an acquiring unit, configured to acquire a video bitstream, wherein the video bitstream comprises an auxiliary video, a main video corresponding to the auxiliary video, and auxiliary video supplemental information;
a decoding unit, configured to decode the video bitstream acquired by the acquiring unit to obtain the auxiliary video, the main video, and the auxiliary video supplemental information; and
a processing unit, configured to perform synthesis computation according to the auxiliary video, the main video, and the auxiliary video supplemental information obtained by the decoding unit through decoding, and display the result.
27. The terminal according to claim 26, wherein, when the acquired video bitstream comprises a main video bitstream and an auxiliary video bitstream, the decoding unit comprises:
a first decoding unit, configured to decode the auxiliary video bitstream to obtain the auxiliary video and the auxiliary video supplemental information; and
a second decoding unit, configured to decode the main video bitstream to obtain the main video.
28. The terminal according to claim 26, wherein, when the acquired video bitstream is a single video bitstream, the decoding unit comprises:
a third decoding unit, configured to decode the single video bitstream to obtain the main video, the auxiliary video, and the auxiliary video supplemental information.
29. The terminal according to claim 27, wherein, when H.264 is used for video decoding:
the first decoding unit is specifically configured to parse the auxiliary video supplemental information out of the network abstraction layer (NAL) unit that carries it in the auxiliary video bitstream.
30. The terminal according to claim 28, wherein, when H.264 is used for video decoding:
the third decoding unit is specifically configured to parse the auxiliary video supplemental information out of the NAL unit that carries it in the single video bitstream.
31. The terminal according to claim 27, wherein, when H.264 is used for video decoding:
the first decoding unit is specifically configured to parse the auxiliary video supplemental information out of the SEI message of the supplemental enhancement information (SEI) NAL unit that carries it in the auxiliary video bitstream.
32. The terminal according to claim 28, wherein, when H.264 is used for video decoding:
the third decoding unit is specifically configured to parse the auxiliary video supplemental information out of the SEI message of the supplemental enhancement information (SEI) NAL unit that carries it in the single video bitstream.
33. The terminal according to claim 27, wherein, when the MPEG-2 standard is used for video decoding:
the first decoding unit is specifically configured to parse the auxiliary video supplemental information out of the user data structure that carries it in the auxiliary video bitstream.
34. A video playback system, comprising the media content server according to any one of claims 18 to 25, and/or the terminal according to any one of claims 26 to 33.
PCT/CN2011/079233 2011-01-28 2011-09-01 Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information WO2012100537A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP11857363.3A EP2661090A4 (en) 2011-01-28 2011-09-01 TRANSPORT METHOD, AND METHOD, DEVICE AND SYSTEM FOR PROCESSING ADDITIONAL SECONDARY VIDEO DATA
US13/953,326 US20130314498A1 (en) 2011-01-28 2013-07-29 Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110031704.1 2011-01-28
CN201110031704.1A CN102158733B (zh) 2011-01-28 Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/953,326 Continuation US20130314498A1 (en) 2011-01-28 2013-07-29 Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information

Publications (1)

Publication Number Publication Date
WO2012100537A1 true WO2012100537A1 (zh) 2012-08-02

Family

ID=44439870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079233 WO2012100537A1 (zh) 2011-01-28 2011-09-01 辅助视频补充信息承载方法、处理方法、装置与系统

Country Status (4)

Country Link
US (1) US20130314498A1 (zh)
EP (1) EP2661090A4 (zh)
CN (2) CN105100822B (zh)
WO (1) WO2012100537A1 (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100822B (zh) 2011-01-28 2018-05-11 Huawei Technologies Co., Ltd. Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information
KR101536501B1 (ko) * 2012-04-12 2015-07-13 Shinra Technologies, Inc. Moving image distribution server, moving image playback device, control method, recording medium, and moving image distribution system
CN103379354B (zh) * 2012-04-25 2015-03-11 Zhejiang University Method and apparatus for generating stereoscopic video pairs
CN111031302A (zh) 2012-04-25 2020-04-17 Zhejiang University Decoding method, encoding method, and apparatus for auxiliary information of three-dimensional video sequences
US10284858B2 (en) 2013-10-15 2019-05-07 Qualcomm Incorporated Support of multi-mode extraction for multi-layer video codecs
CN108616748A (zh) * 2017-01-06 2018-10-02 Ketong Huanyu (Beijing) Technology Co., Ltd. Bitstream and encapsulation method, decoding method, and apparatus therefor
CN107959879A (zh) * 2017-12-06 2018-04-24 Synthesis Electronic Technology Co., Ltd. Video auxiliary information processing method
CN108965711B (zh) * 2018-07-27 2020-12-11 Guangzhou Kugou Computer Technology Co., Ltd. Video processing method and apparatus
CN115868167A (zh) 2020-05-22 2023-03-28 Douyin Vision Co., Ltd. Processing of coded video in sub-bitstream extraction
CN111901522A (zh) * 2020-07-10 2020-11-06 Hangzhou Hikvision Digital Technology Co., Ltd. Image processing method, system, apparatus, and electronic device
CN113206853B (zh) * 2021-05-08 2022-07-29 Hangzhou Arcvideo Technology Co., Ltd. Improved method for saving video correction results
EP4297418A1 (en) * 2022-06-24 2023-12-27 Beijing Xiaomi Mobile Software Co., Ltd. Signaling encapsulated data representing primary video sequence and associated auxiliary video sequence

Citations (4)

Publication number Priority date Publication date Assignee Title
US20090296810A1 (en) * 2008-06-03 2009-12-03 Omnivision Technologies, Inc. Video coding apparatus and method for supporting arbitrary-sized regions-of-interest
CN101816179A (zh) * 2007-09-24 2010-08-25 Koninklijke Philips Electronics N.V. Method and system for encoding a video data signal, encoded video data signal, and method and system for decoding a video data signal
WO2010134003A1 (en) * 2009-05-18 2010-11-25 Koninklijke Philips Electronics N.V. Entry points for 3d trickplay
CN102158733A (zh) * 2011-01-28 2011-08-17 Huawei Technologies Co., Ltd. Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
EP2008461B1 (en) * 2006-03-30 2015-09-16 LG Electronics Inc. A method and apparatus for decoding/encoding a multi-view video signal
US20080317124A1 (en) * 2007-06-25 2008-12-25 Sukhee Cho Multi-view video coding system, decoding system, bitstream extraction system for decoding base view and supporting view random access
WO2009011492A1 (en) * 2007-07-13 2009-01-22 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding stereoscopic image format including both information of base view image and information of additional view image
CN102257818B (zh) * 2008-10-17 2014-10-29 Nokia Corporation Sharing of motion vectors in 3D video coding
US9124874B2 (en) * 2009-06-05 2015-09-01 Qualcomm Incorporated Encoding of three-dimensional conversion information with two-dimensional video sequence
US8780999B2 (en) * 2009-06-12 2014-07-15 Qualcomm Incorporated Assembling multiview video coding sub-BITSTREAMS in MPEG-2 systems
US8411746B2 (en) * 2009-06-12 2013-04-02 Qualcomm Incorporated Multiview video coding over MPEG-2 systems
CN101945295B (zh) * 2009-07-06 2014-12-24 Samsung Electronics Co., Ltd. Method and apparatus for generating depth map

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN101816179A (zh) * 2007-09-24 2010-08-25 Koninklijke Philips Electronics N.V. Method and system for encoding a video data signal, encoded video data signal, and method and system for decoding a video data signal
US20090296810A1 (en) * 2008-06-03 2009-12-03 Omnivision Technologies, Inc. Video coding apparatus and method for supporting arbitrary-sized regions-of-interest
WO2010134003A1 (en) * 2009-05-18 2010-11-25 Koninklijke Philips Electronics N.V. Entry points for 3d trickplay
CN102158733A (zh) * 2011-01-28 2011-08-17 Huawei Technologies Co., Ltd. Method for bearing auxiliary video supplemental information, and method, apparatus, and system for processing auxiliary video supplemental information

Non-Patent Citations (1)

Title
See also references of EP2661090A4 *

Also Published As

Publication number Publication date
CN105100822A (zh) 2015-11-25
CN102158733B (zh) 2015-08-19
EP2661090A4 (en) 2014-07-09
CN102158733A (zh) 2011-08-17
EP2661090A1 (en) 2013-11-06
CN105100822B (zh) 2018-05-11
US20130314498A1 (en) 2013-11-28

Similar Documents

Publication Publication Date Title
CN109218734B (zh) 用于提供媒体内容的方法和装置
WO2012100537A1 (zh) 辅助视频补充信息承载方法、处理方法、装置与系统
US10129525B2 (en) Broadcast transmitter, broadcast receiver and 3D video data processing method thereof
KR101671021B1 (ko) 스테레오스코픽 영상 데이터 전송 장치 및 방법
CA2758903C (en) Broadcast receiver and 3d video data processing method thereof
US20170318276A1 (en) Broadcast receiver and video data processing method thereof
CA2808395C (en) Method for providing 3d video data in a 3dtv
KR101653319B1 (ko) 3d 영상을 위한 영상 컴포넌트 송수신 처리 방법 및 장치
US20140071232A1 (en) Image data transmission device, image data transmission method, and image data reception device
CN103190153B (zh) 用于立体感视频服务的信号传送方法和使用该方法的设备
US9693082B2 (en) Method and device for transmitting/receiving digital broadcast signal
WO2013105401A1 (ja) 送信装置、送信方法、受信装置および受信方法
US8953019B2 (en) Method and apparatus for generating stream and method and apparatus for processing stream
US9980013B2 (en) Method and apparatus for transmitting and receiving broadcast signal for 3D broadcasting service
US20140055561A1 (en) Transmitting apparatus, transmitting method, receiving apparatus and receiving method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11857363

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2011857363

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2011857363

Country of ref document: EP