WO2017010359A1 - Transmission device, transmission method, reception device, and reception method - Google Patents
- Publication number
- WO2017010359A1 (PCT/JP2016/069955; JP2016069955W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- subtitle
- information
- video
- stream
- display
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
- H04N21/2353—Processing of additional data, e.g. scrambling of additional data or processing content descriptors specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/426—Internal components of the client ; Characteristics thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
- H04N21/43072—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of multiple content streams on the same device
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/278—Subtitling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/08—Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division
- H04N7/087—Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division with signal insertion during the vertical blanking interval only
- H04N7/088—Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division with signal insertion during the vertical blanking interval only the inserted signal being digital
- H04N7/0884—Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division with signal insertion during the vertical blanking interval only the inserted signal being digital for the transmission of additional display-information, e.g. menu for programme or channel selection
- H04N7/0885—Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division with signal insertion during the vertical blanking interval only the inserted signal being digital for the transmission of additional display-information, e.g. menu for programme or channel selection for the transmission of subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
Definitions
- the present technology relates to a transmission device, a transmission method, a reception device, and a reception method, and particularly to a transmission device that transmits text information together with video information.
- in DVB (Digital Video Broadcasting) broadcasting, subtitle (caption) information is transmitted as bitmap data.
- font rendering (rasterization) corresponding to the resolution is performed on the receiving side.
- the text information has timing information.
- TTML (Timed Text Markup Language), proposed by the W3C (World Wide Web Consortium), is an example of such text information.
- the text information of the subtitle expressed in TTML is handled as a file in the form of a markup language.
- the purpose of this technology is to reduce the processing load for displaying the subtitle on the receiving side.
- a video encoding unit for generating a video stream including encoded video data
- a subtitle encoding unit for generating a subtitle stream including text information of a subtitle having display timing information, and abstract information having information corresponding to some of the plurality of pieces of information indicated by the text information
- a transmission apparatus includes a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
- a video stream including encoded video data is generated by the video encoding unit.
- a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the plurality of pieces of information indicated by the text information is generated by the subtitle encoder.
- the text information of the subtitle may be TTML or a derived format of this TTML.
- the container of the predetermined format including a video stream and a subtitle stream is transmitted by the transmission unit.
- the abstract information may include subtitle display timing information.
- the subtitle display timing can be controlled based on the subtitle display timing information included in the abstract information without scanning the subtitle text information.
- the display timing information of the subtitle may have information on the display start timing and the display period.
- the subtitle stream may be composed of a PES packet including a PES header and a PES payload, the subtitle text information and abstract information may be arranged in the PES payload, and the display start timing may be indicated by a display offset from the PTS (Presentation Time Stamp) inserted in the PES header.
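The timing arithmetic implied above can be sketched as follows. This is a minimal illustration, assuming the usual 90 kHz PES timestamp clock and treating the display offset and display period as tick counts; the function and parameter names are illustrative, not from the publication.

```python
# Sketch: deriving the subtitle display window from the PES header PTS plus
# the display offset / display period carried in the abstract information.
PTS_CLOCK_HZ = 90_000  # MPEG-2 PES timestamps tick at 90 kHz (assumption)

def display_window(pts_ticks: int, display_offset_ticks: int, display_period_ticks: int):
    """Return (start, end) presentation times in seconds.

    The display start timing is an offset from the PTS in the PES header;
    the display period gives how long the subtitle stays on screen.
    """
    start = (pts_ticks + display_offset_ticks) / PTS_CLOCK_HZ
    end = start + display_period_ticks / PTS_CLOCK_HZ
    return start, end

start, end = display_window(pts_ticks=900_000, display_offset_ticks=45_000,
                            display_period_ticks=180_000)
print(start, end)  # 10.5 12.5
```

Because the offset and period live in the abstract information, the receiver can compute this window without parsing the TTML text itself.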
- the abstract information may include display control information for controlling the display state of the subtitle.
- the receiving side can control the display state of the subtitle based on the display control information included in the abstract information without scanning the text information of the subtitle.
- the display control information may include information on at least one of the display position, color gamut, and dynamic range of the subtitle.
- the display control information may further include target video information.
- the abstract information may include notification information for notifying that there is a change in the text information element of the subtitle.
- the subtitle encoding unit may segment the subtitle text information and abstract information to generate a subtitle stream having a predetermined number of segments.
- the abstract information can be easily acquired by extracting the segment including the abstract information from the subtitle stream.
- the segment of the abstract information may be arranged first, followed by the segment of the text information of the subtitle.
- the segment of the abstract information is arranged first, so that the receiving side can easily and efficiently extract the segment of the abstract information from the subtitle stream.
- the subtitle stream includes the abstract information corresponding to the text information together with the text information of the subtitle. Therefore, the receiving side can perform processing for subtitle display using the abstract information, and the processing load can be reduced.
- a receiving unit for receiving a container in a predetermined format including a video stream and a subtitle stream, wherein the video stream includes encoded video data,
- the subtitle stream includes text information of a subtitle having display timing information, and abstract information having information corresponding to some information of a plurality of pieces of information indicated by the text information,
- and a control unit that controls a video decoding process for decoding the video stream to obtain video data, a subtitle decoding process for decoding the subtitle stream to obtain subtitle bitmap data and extracting the abstract information, a video superimposing process for superimposing the subtitle bitmap data on the video data, and a bitmap data process for processing the superimposed bitmap data based on the abstract information.
- a container of a predetermined format including a video stream and a subtitle stream is received.
- the video stream includes encoded video data.
- the subtitle stream includes text information of a subtitle having display timing information, and abstract information having information corresponding to some information among a plurality of pieces of information indicated by the text information.
- the video decoding process, the subtitle decoding process, the video superimposing process, and the bitmap data process are controlled by the control unit.
- video data is obtained by decoding a video stream.
- in the subtitle decoding process, the subtitle stream is decoded to obtain subtitle bitmap data, and the abstract information is extracted.
- subtitle bitmap data is superimposed on video data to obtain display video data.
- in the bitmap data processing, the bitmap data of the subtitle superimposed on the video data is processed based on the abstract information.
- the abstract information includes subtitle display timing information, and the bitmap data processing may control the superimposition timing of the subtitle bitmap data on the video data based on the subtitle display timing information.
- the abstract information includes display control information for controlling the display state of the subtitle, and the state of the subtitle bitmap superimposed on the video data may be controlled based on the display control information.
- the bitmap data of the subtitle superimposed on the video data is processed based on the abstract information extracted from the subtitle stream. Therefore, it is possible to reduce the processing load for displaying the subtitle.
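The receiver-side control just summarized can be sketched as a small data structure plus a timing check that consults only the extracted abstract information, never the TTML text. All names, field widths, and the 90 kHz tick assumption below are illustrative.

```python
# Sketch: the bitmap data process decides when to superimpose the subtitle
# using only the abstract information (timing + display control).
from dataclasses import dataclass

@dataclass
class AbstractInfo:
    display_offset: int   # display start, in 90 kHz ticks from the PES PTS
    display_period: int   # how long the subtitle stays up, in 90 kHz ticks
    position: tuple       # subtitle display position (x, y)
    color_gamut: str      # e.g. "ITUR2020"
    dynamic_range: str    # "SDR" or "HDR"

def should_display(info: AbstractInfo, pts: int, now: int) -> bool:
    """Superimposition timing control: display only inside the window."""
    start = pts + info.display_offset
    return start <= now < start + info.display_period

info = AbstractInfo(display_offset=0, display_period=90_000,
                    position=(480, 600), color_gamut="ITUR2020",
                    dynamic_range="SDR")
print(should_display(info, pts=900_000, now=945_000))  # True (inside the 1-second window)
```

The point of the design is visible here: nothing in the check requires scanning or re-parsing the subtitle text information, which is what reduces the receiver's processing load.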
- a video encoding unit for generating a video stream including encoded video data, and a subtitle encoding unit that generates one or a plurality of segments in which elements of text information of a subtitle having display timing information are arranged, and generates a subtitle stream including the one or more segments;
- a transmission apparatus includes a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
- a video stream including encoded video data is generated by the video encoding unit.
- the subtitle encoding unit generates one or a plurality of segments in which elements of subtitle text information having display timing information are arranged, and generates a subtitle stream including the one or more segments.
- the text information of the subtitle may be TTML or a derived format of the TTML.
- the transmission unit transmits a container in a predetermined format including a video stream and a subtitle stream.
- the text information of the subtitle having the display timing information is segmented and transmitted by being included in the subtitle stream. Therefore, the receiving side can satisfactorily receive each element of the subtitle text information.
- information regarding the transmission order and/or the presence/absence of update of the subtitle text information may be inserted in the segment layer or in the element layer arranged in the segment layer.
- the receiving side can recognize the transmission order of the subtitle text information, so that the decoding process can be performed efficiently.
- the receiving side can easily grasp whether or not the subtitle text information is updated.
- TTM (TTML Metadata)
- THLS (text_header_layout_segment)
- a diagram showing a structural example of TBS (text_body_segment) in which the body of a TTML structure is arranged
- THAS (text_header_all_segment)
- a diagram showing a structural example of TWS (text_whole_segment)
- FIG. 1 shows a configuration example of a transmission / reception system 10 as an embodiment.
- the transmission / reception system 10 includes a transmission device 100 and a reception device 200.
- the transmission apparatus 100 generates an MPEG2 transport stream TS as a container, and transmits the transport stream TS on a broadcast wave or in network packets.
- This transport stream TS includes a video stream having encoded video data.
- this transport stream TS includes a subtitle stream.
- This subtitle stream includes text information of a subtitle (caption) having display timing information, and abstract information having information corresponding to some information among a plurality of pieces of information indicated by the text information.
- the text information is, for example, TTML (Timed Text Markup Language) proposed by the W3C (World Wide Web Consortium).
- the abstract information includes subtitle display timing information.
- This display timing information includes display start timing and display period information.
- the subtitle stream is configured by a PES packet including a PES header and a PES payload, and the subtitle text information and display timing information are arranged in the PES payload.
- the display start timing is indicated by a display offset from the PTS inserted in the PES header.
- the abstract information includes display control information for controlling the display state of the subtitle.
- the display control information includes subtitle display position, color gamut, and dynamic range information.
- the abstract information includes information on the target video.
- the receiving device 200 receives the transport stream TS transmitted from the transmitting device 100 by broadcast waves.
- the transport stream TS includes a video stream including encoded video data and a subtitle stream including subtitle text information and abstract information.
- the receiving device 200 obtains video data from the video stream, obtains subtitle bitmap data from the subtitle stream, and extracts abstract information.
- the receiving apparatus 200 obtains display video data by superimposing the subtitle bitmap data on the video data.
- the receiving device 200 processes the subtitle bitmap data to be superimposed on the video data based on the abstract information.
- the abstract information includes subtitle display timing information, and the receiving device 200 controls the superimposition timing of the subtitle bitmap data on the video data based on the display timing information.
- the abstract information also includes display control information for controlling the display state (display position, color gamut, dynamic range, etc.) of the subtitle, and the receiving device 200 controls the state of the subtitle bitmap data based on this display control information.
- FIG. 2 shows a configuration example of the transmission device 100.
- the transmission apparatus 100 includes a control unit 101, a camera 102, a video photoelectric conversion unit 103, an RGB/YCbCr conversion unit 104, a video encoder 105, a subtitle generation unit 106, a text format conversion unit 107, a subtitle encoder 108, a system encoder 109, and a transmission unit 110.
- the control unit 101 includes a CPU (Central Processing Unit), and controls the operation of each unit of the transmission device 100 based on a control program.
- the camera 102 images a subject and outputs video data (image data) of HDR (High Dynamic Range) or SDR (Standard Dynamic Range).
- the HDR image has a contrast ratio of 0 to 100% × N (N is a number greater than 1), exceeding the brightness of the white peak of the SDR image; for example, 0 to 1000%.
- the level of 100% corresponds to, for example, a white luminance value of 100 cd/m².
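A small worked example of this luminance scale, assuming the 100% level maps to 100 cd/m² as stated above; the function name is illustrative.

```python
# Sketch: converting the percentage levels above to absolute luminance,
# assuming 100% corresponds to a reference white of 100 cd/m^2.
REFERENCE_WHITE_CDM2 = 100.0  # luminance at the 100% level

def percent_to_luminance(percent: float) -> float:
    return percent / 100.0 * REFERENCE_WHITE_CDM2

print(percent_to_luminance(100.0))   # 100.0 cd/m^2 (SDR white peak)
print(percent_to_luminance(1000.0))  # 1000.0 cd/m^2 (HDR range with N = 10)
```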
- the video photoelectric conversion unit 103 performs photoelectric conversion on the video data obtained by the camera 102 to obtain transmission video data V1.
- the SDR photoelectric conversion characteristic is applied to perform photoelectric conversion to obtain SDR transmission video data (transmission video data having the SDR photoelectric conversion characteristic).
- the HDR photoelectric conversion characteristics are applied to perform photoelectric conversion to obtain HDR transmission video data (transmission video data having HDR photoelectric conversion characteristics).
- the RGB / YCbCr conversion unit 104 converts the transmission video data from the RGB domain to the YCbCr (luminance / color difference) domain.
- the video encoder 105 performs encoding such as MPEG4-AVC or HEVC on the transmission video data V1 converted into the YCbCr domain, and generates a video stream (PES stream) VS including the encoded video data.
- the video encoder 105 inserts, into the VUI (video usability information) area of the SPS NAL unit of the access unit (AU), meta information such as information (transfer function) indicating the electro-optical conversion characteristics corresponding to the photoelectric conversion characteristics of the transmission video data V1, information indicating the color gamut of the transmission video data V1, and information indicating the reference level.
- the video encoder 105 also inserts, into the “SEIs” portion of the access unit (AU), a newly defined dynamic range SEI message (Dynamic_range SEI message) having meta information such as information (transfer function) indicating the electro-optical conversion characteristics corresponding to the photoelectric conversion characteristics of the transmission video data V1 and reference level information.
- the dynamic range SEI message is provided with the information indicating the electro-optical conversion characteristics because, when the transmission video data V1 is HDR transmission video data whose HDR photoelectric conversion characteristics are compatible with the SDR photoelectric conversion characteristics, the information indicating the electro-optical conversion characteristic (gamma characteristic) corresponding to the SDR photoelectric conversion characteristic is inserted into the VUI of the SPS NAL unit, so information indicating the electro-optical conversion characteristic corresponding to the HDR photoelectric conversion characteristic is required at a place other than the VUI.
- FIG. 3 shows an example of photoelectric conversion characteristics.
- the horizontal axis indicates the input luminance level
- the vertical axis indicates the transmission code value.
- a curve a shows an example of the SDR photoelectric conversion characteristics.
- a curve b1 shows an example of the HDR photoelectric conversion characteristic (not compatible with the SDR photoelectric conversion characteristic).
- a curve b2 shows an example of HDR photoelectric conversion characteristics (compatible with SDR photoelectric conversion characteristics).
- the curve b2 matches the SDR photoelectric conversion characteristic up to the compatibility limit value of the input luminance level, where the transmission code value reaches the compatibility level.
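An SDR-compatible HDR characteristic of the kind curve b2 illustrates can be made concrete with the HLG OETF of ITU-R BT.2100, whose lower segment follows a square-root (gamma-like) law similar to an SDR camera curve. HLG is offered here only as a real-world instance of such a compatible characteristic, not as the specific curve the publication assumes.

```python
# Sketch: the HLG OETF as an example of an HDR photoelectric conversion
# characteristic compatible with an SDR-like lower segment (ITU-R BT.2100).
import math

A = 0.17883277
B = 0.28466892
C = 0.55991073

def hlg_oetf(e: float) -> float:
    """Map scene linear light e (0..1) to a transmission code value (0..1)."""
    if e <= 1 / 12:
        return math.sqrt(3 * e)          # SDR-like square-root segment
    return A * math.log(12 * e - B) + C  # logarithmic HDR segment

print(round(hlg_oetf(1 / 12), 3))  # 0.5 (code value at the end of the SDR-like segment)
print(round(hlg_oetf(1.0), 3))     # 1.0 (HDR peak)
```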
- the reference level information is given to the dynamic range / SEI message when the transmission video data V1 is SDR transmission video data
- the VUI of the SPS NAL unit has an electro-optical conversion characteristic (gamma) corresponding to the SDR photoelectric conversion characteristic. This is because there is no standard specification regarding the insertion of the reference level.
- FIG. 4A shows a structural example (Syntax) of the dynamic range / SEI message.
- FIG. 4B shows the contents (Semantics) of main information in the structural example.
- the 1-bit flag information “Dynamic_range_cancel_flag” indicates whether the message “Dynamic_range” is to be refreshed. “0” indicates that the message is refreshed, and “1” indicates that the message is not refreshed, that is, the previous message is maintained as it is.
- the 8-bit field of “coded_data_bit_depth” indicates the number of encoded pixel bits.
- An 8-bit field of “reference_level” indicates a reference luminance level value as a reference level.
- a 1-bit field of “modify_tf_flag” indicates whether or not to modify Transfer Function (TF) indicated by VUI (video usability information). “0” indicates that the TF indicated by the VUI is a target, and “1” indicates that the TF specified by the “transfer_function” of this SEI is used to modify the TF of the VUI.
- the 8-bit field of “transfer_function” indicates an electro-optic conversion characteristic corresponding to the photoelectric conversion characteristic of the transmission video data V1.
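The fields above can be read with a minimal MSB-first bit reader. Only the field widths and semantics come from the text; the exact bit packing (field order directly concatenated, zero padding at the end) and the sample payload are assumptions for illustration.

```python
# Sketch: parsing the dynamic range SEI fields listed above.
class BitReader:
    def __init__(self, data: bytes):
        self.data, self.pos = data, 0

    def read(self, nbits: int) -> int:
        val = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            val = (val << 1) | bit
            self.pos += 1
        return val

def parse_dynamic_range_sei(payload: bytes) -> dict:
    r = BitReader(payload)
    sei = {"dynamic_range_cancel_flag": r.read(1)}
    if sei["dynamic_range_cancel_flag"] == 0:  # 0: the message is refreshed
        sei["coded_data_bit_depth"] = r.read(8)
        sei["reference_level"] = r.read(8)
        sei["modify_tf_flag"] = r.read(1)
        sei["transfer_function"] = r.read(8)
    return sei

# Example payload: cancel=0, bit depth 10, reference level 100, modify_tf=1, TF=18.
sei = parse_dynamic_range_sei(bytes([0x05, 0x32, 0x44, 0x80]))
print(sei["coded_data_bit_depth"], sei["reference_level"])  # 10 100
```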
- the subtitle generation unit 106 generates text data (character code) DT as subtitle information.
- the text format conversion unit 107 inputs the text data DT, and obtains text information of a subtitle in a predetermined format, in this embodiment, TTML (Timed Text Markup Language).
- FIG. 5 shows an example of a TTML (Timed Text Markup Language) structure.
- TTML is described on an XML basis.
- FIG. 6A also shows an example of the TTML structure.
- TTML consists of a header and a body. Each element such as metadata, styling, styling extension, layout, etc. exists in the header (head).
- FIG. 7 shows a structural example of metadata (TTM: TTML Metadata). This metadata includes metadata title information and copyright information.
- FIG. 8A shows a structural example of styling (TTS: TTML Styling).
- this styling includes information such as region position, size, color (color), font (fontFamily), font size (fontSize), and text alignment (textAlign).
- "tts:origin" specifies the start position of the region, which is the display area of the subtitle, in pixels; in the illustrated example, the start position (see arrow P) is (480, 600).
- "tts:extent" specifies the end position of the region by the number of offset pixels in the horizontal and vertical directions from the start position; in the illustrated example, the end position is (480 + 560, 600 + 350).
- the numbers of offset pixels correspond to the horizontal and vertical sizes of the region.
- "tts:opacity" indicates the mixing ratio between the subtitle (caption) and the background video. For example, "1.0" indicates that the subtitle is 100% and the background video is 0%, and "0.0" indicates that the subtitle is 0% and the background video is 100%. In the illustrated example, it is "1.0".
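The mixing ratio described above amounts to per-pixel alpha blending; a minimal sketch, with the function name and normalized pixel values as illustrative assumptions:

```python
# Sketch: tts:opacity mixing of a subtitle pixel with the background video
# (1.0 = subtitle only, 0.0 = background only). Pixel values are normalized.
def mix(subtitle_px: float, video_px: float, opacity: float) -> float:
    return opacity * subtitle_px + (1.0 - opacity) * video_px

print(mix(0.9, 0.2, 1.0))  # 0.9: subtitle 100%, background 0%
print(mix(0.9, 0.2, 0.0))  # 0.2: subtitle 0%, background 100%
```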
- FIG. 9 shows a structural example of a styling extension (TTML Styling Extension).
- the styling extension includes information on a color space and a dynamic range.
- the color gamut information specifies the color gamut assumed by the subtitle. In the illustrated example, “ITUR2020” is indicated.
- the dynamic range information specifies whether the dynamic range assumed by the subtitle is SDR or HDR. In the example shown in the figure, the SDR is shown.
- FIG. 10 shows a structural example of a layout (region: TTML layout).
- This layout includes information such as an offset (padding), a background color (backgroundColor), an alignment (displayAlign), etc., in addition to the identifier (id) of the region in which the subtitle is arranged.
- FIG. 11 shows an example of the structure of the body.
- information of three subtitles, subtitle 1, subtitle 2, and subtitle 3, is included.
- for each subtitle, a display start timing, a display end timing, and text data are described.
- the display start timing is “T1”
- the display end timing is “T3”
- the text data is “ABC”.
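Extracting these timings and text data from a TTML body can be done with the standard library alone. The sample document below is illustrative (its first entry mirrors the "T1"/"T3"/"ABC" example above, the second is invented), as is the helper name.

```python
# Sketch: reading display start/end timings and text from a TTML body.
import xml.etree.ElementTree as ET

TTML = """<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="T1" end="T3">ABC</p>
      <p begin="T2" end="T4">DEF</p>
    </div>
  </body>
</tt>"""

NS = {"tt": "http://www.w3.org/ns/ttml"}

def subtitles(doc: str):
    root = ET.fromstring(doc)
    return [(p.get("begin"), p.get("end"), p.text)
            for p in root.findall(".//tt:p", NS)]

print(subtitles(TTML))  # [('T1', 'T3', 'ABC'), ('T2', 'T4', 'DEF')]
```

Note that on the receiving side of this technology, such scanning of the text information is exactly what the abstract information lets the receiver avoid for timing and display control.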
- the subtitle encoder 108 converts the TTML obtained by the text format conversion unit 107 into various segments, and generates a subtitle stream SS composed of PES packets in which those segments are arranged in the payload.
- FIG. 12 shows a configuration example of the PES packet.
- the PES header (PES header) includes PTS (Presentation Time Stamp).
- the PES data payload (PES data payload) includes APTS (abstract_parameter_TimedText_segment), THMS (text_header_metadata_segment), THSS (text_header_styling_segment), THSES (text_header_styling_extension_segment), THLS (text_header_layout_segment), and TBS (text_body_segment) segments.
- the PES data payload may include APTS (abstract_parameter_TimedText_segment), THAS (text_header_all_segment), and TBS (text_body_segment) segments. Further, the PES data payload (PES data payload) may include APTS (abstract_parameter_TimedText_segment) and TWS (text_whole_segment) segments.
- FIG. 13 shows the segment interface inside the PES.
- PES_data_field indicates the container portion of the PES data payload of the PES packet.
- the 8-bit field of “data_identifier” indicates an ID for identifying the type of data transmitted in the container part. Since the conventional subtitle (in the case of a bitmap) is supposed to be indicated by “0x20”, the text can be identified by a new value, for example, “0x21”.
- the 8-bit field of “subtitle_stream_id” indicates an ID for identifying the type of the subtitle stream.
- a new value, for example "0x01", distinguishes it from the conventional subtitle stream "0x00" that transmits a bitmap.
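The identification step above can be sketched as a check on the first bytes of the PES data field. The identifier values come from the text; their placement as the first two bytes is an illustrative assumption.

```python
# Sketch: distinguishing the new text-based subtitle stream from the
# conventional bitmap one via data_identifier and subtitle_stream_id.
DATA_ID_BITMAP_SUBTITLE = 0x20  # conventional DVB subtitle (bitmap)
DATA_ID_TEXT_SUBTITLE = 0x21    # new value for text (e.g. TTML)
STREAM_ID_TEXT = 0x01           # new subtitle_stream_id ("0x00" = bitmap)

def is_text_subtitle(pes_data_field: bytes) -> bool:
    data_identifier, subtitle_stream_id = pes_data_field[0], pes_data_field[1]
    return (data_identifier == DATA_ID_TEXT_SUBTITLE
            and subtitle_stream_id == STREAM_ID_TEXT)

print(is_text_subtitle(bytes([0x21, 0x01])))  # True
print(is_text_subtitle(bytes([0x20, 0x00])))  # False (conventional bitmap subtitle)
```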
- FIG. 14 shows a structural example of "TimedTextSubtitling_segments ()" when each segment of APTS (abstract_parameter_TimedText_segment), THMS (text_header_metadata_segment), THSS (text_header_styling_segment), THSES (text_header_styling_extension_segment), THLS (text_header_layout_segment), and TBS (text_body_segment) is arranged in the PES data payload.
- FIG. 15A shows a structural example of “TimedTextSubtitling_segments ()” when each segment of APTS (abstract_parameter_TimedText_segment), THAS (text_header_all_segment), and TBS (text_body_segment) is arranged in the PES data payload.
- FIG. 15B shows a structural example of “TimedTextSubtitling_segments ()” when each segment of APTS (abstract_parameter_TimedText_segment) and TWS (text_whole_segment) is arranged in the PES data payload.
- each segment is inserted into the subtitle stream. For example, when there is no change other than the display subtitle, only two segments of APTS (abstract_parameter_TimedText_segment) and TBS (text_body_segment) are configured. In any case, an APTS segment having abstract information is first arranged in the PES data payload, followed by other segments. With such an arrangement, the receiving side can easily and efficiently extract the segment of the abstract information from the subtitle stream.
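The segment arrangement just described can be sketched as a simple framing loop in which the APTS segment is packed first, so the receiver obtains the abstract information from the first segment without touching the rest. The 8-bit segment_type and 8-bit segment_length come from the segment structures in the text; the 1-byte sync_byte value, the 2-byte page_id, the segment_type codes, and the omission of the version field are illustrative assumptions.

```python
# Sketch: packing and splitting segments in the PES data payload, with the
# APTS (abstract information) segment placed first.
SEG_TYPE_APTS = 0x19  # abstract_parameter_TimedText_segment (illustrative value)
SEG_TYPE_TBS = 0x24   # text_body_segment (illustrative value)

def make_segment(segment_type: int, body: bytes, page_id: int = 1) -> bytes:
    """sync_byte, segment_type, page_id, 8-bit segment_length, payload."""
    return (bytes([0x0F, segment_type]) + page_id.to_bytes(2, "big")
            + bytes([len(body)]) + body)

def iter_segments(payload: bytes):
    pos = 0
    while pos < len(payload):
        segment_type = payload[pos + 1]
        length = payload[pos + 4]              # 8-bit segment_length
        yield segment_type, payload[pos + 5 : pos + 5 + length]
        pos += 5 + length

stream = make_segment(SEG_TYPE_APTS, b"\x01\x02") + make_segment(SEG_TYPE_TBS, b"ABC")
first_type, first_body = next(iter_segments(stream))
print(first_type == SEG_TYPE_APTS)  # True: abstract information is read first
```

Placing the APTS segment at the head is what lets the receiver stop after one segment when only the abstract information is needed.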
- FIG. 16A shows a structural example (syntax) of THMS (text_header_metadata_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “thm_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x20” indicating THMS, for example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 16B is arranged as XML information. This metadata is the same as the metadata element existing in the TTML header (see FIG. 7).
- FIG. 17A shows a structural example (syntax) of THSS (text_header_styling_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “ths_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x21” indicating THSS in this example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 17B is arranged as XML information. This metadata is the same as the styling element existing in the TTML header (see FIG. 8A).
- FIG. 18A shows a structural example (syntax) of THSES (text_header_styling_extension_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “thse_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x22” indicating THSES here, for example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 18B is arranged as XML information. This metadata is the same as the element of the styling extension (styling_extension) existing in the TTML header (see FIG. 9A).
- FIG. 19A shows a structural example (syntax) of THLS (text_header_layout_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “thl_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x23” indicating THLS, for example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 19B is arranged as XML information. This metadata is the same as the layout element existing in the TTML header (see FIG. 10).
- FIG. 20A shows a structural example (syntax) of TBS (text_body_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “tb_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x24” indicating TBS here, for example.
- metadata as shown in FIG. 20B is arranged as XML information. This metadata is the same as the TTML body (see FIG. 11).
- FIG. 21A shows a structural example (syntax) of THAS (text_header_all_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “tha_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x25” indicating THAS here, for example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 21B is arranged as XML information. This metadata is the entire header.
- FIG. 22A shows a structural example (syntax) of TWS (text_whole_segment).
- This structure includes information of “sync_byte”, “segment_type”, “page_id”, “segment_length”, “tw_version_number”, and “segment_payload ()”.
- “Segment_type” is 8-bit data indicating the segment type, and is “0x26” indicating TWS here, for example.
- “Segment_length” is 8-bit data indicating the length (size) of the segment.
- metadata as shown in FIG. 22B is arranged as XML information. This metadata is the entire TTML (see FIG. 5). This structure is a structure for maintaining compatibility in the entire TTML, and puts the entire TTML in one segment.
- By segmenting the TTML in this way, the receiving side can recognize the transmission order of the TTML elements. Therefore, even when all the elements of the TTML are sent together, it can be recognized that they arrive in a predetermined order, the processing up to decoding is simplified, and the decoding process can be performed efficiently.
- FIG. 23B shows metadata (XML information) arranged in “segment_payload ()” in that case.
- APTS abstract_parameter_TimedText_segment
- This APTS segment includes abstract information. This abstract information has information corresponding to some information among a plurality of pieces of information indicated by TTML.
- FIG. 24 and FIG. 25 show a structural example (syntax) of APTS (abstract_parameter_TimedText_segment). FIGS. 26 and 27 show the contents (semantics) of the main information in this structural example. Like the other segments, this structure includes information of “sync_byte”, “segment_type”, “page_id”, and “segment_length”. “Segment_type” is 8-bit data indicating the segment type, and is “0x19” indicating APTS here, for example. “Segment_length” is 8-bit data indicating the length (size) of the segment.
- The 4-bit field of “APT_version_number” indicates whether the contents of the APTS (abstract_parameter_TimedText_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TTM_version_number” indicates whether the contents of the THMS (text_header_metadata_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TTS_version_number” indicates whether the contents of the THSS (text_header_styling_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TTSE_version_number” indicates whether the contents of the THSES (text_header_styling_extension_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TTL_version_number” indicates whether the contents of the THLS (text_header_layout_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TTHA_version_number” indicates whether the contents of the THAS (text_header_all_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
- The 4-bit field of “TW_version_number” indicates whether the contents of the TWS (text_whole_segment) have changed from those sent previously; when there is a change, the value is incremented by one.
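The version-number mechanism above can be illustrated with a minimal sketch. The function names are illustrative, not from the specification; the only behavior taken from the text is that each 4-bit value is incremented only on a content change, which lets a receiver skip re-parsing unchanged segments.

```python
# Sender side: increment the 4-bit version only when contents change.
def bump_version(current, changed):
    return (current + 1) % 16 if changed else current  # 4-bit wrap-around

# Receiver side: re-parse a segment only when its version differs
# from the cached value seen previously.
def needs_reparse(received, last_seen):
    return last_seen is None or received != last_seen

v = 0
v = bump_version(v, changed=False)   # contents unchanged: still 0
v = bump_version(v, changed=True)    # contents changed: now 1
assert v == 1
assert needs_reparse(v, last_seen=0)        # version moved -> re-parse
assert not needs_reparse(v, last_seen=1)    # unchanged -> reuse cached result
assert bump_version(15, changed=True) == 0  # 4-bit field wraps modulo 16
```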
- The 4-bit field of “subtitle_display_area” specifies the subtitle display area (subtitle area). For example, “0x1” specifies 640h * 480v, “0x2” specifies 720h * 480v, “0x3” specifies 720h * 576v, “0x4” specifies 1280h * 720v, “0x5” specifies 1920h * 1080v, “0x6” specifies 3840h * 2160v, and “0x7” specifies 7680h * 4320v.
- the 4-bit field of “subtitle_color_gamut_info” specifies the assumed color gamut of the subtitle.
- The 4-bit field of “subtitle_dynamic_range_info” specifies the dynamic range assumed for the subtitle. For example, “0x1” indicates SDR and “0x2” indicates HDR. When HDR is specified for the subtitle, it is assumed that the luminance of the subtitle is kept at or below the standard white level of the video.
- The 4-bit field of “target_video_resolution” specifies the assumed video resolution. For example, “0x1” specifies 640h * 480v, “0x2” specifies 720h * 480v, “0x3” specifies 720h * 576v, “0x4” specifies 1280h * 720v, “0x5” specifies 1920h * 1080v, “0x6” specifies 3840h * 2160v, and “0x7” specifies 7680h * 4320v.
- the 4-bit field of “target_video_color_gamut_info” specifies the assumed color gamut of the video. For example, “0x1” indicates “BT.709” and “0x2” indicates “BT.2020”.
- a 4-bit field of “target_video_dynamic_range_info” specifies an assumed dynamic range of video. For example, “0x1” indicates “BT.709”, “0x2” indicates “BT.202x”, and “0x3” indicates “Smpte 2084”.
- The 4-bit field of “number_of_regions” specifies the number of regions. The following fields exist as many times as there are regions.
- a 16-bit field of “region_id” indicates a region ID.
- The 8-bit field of “start_time_offset” indicates the display start time of the subtitle as an offset value from the PTS.
- The offset value of “start_time_offset” is signed; a negative value indicates that display starts at a point earlier than the PTS. When the offset value is 0, display starts at the timing of the PTS.
- In the 8-bit representation, the code value is divided by 10, so the precision is up to the first decimal place (0.1-second units).
- The 8-bit field of “end_time_offset” indicates the display end time of the subtitle as an offset value from “start_time_offset”; that is, this offset value indicates the display period. When the offset value of “start_time_offset” described above is 0, display ends at the timing obtained by adding the offset value of “end_time_offset” to the PTS. As with “start_time_offset”, the precision in the 8-bit representation is up to the first decimal place when the code value is divided by 10.
- Note that “start_time_offset” and “end_time_offset” can also be transmitted with the same 90 kHz accuracy as the PTS. In that case, a 32-bit space is secured for each of the “start_time_offset” and “end_time_offset” fields.
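The timing derivation above can be sketched as follows, assuming the 8-bit representation where the signed code value divided by 10 gives seconds (0.1 s precision), with the PTS and the returned times expressed in 90 kHz ticks. This is an illustrative reading of the fields, not a normative implementation.

```python
# 0.1 s expressed in 90 kHz ticks.
TICKS_PER_TENTH_SECOND = 90_000 // 10  # 9000

def display_window(pts, start_code, end_code):
    """Return (display_start, display_end) in 90 kHz ticks.

    start_code: signed 8-bit offset from the PTS (negative = before the PTS).
    end_code:   8-bit offset from the start time, i.e. the display period.
    """
    start = pts + start_code * TICKS_PER_TENTH_SECOND
    end = start + end_code * TICKS_PER_TENTH_SECOND
    return start, end

# start_time_offset == 0 -> display starts exactly at the PTS.
s, e = display_window(pts=900_000, start_code=0, end_code=25)
assert s == 900_000                # starts at the PTS
assert e == 900_000 + 25 * 9000    # ends 2.5 s later

# A negative start_time_offset starts display before the PTS.
s, _ = display_window(pts=900_000, start_code=-5, end_code=10)
assert s == 900_000 - 45_000       # 0.5 s earlier than the PTS
```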
- When converting the TTML into segments, the subtitle encoder 108 sets the “PTS”, “start_time_offset”, and “end_time_offset” of each subtitle based on the description of the display start timing (begin) and display end timing (end) of each subtitle included in the body of the TTML, with reference to system time information (PCR, video/audio synchronization time). At this time, the subtitle encoder 108 may set the “PTS”, “start_time_offset”, and “end_time_offset” while verifying, using the decoder buffer model, that the operation on the receiving side is performed correctly.
- the 16-bit field of “region_start_horizontal” indicates the horizontal pixel position of the upper left corner of the region in the subtitle display area specified by the “subtitle_display_area” described above (see point P in FIG. 8B).
- a 16-bit field of “region_start_vertical” indicates the vertical pixel position of the upper left end point of the region in the subtitle display area.
- the 16-bit field “region_end_horizontal” indicates the horizontal pixel position of the lower right corner point (see point Q in FIG. 8B) of the region in the subtitle display area.
- a 16-bit field of “region_end_vertical” indicates the vertical pixel position of the lower right end point of the region in the subtitle display area.
- the system encoder 109 generates a transport stream TS including the video stream VS generated by the video encoder 105 and the subtitle stream SS generated by the subtitle encoder 108.
- The transmission unit 110 places the transport stream TS on a broadcast wave or a net packet and transmits it to the receiving device 200.
- Video data (image data) obtained by imaging with the camera 102 is supplied to the video photoelectric conversion unit 103.
- The video photoelectric conversion unit 103 performs photoelectric conversion on the video data obtained by the camera 102 to obtain transmission video data V1.
- When the video data is SDR video data, SDR photoelectric conversion characteristics are applied to perform the photoelectric conversion, and SDR transmission video data (transmission video data having SDR photoelectric conversion characteristics) is obtained. When the video data is HDR video data, HDR photoelectric conversion characteristics are applied, and HDR transmission video data (transmission video data having HDR photoelectric conversion characteristics) is obtained.
- The transmission video data V1 obtained by the video photoelectric conversion unit 103 is converted from the RGB domain to the YCbCr (luminance / color difference) domain by the RGB / YCbCr conversion unit 104 and then supplied to the video encoder 105.
- the video encoder 105 performs encoding such as MPEG4-AVC or HEVC on the transmission video data V1 to generate a video stream (PES stream) VS including the encoded video data.
- In the VUI area of the SPS NAL unit of the access unit (AU), meta information such as information (transfer function) indicating the electro-optic conversion characteristic corresponding to the photoelectric conversion characteristic of the transmission video data V1, information indicating the color gamut of the transmission video data V1, and information indicating the reference level is inserted.
- In addition, a newly defined dynamic range SEI message (see FIG. 4) having meta information such as information (transfer function) indicating the electro-optic conversion characteristic corresponding to the photoelectric conversion characteristic of the transmission video data V1 and reference level information is inserted into the “SEIs” portion of the access unit (AU).
- the subtitle generation unit 106 generates text data (character code) DT as subtitle information.
- This text data DT is supplied to the text format conversion unit 107.
- The text format conversion unit 107 converts the text data DT into subtitle text information having display timing information, that is, TTML (see FIGS. 3 and 4). This TTML is supplied to the subtitle encoder 108.
- the subtitle encoder 108 converts the TTML obtained by the text format conversion unit 107 into various segments, and generates a subtitle stream SS composed of PES packets in which those segments are arranged in the payload.
- an APTS segment having abstract information (see FIGS. 24 to 27) is first arranged in the payload of the PES packet, and subsequently, a segment having subtitle text information is arranged (see FIG. 12).
- the video stream VS generated by the video encoder 105 is supplied to the system encoder 109.
- the subtitle stream SS generated by the subtitle encoder 108 is supplied to the system encoder 109.
- In the system encoder 109, a transport stream TS including the video stream VS and the subtitle stream SS is generated.
- the transport stream TS is transmitted to the receiving device 200 by the transmitting unit 110 on a broadcast wave or a net packet.
- FIG. 29 illustrates a configuration example of the receiving device 200.
- The receiving device 200 includes a control unit 201, a user operation unit 202, a reception unit 203, a system decoder 204, a video decoder 205, a subtitle decoder 206, a color gamut / luminance conversion unit 207, and a position / size conversion unit 208.
- the receiving apparatus 200 includes a video superimposing unit 209, a YCbCr / RGB converting unit 210, an electro-optic converting unit 211, a display mapping unit 212, and a CE monitor 213.
- the control unit 201 includes a CPU (Central Processing Unit) and controls the operation of each unit of the receiving device 200 based on a control program.
- the user operation unit 202 is a switch, a touch panel, a remote control transmission unit, or the like for a user such as a viewer to perform various operations.
- the reception unit 203 receives the transport stream TS transmitted from the transmission device 100 on broadcast waves or net packets.
- the system decoder 204 extracts the video stream VS and the subtitle stream SS from the transport stream TS. Further, the system decoder 204 extracts various information inserted in the transport stream TS (container) and sends it to the control unit 201.
- the video decoder 205 decodes the video stream VS extracted by the system decoder 204 and outputs transmission video data V1. In addition, the video decoder 205 extracts a parameter set or SEI message inserted in each access unit constituting the video stream VS and sends it to the control unit 201.
- The SEI messages also include the dynamic range SEI message (see FIG. 4) having information (transfer function) indicating the electro-optic conversion characteristic corresponding to the photoelectric conversion characteristic of the transmission video data V1, reference level information, and the like.
- the subtitle decoder 206 processes segment data of each region included in the subtitle stream SS and outputs bitmap data of each region to be superimposed on the video data.
- the subtitle decoder 206 also extracts the abstract information included in the APTS segment and sends it to the control unit 201.
- This abstract information includes subtitle display timing information, subtitle display control information (subtitle display position, color gamut, and dynamic range information), target video information (resolution, color gamut, and dynamic range information), and the like.
- Since the display timing information and display control information of the subtitle are also included in the XML information arranged in “segment_payload ()” of the segments other than APTS, they can be obtained by scanning that XML information as well. However, the abstract information can be obtained easily from the APTS segment simply by extracting it.
- Similarly, the target video information (resolution, color gamut, and dynamic range information) can be acquired from the video stream VS side, but it can be obtained easily simply by extracting the abstract information from the APTS segment.
- FIG. 30 shows a configuration example of the subtitle decoder 206.
- the subtitle decoder 206 includes a coded buffer 261, a subtitle segment decoder 262, a font development unit 263, and a bitmap buffer 264.
- the coded buffer 261 temporarily holds the subtitle stream SS.
- the subtitle segment decoder 262 decodes the segment data of each region held in the coded buffer 261 at a predetermined timing to obtain text data and control codes of each region.
- the font expansion unit 263 expands the font based on the text data and control code of each region obtained by the subtitle segment decoder 262, and obtains subtitle bitmap data of each region.
- the font development unit 263 uses, for example, position information (“region_start_horizontal”, “region_start_vertical”, “region_end_horizontal”, “region_end_vertical”) included in the abstract information as the position information of each region.
- Bitmap data of this subtitle is obtained in the RGB domain.
- the color gamut of the bitmap data of the subtitle matches the color gamut indicated by the color gamut information of the subtitle included in the abstract information.
- the dynamic range of the bitmap data of the subtitle is assumed to match the dynamic range indicated by the dynamic range information of the subtitle included in the abstract information.
- When the dynamic range information is “SDR”, the subtitle bitmap data has a dynamic range of SDR and is photoelectrically converted by applying the SDR photoelectric conversion characteristics.
- When the dynamic range information is “HDR”, the subtitle bitmap data has a dynamic range of HDR and is photoelectrically converted by applying the HDR photoelectric conversion characteristics. In this case, the luminance is limited to the range up to the HDR reference level on the premise of superimposition on HDR video.
- the bitmap buffer 264 temporarily holds bitmap data of each region obtained by the font development unit 263.
- the bitmap data of each region held in the bitmap buffer 264 is read from the display start timing and superimposed on the image data, and this is continued for the display period.
- the subtitle segment decoder 262 extracts the PTS from the PES header of the PES packet.
- the subtitle segment decoder 262 extracts abstract information from the APTS segment. These pieces of information are sent to the control unit 201.
- the control unit 201 controls the read timing of the bitmap data of each region from the bitmap buffer 264 based on the PTS and the information of “start_time_offset” and “end_time_offset” included in the abstract information.
- Under the control of the control unit 201, the color gamut / luminance conversion unit 207 matches the color gamut of the subtitle bitmap data with the color gamut of the video data, based on the color gamut information (“subtitle_color_gamut_info”) of the subtitle bitmap data and the color gamut information (“target_video_color_gamut_info”) of the video data.
- Also under the control of the control unit 201, the color gamut / luminance conversion unit 207 adjusts the maximum luminance level of the subtitle bitmap data to be equal to or lower than the luminance reference level of the video data, based on the dynamic range information (“subtitle_dynamic_range_info”) of the subtitle bitmap data and the dynamic range information (“target_video_dynamic_range_info”) of the video data.
- FIG. 31 shows a configuration example of the color gamut / luminance conversion unit 207.
- The color gamut / luminance conversion unit 207 includes an electro-optic conversion unit 221, a color gamut conversion unit 222, a photoelectric conversion unit 223, an RGB / YCbCr conversion unit 224, and a luminance conversion unit 225.
- The electro-optic conversion unit 221 performs electro-optic conversion on the input subtitle bitmap data.
- When the dynamic range of the subtitle bitmap data is SDR, the electro-optic conversion unit 221 performs electro-optic conversion by applying the SDR electro-optic conversion characteristics to obtain a linear state. When it is HDR, the electro-optic conversion unit 221 performs electro-optic conversion by applying the HDR electro-optic conversion characteristics to obtain a linear state. It is also conceivable that the input subtitle bitmap data is already in a linear state with no photoelectric conversion applied; in that case, the electro-optic conversion unit 221 is unnecessary.
- The color gamut conversion unit 222 matches the color gamut of the subtitle bitmap data output from the electro-optic conversion unit 221 with the color gamut of the video data. For example, when the color gamut of the subtitle bitmap data is “BT.709” and the color gamut of the video data is “BT.2020”, the color gamut of the subtitle bitmap data is converted from “BT.709” to “BT.2020”. When the color gamut of the subtitle bitmap data is the same as that of the video data, the color gamut conversion unit 222 outputs the input subtitle bitmap data as it is, without processing.
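One common way to perform the “BT.709” to “BT.2020” conversion mentioned above is the 3x3 matrix on linear RGB defined in ITU-R BT.2087; the document does not mandate a specific method, so the following is a sketch under that assumption. The conversion is applied in the linear-light state produced by the electro-optic conversion unit 221.

```python
# BT.709 -> BT.2020 linear-RGB conversion matrix (ITU-R BT.2087).
BT709_TO_BT2020 = [
    [0.6274, 0.3293, 0.0433],
    [0.0691, 0.9195, 0.0114],
    [0.0164, 0.0880, 0.8956],
]

def convert_709_to_2020(rgb):
    """Convert one linear RGB sample from BT.709 primaries to BT.2020."""
    r, g, b = rgb
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in BT709_TO_BT2020)

# White maps to white (each matrix row sums to 1.0)...
w = convert_709_to_2020((1.0, 1.0, 1.0))
assert all(abs(c - 1.0) < 1e-6 for c in w)
# ...while pure BT.709 red lands strictly inside the wider BT.2020 gamut.
r = convert_709_to_2020((1.0, 0.0, 0.0))
assert r[0] < 1.0 and r[1] > 0.0
```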
- the photoelectric conversion unit 223 performs photoelectric conversion on the subtitle bitmap data output from the color gamut conversion unit 222 by applying the same photoelectric conversion characteristic as that applied to the video data.
- the RGB / YCbCr conversion unit 224 converts the subtitle bitmap data output from the photoelectric conversion unit 223 from the RGB domain to the YCbCr (luminance / color difference) domain.
- For the subtitle bitmap data output from the RGB / YCbCr conversion unit 224, the luminance conversion unit 225 performs adjustment so that the maximum luminance level of the subtitle bitmap data is equal to or lower than the reference level (reference white level) of the luminance of the video data, and obtains output bitmap data. In this case, if the brightness of the subtitle bitmap data has already been adjusted in consideration of rendering onto HDR video, then when the video data is HDR, the luminance conversion unit 225 outputs the input subtitle bitmap data as it is, substantially without processing.
- FIG. 32 shows a configuration example of the configuration unit 225Y related to the luminance signal Y included in the luminance conversion unit 225.
- The configuration unit 225Y includes an encoded pixel bit number adjustment unit 231 and a level adjustment unit 232.
- the encoded pixel bit number adjusting unit 231 matches the encoded pixel bit number of the luminance signal Ys of the subtitle bitmap data with the encoded pixel bit number of the video data. For example, when the number of encoded pixel bits of the luminance signal Ys is “8 bits” and the number of encoded pixel bits of the video data is “10 bits”, the number of encoded pixel bits of the luminance signal Ys is changed from “8 bits” to “10 bits”. Converted.
- The level adjustment unit 232 adjusts the luminance signal Ys whose encoded pixel bit number has been matched so that its maximum level is equal to or lower than the reference level (reference white level) of the luminance of the video data, and obtains the output luminance signal Ys′.
- FIG. 33 schematically shows the operation of the component 225Y shown in FIG.
- the illustrated example shows a case where the video data is HDR.
- Here, the reference level corresponds to the boundary between the non-highlight portion and the highlight portion of the luminance.
- a reference level exists between the maximum level (sc_high) and the minimum level (sc_low) of the luminance signal Ys after the number of encoded pixel bits is adjusted.
- the maximum level (sc_high) is adjusted to be equal to or lower than the reference level.
- a linear scale-down method is employed.
- the configuration unit 225Y (see FIG. 32) related to the luminance signal Ys included in the luminance conversion unit 225 has been described.
- In the luminance conversion unit 225, only the process of matching the encoded pixel bit number to that of the video data is performed on the color difference signals Cb and Cr.
- In this case, the entire range expressed by the bit width is regarded as 100%, and with the median value as a reference, the signal is converted so as to have an amplitude of 50% in the positive direction and 50% in the negative direction from the reference value.
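The two luma operations of the configuration unit 225Y can be sketched as below: matching the encoded pixel bit depth of the luminance signal Ys to that of the video (e.g. 8 bits to 10 bits), then linearly scaling down so the maximum level does not exceed the video's luminance reference level. The function names and the choice of a simple bit shift are illustrative assumptions.

```python
def match_bit_depth(ys, from_bits=8, to_bits=10):
    """Shift code values up to the video's encoded pixel bit depth."""
    shift = to_bits - from_bits
    return [v << shift for v in ys]

def scale_to_reference(ys, reference_level):
    """Linear scale-down so max(ys) <= reference_level (no-op if already)."""
    peak = max(ys)
    if peak <= reference_level:
        return list(ys)
    return [v * reference_level // peak for v in ys]

ys8 = [64, 128, 255]            # 8-bit subtitle luminance samples
ys10 = match_bit_depth(ys8)     # -> 10-bit code values
assert ys10 == [256, 512, 1020]
out = scale_to_reference(ys10, reference_level=940)
assert max(out) <= 940          # peak now at or below the reference level
```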
- the position / size conversion unit 208 performs position conversion processing on the subtitle bitmap data obtained by the color gamut / brightness conversion unit 207 under the control of the control unit 201.
- The position / size conversion unit 208 changes the position of the subtitle so that the subtitle is displayed at an appropriate position on the background video.
- This applies, for example, when the subtitle is HD-resolution compatible and the video is UHD resolution. Here, UHD resolution exceeds HD resolution and includes 4K resolution and 8K resolution.
- FIG. 34 (a) shows an example in which the video is UHD resolution and the subtitle is HD resolution compatible.
- the subtitle display area is represented by “subtitle area” in the figure.
- The positional relationship between the “subtitle area” and the video is represented by a reference position shared by both, that is, the top-left corner.
- the pixel position of the start point of the region is (a, b), and the pixel position of the end point is (c, d).
- the display position of the subtitle on the background video is not the position intended by the production side, but is biased to the upper right.
- FIG. 34 (b) shows an example when the position conversion process is performed.
- the pixel position of the start point of the region that is the subtitle display area is (a ′, b ′), and the pixel position of the end point is (c ′, d ′).
- Since the position coordinates of the region before the position conversion are coordinates of the HD display area, they are converted into coordinates of the UHD display area in accordance with the relationship with the video image frame, that is, based on the ratio of the UHD resolution to the HD resolution.
- the subtitle size conversion processing is performed simultaneously with the position conversion.
- Under the control of the control unit 201, the position / size conversion unit 208 also performs subtitle size conversion processing on the subtitle bitmap data obtained by the color gamut / luminance conversion unit 207, either in accordance with an operation by a user such as a viewer or automatically based on the relationship between the video resolution and the subtitle-compatible resolution.
- As shown in FIG. 35 (a), the distance from the center position of the display area (dc: display center) to the center position of the region, that is, the point that bisects the region in the horizontal and vertical directions (region center position: rc), is determined in proportion to the video resolution.
- In the size conversion, the center position rc of the region is defined with respect to the center position dc of the subtitle display area, and the position is controlled so that the distance from dc to rc is scaled by the resolution ratio in the number of pixels.
- Further, the ratio between the distance from rc to (rsx2, rsy2) and the distance from rc to (rsx1, rsy1), and the ratio between the distance from rc to (rex2, rey2) and the distance from rc to (rex1, rey1), are made to be consistent with Ratio.
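The HD-to-UHD position/size conversion described above can be sketched as follows: region coordinates expressed in the HD display area are scaled by the ratio of the UHD resolution to the HD resolution, which moves the region center rc away from the display center dc in proportion while preserving the region's shape. The function and variable names are illustrative, not from the specification.

```python
def convert_region(start, end, src=(1920, 1080), dst=(3840, 2160)):
    """Scale region start/end pixel positions from src to dst resolution."""
    rx, ry = dst[0] / src[0], dst[1] / src[1]
    (a, b), (c, d) = start, end
    return (round(a * rx), round(b * ry)), (round(c * rx), round(d * ry))

# An HD region at (480, 810)-(1440, 1020) maps to UHD coordinates:
s, e = convert_region((480, 810), (1440, 1020))
assert s == (960, 1620) and e == (2880, 2040)

# The distance from the display center dc to the region center rc scales
# by the same ratio, as described for FIG. 35(a).
dc_hd, dc_uhd = (960, 540), (1920, 1080)
rc_hd = ((480 + 1440) / 2, (810 + 1020) / 2)     # region center before
rc_uhd = ((960 + 2880) / 2, (1620 + 2040) / 2)   # region center after
assert rc_uhd[1] - dc_uhd[1] == 2 * (rc_hd[1] - dc_hd[1])
```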
- the video superimposing unit 209 superimposes the subtitle bitmap data output from the position / size converting unit 208 on the transmission video data V1 output from the video decoder 205.
- the video superimposing unit 209 mixes the subtitle bitmap data at the mixing ratio indicated by the mixing ratio information (Mixing data) obtained by the subtitle decoder 206.
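The mixing performed by the video superimposing unit 209 can be sketched per sample as below. The document does not define the numeric range of the mixing ratio information (Mixing data); normalizing it to 0..255 here is an illustrative convention.

```python
def mix_pixel(video, subtitle, mixing_ratio):
    """Blend one sample: ratio 255 = opaque subtitle, 0 = video only."""
    return (subtitle * mixing_ratio + video * (255 - mixing_ratio)) // 255

assert mix_pixel(video=100, subtitle=200, mixing_ratio=255) == 200  # subtitle only
assert mix_pixel(video=100, subtitle=200, mixing_ratio=0) == 100    # video only
assert mix_pixel(video=100, subtitle=200, mixing_ratio=128) == 150  # mid blend
```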
- the YCbCr / RGB conversion unit 210 converts the transmission video data V1 ′ on which the subtitle bitmap data is superimposed from the YCbCr (luminance / color difference) domain to the RGB domain. In this case, the YCbCr / RGB conversion unit 210 performs conversion using a conversion formula corresponding to the color gamut based on the color gamut information.
- The electro-optic conversion unit 211 performs electro-optic conversion on the transmission video data V1′ converted into the RGB domain by applying electro-optic conversion characteristics corresponding to the photoelectric conversion characteristics applied to it, and obtains display video data.
- the display mapping unit 212 performs display luminance adjustment on the display video data according to the maximum luminance display capability of the CE monitor 213.
- the CE monitor 213 displays an image based on the display video data for which the display brightness is adjusted.
- the CE monitor 213 includes, for example, an LCD (Liquid Crystal Display), an organic EL display (organic electroluminescence display), or the like.
- the reception unit 203 receives the transport stream TS transmitted from the transmission device 100 on broadcast waves or net packets.
- This transport stream TS is supplied to the system decoder 204.
- the system decoder 204 extracts the video stream VS and the subtitle stream SS from the transport stream TS. Further, the system decoder 204 extracts various information inserted in the transport stream TS (container) and sends it to the control unit 201.
- the video stream VS extracted by the system decoder 204 is supplied to the video decoder 205.
- In the video decoder 205, the video stream VS is decoded to obtain the transmission video data V1. Further, the video decoder 205 extracts the parameter sets and SEI messages inserted in each access unit constituting the video stream VS and sends them to the control unit 201.
- The SEI messages also include the dynamic range SEI message (see FIG. 4) having information (transfer function) indicating the electro-optic conversion characteristic corresponding to the photoelectric conversion characteristic of the transmission video data V1, reference level information, and the like.
- the subtitle stream SS extracted by the system decoder 204 is supplied to the subtitle decoder 206.
- In the subtitle decoder 206, the segment data of each region included in the subtitle stream SS is decoded, and the subtitle bitmap data of each region to be superimposed on the video data is obtained.
- In the subtitle decoder 206, the abstract information included in the APTS segment (see FIGS. 24 and 25) is also extracted and sent to the control unit 201.
- This abstract information includes subtitle display timing information, subtitle display control information (subtitle display position, color gamut, and dynamic range information), target video information (resolution, color gamut, and dynamic range information), and the like.
- In the subtitle decoder 206, the output timing of the subtitle bitmap data of each region is controlled by the control unit 201 based on the display timing information (“start_time_offset” and “end_time_offset”) of the subtitle included in the abstract information.
- the bitmap data of the subtitle of each region obtained by the subtitle decoder 206 is supplied to the color gamut / luminance conversion unit 207.
- under the control of the control unit 201, the color gamut / luminance conversion unit 207 matches the color gamut of the subtitle bitmap data to the color gamut of the video data, based on, for example, the color gamut information (“subtitle_color_gamut_info” and “target_video_color_gamut_info”) included in the abstract information.
- similarly, under the control of the control unit 201, the color gamut / luminance conversion unit 207 adjusts the luminance of the subtitle bitmap data to be equal to or lower than the reference level of the video data luminance, based on, for example, the dynamic range information (“subtitle_dynamic_range_info” and “target_video_dynamic_range_info”) included in the abstract information.
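A minimal sketch of this luminance adjustment, assuming the subtitle bitmap carries integer luma samples and that the subtitle's peak level and the video's reference level are both known from the abstract information; the function and parameter names are hypothetical, not taken from the patent:

```python
def adjust_subtitle_luminance(luma_samples, subtitle_peak, video_reference_level):
    """Scale subtitle luma so its peak does not exceed the video's
    reference luminance level signalled in the abstract information."""
    if subtitle_peak <= video_reference_level:
        return list(luma_samples)  # already within the reference level
    scale = video_reference_level / subtitle_peak
    return [round(y * scale) for y in luma_samples]
```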
- the bitmap data of the subtitles of each region obtained by the color gamut / luminance conversion unit 207 is supplied to the position / size conversion unit 208.
- under the control of the control unit 201, the position / size conversion unit 208 performs position conversion processing on the subtitle bitmap data of each region, based on, for example, the resolution information (“subtitle_display_area” and “target_video_resolution”) included in the abstract information.
- the position / size conversion unit 208 also performs size conversion processing on the subtitle bitmap data obtained by the color gamut / luminance conversion unit 207, either in response to an operation by a user such as the viewer or automatically based on the relationship between the video resolution and the resolution the subtitles were produced for, under the control of the control unit 201.
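The position / size conversion amounts to rescaling region coordinates from the resolution the subtitles were authored for (“subtitle_display_area”) to the actual video resolution (“target_video_resolution”). The field names come from the abstract information quoted above; the rest of this sketch is illustrative:

```python
def scale_region(x, y, w, h, subtitle_area, video_resolution):
    """Rescale a region's position and size from the subtitle authoring
    resolution to the video resolution (both given as (width, height))."""
    sx = video_resolution[0] / subtitle_area[0]
    sy = video_resolution[1] / subtitle_area[1]
    return round(x * sx), round(y * sy), round(w * sx), round(h * sy)

# An HD-authored region placed on a UHD video doubles in both dimensions.
```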
- the transmission video data V1 obtained by the video decoder 205 is supplied to the video superimposing unit 209. The subtitle bitmap data of each region obtained by the position / size conversion unit 208 is also supplied to the video superimposing unit 209. The video superimposing unit 209 superimposes the subtitle bitmap data of each region on the transmission video data V1. In this case, the subtitle bitmap data is mixed at the mixing ratio indicated by the mixing ratio information (Mixing data).
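Superimposing at a signalled mixing ratio is ordinary alpha blending. A per-sample sketch, illustrative only, with the mixing ratio normalized here to the range 0.0–1.0:

```python
def mix_sample(video_sample, subtitle_sample, mixing_ratio):
    """Blend one subtitle sample over the video sample.
    mixing_ratio 0.0 keeps the video; 1.0 shows the subtitle opaquely."""
    return round(video_sample * (1.0 - mixing_ratio)
                 + subtitle_sample * mixing_ratio)
```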
- the transmission video data V1′ obtained by the video superimposing unit 209, on which the subtitle bitmap data of each region has been superimposed, is converted from the YCbCr (luminance / chrominance) domain to the RGB domain by the YCbCr / RGB conversion unit 210 and supplied to the electro-optic conversion unit 211.
- the electro-optic conversion unit 211 applies electro-optic conversion characteristics corresponding to the photoelectric conversion characteristics applied to the transmission video data V1′, obtaining display video data for displaying an image.
- Display video data is supplied to the display mapping unit 212.
- the display mapping unit 212 adjusts the display luminance of the display video data in accordance with the maximum luminance display capability of the CE monitor 213 and other factors.
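In its simplest form, display mapping limits the decoded luminance to what the monitor can reproduce. The following is a deliberately naive sketch for illustration; a real receiver would apply a tone-mapping curve rather than a hard clip:

```python
def display_map(luminance_nits, monitor_peak_nits):
    """Clip scene luminance values (in nits) to the monitor's peak
    display capability."""
    return [min(v, monitor_peak_nits) for v in luminance_nits]
```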
- the display video data whose display brightness is adjusted in this way is supplied to the CE monitor 213.
- An image is displayed on the CE monitor 213 based on the display video data.
- as described above, the subtitle stream includes, together with the subtitle text information, the abstract information corresponding to that text information. The receiving side can therefore perform subtitle display processing using the abstract information, which reduces the processing load.
- a subtitle stream SS including a PES packet in which segments of APTS (abstract_parameter_TimedText_segment) and TBS (text body segment) are arranged in the PES data payload is transmitted.
- from the segment data of the TBS and the region position information (Region_position) included in the APTS segment, subtitle bitmap data for displaying the word “ABC” at the position of the region “region r1” is generated.
- the receiving side outputs this bitmap data from the display start timing T1 to the display end timing T3 based on the PTS1 and the display timing information (STS1, ETS1) included in the APTS segment.
- a subtitle stream SS including a PES packet in which segments of APTS (abstract_parameter_TimedText_segment) and TBS (text body segment) are arranged in the PES data payload is transmitted.
- from the segment data of the TBS and the region position information (Region_position) included in the APTS segment, subtitle bitmap data for displaying the word “DEF” at the position of the region “region r2” is generated.
- the receiving side outputs this bitmap data from the display start timing T2 to the display end timing T5 based on the PTS2 and the display timing information (STS2, ETS2) included in the APTS segment.
- a subtitle stream SS including a PES packet in which segments of APTS (abstract_parameter_TimedText_segment) and TBS (text body segment) are arranged in the PES data payload is transmitted.
- from the segment data of the TBS and the region position information (Region_position) included in the APTS segment, subtitle bitmap data for displaying the word “GHI” at the position of the region “region r3” is generated.
- the receiving side outputs this bitmap data from the display start timing T4 to the display end timing T6 based on the PTS3 and the display timing information (STS3, ETS3) included in the APTS segment.
- in the embodiment described above, TTML is used as the subtitle text information of a predetermined format having display timing information.
- present technology is not limited to this, and other timed text information having information equivalent to TTML may be used.
- a TTML derivative format may be used.
- the container is a transport stream (MPEG-2 TS)
- the present technology is not limited to the MPEG-2 TS container, and can be similarly realized even with other format containers such as MMT or ISOBMFF.
- the transmission / reception system 10 including the transmission device 100 and the reception device 200 is shown, but the configuration of the transmission / reception system to which the present technology can be applied is not limited thereto.
- the configuration of the receiver 200 may be a set-top box and a monitor connected by a digital interface such as HDMI (High-Definition Multimedia Interface). “HDMI” is a registered trademark.
- this technique can also take the following structures.
- a video encoding unit for generating a video stream including encoded video data
- a subtitle encoding unit for generating a subtitle stream including text information of a subtitle having display timing information, and abstract information having information corresponding to some of the plurality of pieces of information indicated by the text information
- a transmission apparatus comprising: a transmission unit configured to transmit a container including the video stream and the subtitle stream.
- the abstract information includes display timing information of a subtitle.
- the display timing information of the subtitle includes information of a display start timing and a display period.
- the subtitle stream is composed of PES packets including a PES header and a PES payload.
- in the transmission device according to (3), the subtitle text information and the abstract information are arranged in the PES payload, and the display start timing is indicated by a display offset from the PTS inserted in the PES header.
- the abstract information includes display control information for controlling a display state of a subtitle.
- the display control information includes at least information of a subtitle display position, a color gamut, and a dynamic range.
- the display control information further includes target video information.
- the abstract information includes notification information for notifying that an element of text information of the subtitle is changed.
- in the transmission device according to any one of (1) to (8), the subtitle encoding unit segments the subtitle text information and the abstract information to generate the subtitle stream having a predetermined number of segments.
- in the transmission device according to (9), the segment of the abstract information is arranged first in the subtitle stream, followed by the segment of the subtitle text information.
- (11) The transmission device according to any one of (1) to (10), wherein the text information of the subtitle is TTML or a derived format of the TTML.
- a video encoding step for generating a video stream including encoded video data
- a subtitle encoding step for generating a subtitle stream including text information of a subtitle having display timing information and abstract information having information corresponding to some of the plurality of pieces of information indicated by the text information
- a transmission method comprising: a transmission step of transmitting a container including the video stream and the subtitle stream by a transmission unit.
- a receiving unit that receives a container of a predetermined format including a video stream and a subtitle stream,
- the video stream includes encoded video data
- the subtitle stream includes text information of a subtitle having display timing information, and abstract information having information corresponding to some information of a plurality of pieces of information indicated by the text information
- a video decoding unit that obtains video data by performing a decoding process on the video stream
- a subtitle decoding unit that performs decoding processing on the subtitle stream to obtain bitmap data of the subtitle, and extracts the abstract information
- a video superimposing unit for superimposing the subtitle bitmap data on the video data to obtain display video data
- a receiving apparatus further comprising: a control unit that controls bitmap data of a subtitle superimposed on the video data based on the abstract information.
- in the receiving device according to (13), the abstract information includes subtitle display timing information, and the control unit controls the timing of superimposing the subtitle bitmap data on the video data based on the subtitle display timing information.
- the abstract information includes display control information for controlling the display state of the subtitle.
- in the receiving device according to (13) or (14), the control unit controls the state of the subtitle bitmap superimposed on the video data based on the display control information.
- a reception step of receiving, by a reception unit, a container in a predetermined format including the video stream and the subtitle stream,
- the video stream includes encoded video data
- the subtitle stream includes text information of a subtitle having display timing information, and abstract information having information corresponding to some information of a plurality of pieces of information indicated by the text information,
- a video decoding step for obtaining video data by performing a decoding process on the video stream;
- a subtitle decoding step of performing decoding processing on the subtitle stream to obtain bitmap data of the subtitle and extracting the abstract information;
- a receiving method further comprising: a control step of controlling bitmap data of a subtitle superimposed on the video data based on the abstract information.
- a video encoding unit that generates a video stream including encoded video data
- a subtitle encoding unit that generates one or a plurality of segments in which elements of text information of a subtitle having display timing information are arranged, and generates a subtitle stream including the one or more segments
- a transmission apparatus comprising: a transmission unit configured to transmit a container having a predetermined format including the video stream and the subtitle stream.
- in the transmission device according to (17), when generating one segment in which all elements of the subtitle text information are arranged, the subtitle encoding unit inserts information regarding the transmission order and/or the presence or absence of updates of the subtitle text information into the layer of the segment or the layer of the elements.
- the main feature of this technology is that the subtitle stream includes the text information of the subtitle and the abstract information corresponding to the text information, thereby reducing the processing load for displaying the subtitle on the receiving side. (See FIG. 12).
- DESCRIPTION OF SYMBOLS: 10 ... Transmission / reception system; 100 ... Transmission device; 101 ... Control unit; 102 ... Camera; 103 ... Video photoelectric conversion unit; 104 ... RGB/YCbCr conversion unit; 105 ... Video encoder; 106 ... Subtitle generation unit; 107 ... Text format conversion unit; 108 ... Subtitle encoder; 109 ... System encoder; 110 ... Transmission unit; 200 ... Receiving device; 201 ... Control unit; 202 ... User operation unit; 203 ... Receiving unit; 204 ... System decoder; 205 ... Video decoder; 206 ... Subtitle decoder; 207 ... Color gamut / luminance conversion unit; 208 ... Position / size conversion unit; 209 ... Video superimposing unit; 210 ... YCbCr/RGB conversion unit; 211 ... Electro-optic conversion unit; 212 ... Display mapping unit; 213 ... CE monitor; 221 ... Electro-optic conversion unit; 222 ... Color gamut conversion unit; 223 ... Photoelectric conversion unit; 224 ... RGB/YCbCr conversion unit; 225 ... Luminance conversion unit; 225Y ... Configuration unit; 231 ... Encoded pixel bit number adjustment unit; 232 ... Level adjustment unit; 261 ... Coded buffer; 262 ... Subtitle segment decoder; 263 ... Font development unit; 264 ... Bitmap buffer
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Library & Information Science (AREA)
- Human Computer Interaction (AREA)
- Computer Security & Cryptography (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Description
The present technology resides in a transmission device including:
a video encoding unit that generates a video stream including encoded video data;
a subtitle encoding unit that generates a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and
a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
The present technology also resides in a receiving device including:
a receiving unit that receives a container of a predetermined format including a video stream and a subtitle stream,
in which the video stream includes encoded video data, and
the subtitle stream includes subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and
a control unit that controls video decoding processing that decodes the video stream to obtain video data, subtitle decoding processing that decodes the subtitle stream to obtain subtitle bitmap data and extracts the abstract information, video superimposing processing that superimposes the subtitle bitmap data on the video data to obtain display video data, and bitmap data processing that processes, based on the abstract information, the subtitle bitmap data superimposed on the video data.
The present technology further resides in a transmission device including:
a video encoding unit that generates a video stream including encoded video data;
a subtitle encoding unit that generates one or more segments in which elements of subtitle text information having display timing information are arranged and generates a subtitle stream including the one or more segments; and
a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
1. Embodiment
2. Modifications
[Configuration example of the transmission / reception system]
FIG. 1 shows a configuration example of a transmission / reception system 10 as an embodiment. The transmission / reception system 10 is composed of a transmission device 100 and a reception device 200.
FIG. 2 shows a configuration example of the transmission device 100. The transmission device 100 includes a control unit 101, a camera 102, a video photoelectric conversion unit 103, an RGB/YCbCr conversion unit 104, a video encoder 105, a subtitle generation unit 106, a text format conversion unit 107, a subtitle encoder 108, a system encoder 109, and a transmission unit 110.
Here, the APTS (abstract_parameter_TimedText_segment) segment will be described. This APTS segment contains the abstract information, which holds information corresponding to some of the pieces of information indicated by the TTML.
FIG. 29 shows a configuration example of the reception device 200. The reception device 200 includes a control unit 201, a user operation unit 202, a receiving unit 203, a system decoder 204, a video decoder 205, a subtitle decoder 206, a color gamut / luminance conversion unit 207, and a position / size conversion unit 208. The reception device 200 also includes a video superimposing unit 209, a YCbCr/RGB conversion unit 210, an electro-optic conversion unit 211, a display mapping unit 212, and a CE monitor 213.
In the embodiment described above, an example has been shown in which TTML is used as the subtitle text information of a predetermined format having display timing information. However, the present technology is not limited to this, and other timed text information having information equivalent to TTML may also be used. For example, a format derived from TTML may be used.
(1) A transmission device comprising: a video encoding unit that generates a video stream including encoded video data; a subtitle encoding unit that generates a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and a transmission unit that transmits a container including the video stream and the subtitle stream.
(2) The transmission device according to (1), wherein the abstract information includes subtitle display timing information.
(3) The transmission device according to (2), wherein the subtitle display timing information includes information on a display start timing and a display period.
(4) The transmission device according to (3), wherein the subtitle stream is composed of PES packets each including a PES header and a PES payload, the subtitle text information and the abstract information are arranged in the PES payload, and the display start timing is indicated by a display offset from the PTS inserted in the PES header.
(5) The transmission device according to any one of (1) to (4), wherein the abstract information includes display control information for controlling a display state of the subtitle.
(6) The transmission device according to (5), wherein the display control information includes at least one of information on the subtitle display position, the color gamut, and the dynamic range.
(7) The transmission device according to (6), wherein the display control information further includes information on the target video.
(8) The transmission device according to any one of (1) to (7), wherein the abstract information includes notification information for notifying that an element of the subtitle text information has changed.
(9) The transmission device according to any one of (1) to (8), wherein the subtitle encoding unit segments the subtitle text information and the abstract information to generate the subtitle stream having a predetermined number of segments.
(10) The transmission device according to (9), wherein, in the subtitle stream, the segment of the abstract information is arranged first, followed by the segment of the subtitle text information.
(11) The transmission device according to any one of (1) to (10), wherein the subtitle text information is TTML or a format derived from TTML.
(12) A transmission method comprising: a video encoding step of generating a video stream including encoded video data; a subtitle encoding step of generating a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and a transmission step of transmitting, by a transmission unit, a container including the video stream and the subtitle stream.
(13) A receiving device comprising: a receiving unit that receives a container of a predetermined format including a video stream and a subtitle stream, in which the video stream includes encoded video data and the subtitle stream includes subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; a video decoding unit that performs decoding processing on the video stream to obtain video data; a subtitle decoding unit that performs decoding processing on the subtitle stream to obtain subtitle bitmap data and extracts the abstract information; a video superimposing unit that superimposes the subtitle bitmap data on the video data to obtain display video data; and a control unit that controls, based on the abstract information, the subtitle bitmap data superimposed on the video data.
(14) The receiving device according to (13), wherein the abstract information includes subtitle display timing information, and the control unit controls the timing of superimposing the subtitle bitmap data on the video data based on the subtitle display timing information.
(15) The receiving device according to (13) or (14), wherein the abstract information includes display control information for controlling a display state of the subtitle, and the control unit controls the state of the subtitle bitmap superimposed on the video data based on the display control information.
(16) A receiving method comprising: a receiving step of receiving, by a receiving unit, a container of a predetermined format including a video stream and a subtitle stream, in which the video stream includes encoded video data and the subtitle stream includes subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; a video decoding step of performing decoding processing on the video stream to obtain video data; a subtitle decoding step of performing decoding processing on the subtitle stream to obtain subtitle bitmap data and extracting the abstract information; a video superimposing step of superimposing the subtitle bitmap data on the video data to obtain display video data; and a control step of controlling, based on the abstract information, the subtitle bitmap data superimposed on the video data.
(17) A transmission device comprising: a video encoding unit that generates a video stream including encoded video data; a subtitle encoding unit that generates one or more segments in which elements of subtitle text information having display timing information are arranged and generates a subtitle stream including the one or more segments; and a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
(18) The transmission device according to (17), wherein, when generating one segment in which all elements of the subtitle text information are arranged, the subtitle encoding unit inserts information on the transmission order and/or the presence or absence of updates of the subtitle text information into the layer of the segment or the layer of the elements.
(19) The transmission device according to (17) or (18), wherein the subtitle text information is TTML or a format derived from TTML.
(20) A transmission method comprising: a video encoding step of generating a video stream including encoded video data; a subtitle encoding step of generating one or more segments in which elements of subtitle text information having display timing information are arranged and generating a subtitle stream including the one or more segments; and a transmission step of transmitting, by a transmission unit, a container of a predetermined format including the video stream and the subtitle stream.
100 ... Transmission device
101 ... Control unit
102 ... Camera
103 ... Video photoelectric conversion unit
104 ... RGB/YCbCr conversion unit
105 ... Video encoder
106 ... Subtitle generation unit
107 ... Text format conversion unit
108 ... Subtitle encoder
109 ... System encoder
110 ... Transmission unit
200 ... Receiving device
201 ... Control unit
202 ... User operation unit
203 ... Receiving unit
204 ... System decoder
205 ... Video decoder
206 ... Subtitle decoder
207 ... Color gamut / luminance conversion unit
208 ... Position / size conversion unit
209 ... Video superimposing unit
210 ... YCbCr/RGB conversion unit
211 ... Electro-optic conversion unit
212 ... Display mapping unit
213 ... CE monitor
221 ... Electro-optic conversion unit
222 ... Color gamut conversion unit
223 ... Photoelectric conversion unit
224 ... RGB/YCbCr conversion unit
225 ... Luminance conversion unit
225Y ... Configuration unit
231 ... Encoded pixel bit number adjustment unit
232 ... Level adjustment unit
261 ... Coded buffer
262 ... Subtitle segment decoder
263 ... Font development unit
264 ... Bitmap buffer
Claims (20)
- A transmission device comprising: a video encoding unit that generates a video stream including encoded video data; a subtitle encoding unit that generates a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
- The transmission device according to claim 1, wherein the abstract information includes subtitle display timing information.
- The transmission device according to claim 2, wherein the subtitle display timing information includes information on a display start timing and a display period.
- The transmission device according to claim 3, wherein the subtitle stream is composed of PES packets each including a PES header and a PES payload, the subtitle text information and the abstract information are arranged in the PES payload, and the display start timing is indicated by a display offset from the PTS inserted in the PES header.
- The transmission device according to claim 1, wherein the abstract information includes display control information for controlling a display state of the subtitle.
- The transmission device according to claim 5, wherein the display control information includes at least one of information on the subtitle display position, the color gamut, and the dynamic range.
- The transmission device according to claim 6, wherein the display control information further includes information on the target video.
- The transmission device according to claim 1, wherein the abstract information includes notification information for notifying that an element of the subtitle text information has changed.
- The transmission device according to claim 1, wherein the subtitle encoding unit segments the subtitle text information and the abstract information to generate the subtitle stream having a predetermined number of segments.
- The transmission device according to claim 9, wherein, in the subtitle stream, the segment of the abstract information is arranged first, followed by the segment of the subtitle text information.
- The transmission device according to claim 1, wherein the subtitle text information is TTML or a format derived from TTML.
- A transmission method comprising: a video encoding step of generating a video stream including encoded video data; a subtitle encoding step of generating a subtitle stream including subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and a transmission step of transmitting, by a transmission unit, a container of a predetermined format including the video stream and the subtitle stream.
- A receiving device comprising: a receiving unit that receives a container of a predetermined format including a video stream and a subtitle stream, in which the video stream includes encoded video data and the subtitle stream includes subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; and a control unit that controls video decoding processing that decodes the video stream to obtain video data, subtitle decoding processing that decodes the subtitle stream to obtain subtitle bitmap data and extracts the abstract information, video superimposing processing that superimposes the subtitle bitmap data on the video data to obtain display video data, and bitmap data processing that processes, based on the abstract information, the subtitle bitmap data superimposed on the video data.
- The receiving device according to claim 13, wherein the abstract information includes subtitle display timing information, and in the bitmap data processing, the timing of superimposing the subtitle bitmap data on the video data is controlled based on the subtitle display timing information.
- The receiving device according to claim 13, wherein the abstract information includes display control information for controlling a display state of the subtitle, and in the bitmap data processing, the state of the subtitle bitmap superimposed on the video data is controlled based on the display control information.
- A receiving method comprising: a receiving step of receiving, by a receiving unit, a container of a predetermined format including a video stream and a subtitle stream, in which the video stream includes encoded video data and the subtitle stream includes subtitle text information having display timing information and abstract information having information corresponding to some of the pieces of information indicated by the text information; a video decoding step of performing decoding processing on the video stream to obtain video data; a subtitle decoding step of performing decoding processing on the subtitle stream to obtain subtitle bitmap data and extracting the abstract information; a video superimposing step of superimposing the subtitle bitmap data on the video data to obtain display video data; and a control step of controlling, based on the abstract information, the subtitle bitmap data superimposed on the video data.
- A transmission device comprising: a video encoding unit that generates a video stream including encoded video data; a subtitle encoding unit that generates one or more segments in which elements of subtitle text information having display timing information are arranged and generates a subtitle stream including the one or more segments; and a transmission unit that transmits a container of a predetermined format including the video stream and the subtitle stream.
- The transmission device according to claim 17, wherein, when generating one segment in which all elements of the subtitle text information are arranged, the subtitle encoding unit inserts information on the transmission order and/or the presence or absence of updates of the subtitle text information into the layer of the segment or the layer of the elements.
- The transmission device according to claim 17, wherein the subtitle text information is TTML or a format derived from TTML.
- A transmission method comprising: a video encoding step of generating a video stream including encoded video data; a subtitle encoding step of generating one or more segments in which elements of subtitle text information having display timing information are arranged and generating a subtitle stream including the one or more segments; and a transmission step of transmitting, by a transmission unit, a container of a predetermined format including the video stream and the subtitle stream.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017528617A JP6943179B2 (ja) | 2015-07-16 | 2016-07-05 | 送信装置、送信方法、受信装置および受信方法 |
CN201680040344.2A CN107852517A (zh) | 2015-07-16 | 2016-07-05 | 传输装置、传输方法、接收装置和接收方法 |
EP16824339.2A EP3324637B1 (en) | 2015-07-16 | 2016-07-05 | Transmission device, transmission method, receiving device and receiving method |
US15/739,827 US20180376173A1 (en) | 2015-07-16 | 2016-07-05 | Transmission device, transmission method, reception device, and reception method |
AU2016294096A AU2016294096B2 (en) | 2015-07-16 | 2016-07-05 | Transmission device, transmission method, receiving device and receiving method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015142497 | 2015-07-16 | ||
JP2015-142497 | 2015-07-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017010359A1 true WO2017010359A1 (ja) | 2017-01-19 |
Family
ID=57758069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/069955 WO2017010359A1 (ja) | 2015-07-16 | 2016-07-05 | 送信装置、送信方法、受信装置および受信方法 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20180376173A1 (ja) |
EP (1) | EP3324637B1 (ja) |
JP (3) | JP6943179B2 (ja) |
CN (1) | CN107852517A (ja) |
AU (1) | AU2016294096B2 (ja) |
WO (1) | WO2017010359A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020133372A1 (zh) * | 2018-12-29 | 2020-07-02 | 深圳市大疆创新科技有限公司 | 视频的字幕处理方法和导播系统 |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200095651A (ko) * | 2019-02-01 | 2020-08-11 | 삼성전자주식회사 | 고 동적 범위 콘텐트를 재생하는 전자 장치 및 그 방법 |
US10970882B2 (en) | 2019-07-24 | 2021-04-06 | At&T Intellectual Property I, L.P. | Method for scalable volumetric video coding |
US10979692B2 (en) * | 2019-08-14 | 2021-04-13 | At&T Intellectual Property I, L.P. | System and method for streaming visible portions of volumetric video |
CN113206853B (zh) * | 2021-05-08 | 2022-07-29 | 杭州当虹科技股份有限公司 | 一种视频批改结果保存改进方法 |
CN117714805A (zh) * | 2022-09-08 | 2024-03-15 | 海信电子科技(深圳)有限公司 | 一种显示设备及字幕显示方法 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101312506B (zh) * | 2007-05-21 | 2010-12-22 | Nec卡西欧移动通信株式会社 | 附带字幕的影像重现装置 |
DE102009037208B3 (de) * | 2009-08-12 | 2011-06-01 | Aesculap Ag | Verfahren und Vorrichtung zur Bestimmung der Lage einer Tangentialebene an drei Extrempunkten eines Körpers |
JP2011239169A (ja) * | 2010-05-10 | 2011-11-24 | Sony Corp | 立体画像データ送信装置、立体画像データ送信方法、立体画像データ受信装置および立体画像データ受信方法 |
JP5685969B2 (ja) * | 2011-02-15 | 2015-03-18 | ソニー株式会社 | 表示制御方法、表示制御装置 |
WO2012177874A2 (en) * | 2011-06-21 | 2012-12-27 | The Nielsen Company (Us), Llc | Methods and apparatus to measure exposure to streaming media |
JP2013021415A (ja) * | 2011-07-07 | 2013-01-31 | Sony Corp | 送信装置、送信方法および受信装置 |
CN103688532B (zh) * | 2011-07-29 | 2018-05-04 | 索尼公司 | 流式传输分发装置和方法、流式传输接收装置和方法、流式传输系统 |
EP2793464A4 (en) * | 2011-12-16 | 2015-09-02 | Sony Corp | RECEIVING DEVICE AND CONTROL METHOD THEREFOR, DEVICE, METHOD, AND BROADCASTING SYSTEM, AND PROGRAM |
JPWO2013105401A1 (ja) * | 2012-01-13 | 2015-05-11 | ソニー株式会社 | 送信装置、送信方法、受信装置および受信方法 |
JP5713142B1 (ja) * | 2014-12-05 | 2015-05-07 | ソニー株式会社 | 受信装置、およびデータ処理方法 |
CN107005733B (zh) * | 2014-12-19 | 2020-06-16 | 索尼公司 | 发送装置、发送方法、接收装置以及接收方法 |
-
2016
- 2016-07-05 WO PCT/JP2016/069955 patent/WO2017010359A1/ja active Application Filing
- 2016-07-05 AU AU2016294096A patent/AU2016294096B2/en not_active Ceased
- 2016-07-05 EP EP16824339.2A patent/EP3324637B1/en active Active
- 2016-07-05 CN CN201680040344.2A patent/CN107852517A/zh active Pending
- 2016-07-05 JP JP2017528617A patent/JP6943179B2/ja active Active
- 2016-07-05 US US15/739,827 patent/US20180376173A1/en not_active Abandoned
-
2021
- 2021-09-08 JP JP2021145958A patent/JP7259901B2/ja active Active
-
2023
- 2023-04-05 JP JP2023061094A patent/JP2023076613A/ja active Pending
Non-Patent Citations (1)
Title |
---|
DATA CODING AND TRANSMISSION SPECIFICATION FOR DIGITAL BROADCASTING ARIB STD-B24, VER.6.1, vol. 3, 16 December 2014 (2014-12-16), pages 56 - 61, XP009508389 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020133372A1 (zh) * | 2018-12-29 | 2020-07-02 | 深圳市大疆创新科技有限公司 | 视频的字幕处理方法和导播系统 |
Also Published As
Publication number | Publication date |
---|---|
EP3324637A1 (en) | 2018-05-23 |
JP6943179B2 (ja) | 2021-09-29 |
JP2021185714A (ja) | 2021-12-09 |
JP2023076613A (ja) | 2023-06-01 |
EP3324637A4 (en) | 2019-03-20 |
JP7259901B2 (ja) | 2023-04-18 |
EP3324637B1 (en) | 2021-12-22 |
CN107852517A (zh) | 2018-03-27 |
AU2016294096B2 (en) | 2020-10-29 |
JPWO2017010359A1 (ja) | 2018-04-26 |
US20180376173A1 (en) | 2018-12-27 |
AU2016294096A1 (en) | 2018-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7259901B2 (ja) | 送信方法および送信装置 | |
US11916594B2 (en) | Transmission apparatus, transmission method, reception apparatus, and reception method | |
US10542304B2 (en) | Transmission device, transmission method, reception device, and reception method | |
JP6702300B2 (ja) | 送信装置、送信方法、受信装置および受信方法 | |
US10375448B2 (en) | Reception device, reception method, transmission device, and transmission method | |
US10575062B2 (en) | Reception apparatus, reception method, transmission apparatus, and transmission method | |
WO2016108268A1 (ja) | 送信装置、送信方法、受信装置および受信方法 | |
US10349144B2 (en) | Reception device, receiving method, transmission device, and transmitting method | |
US20200068247A1 (en) | Reception apparatus, reception method, and transmission apparatus | |
JP2024015131A (ja) | 送信装置、送信方法、受信装置および受信方法 | |
US10904592B2 (en) | Transmission apparatus, transmission method, image processing apparatus, image processing method, reception apparatus, and reception method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16824339 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017528617 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2016294096 Country of ref document: AU Date of ref document: 20160705 Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2016824339 Country of ref document: EP |