CN101360251B - Storage medium recording text-based subtitle stream, apparatus and method reproducing thereof

Info

Publication number: CN101360251B
Application number: CN200810135887XA
Authority: CN
Inventors: 郑吉洙; 朴成煜; 金光玟
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2004-02-28
Filing date: 2005-02-28
Publication date: 2011-02-16
Anticipated expiration: 2025-02-28
Also published as: HK1126605A1; TWI320925B; JP2011035922A; MY139164A; CN101059984B; KR100727921B1; ES2364644T3; RU2490730C2; TWI417873B; CN101360251A; CN1774759A; KR20050088035A; TW200529202A; CN100479047C; JP5307099B2; ATE504919T1; CN101059984A; HK1088434A1; JP4776614B2; TW201009820A

Abstract

A storage medium storing a multimedia image stream and a text-based subtitle stream, and a reproducing apparatus and a reproducing method therefor are provided to reproduce the text-based subtitle data stream recorded separately from the multimedia image stream such that the subtitle data can be easily produced and edited and a caption can be provided in a plurality of languages. The storage medium stores: image data; and text-based subtitle data to display a caption on an image based on the image data, wherein the subtitle data includes: one style information item specifying an output style of the caption; and a plurality of presentation information items that are displaying units of the caption, and the subtitle data is separated and recorded separately from the image data. Accordingly, a caption can be provided in a plurality of languages, and can be easily produced and edited, and the output style of caption data can be changed in a variety of ways. In addition, part of a caption can be emphasized or a separate style that a user can change can be applied.

Description

Storage medium recording a text-based subtitle stream, and apparatus and method for reproducing the same

This application is a divisional of the patent application filed on February 28, 2005, with application number 200580000307.0, entitled "Storage medium recording a text-based subtitle stream, and apparatus and method for reproducing the same".

Technical field

The present invention relates to the reproduction of multimedia images and, more particularly, to a storage medium on which a multimedia image stream and a text-based subtitle stream are recorded, and to a reproducing apparatus and a reproducing method for reproducing the multimedia stream and the text-based subtitle stream recorded on the storage medium.

Background technology

To provide a high-density (HD) multimedia image, a video stream, an audio stream, a presentation graphics stream that provides subtitles, and an interactive graphics stream that provides buttons or menus for user interaction are multiplexed into a main stream, also called an audio-visual (AV) data stream, and recorded on a storage medium. In particular, the presentation graphics stream that provides subtitles supplies bitmap-based images in order to display subtitles or captions on the image.

Summary of the invention

Technical problem

In addition to its large size, bitmap-based caption data has the problem that producing the subtitle or caption data and editing the produced data are difficult, because the caption data is multiplexed with other data streams such as the video, audio, and interactive graphics streams. A further problem is that the output style of the caption data cannot be changed in a variety of ways, that is, one output style of a caption cannot be changed into another.

Technical scheme

Aspects of the present invention advantageously provide a storage medium on which a text-based subtitle stream is recorded, and a reproducing apparatus and method for reproducing the text-based subtitle data recorded on the storage medium.

Beneficial effect

The present invention advantageously provides a storage medium that stores a text-based subtitle data stream separately from image data, and a reproducing apparatus and a reproducing method for reproducing the text-based subtitle data stream, so that producing subtitle data and editing the produced subtitle data become simpler. In addition, captions can be provided in a plurality of languages regardless of the number of subtitle data items.

Description of drawings

The present invention will be understood more clearly from the following detailed description of exemplary embodiments and from the claims when read in connection with the accompanying drawings, all of which form a part of this disclosure. Although the following written and illustrated disclosure focuses on exemplary embodiments of the invention, it should be clearly understood that these are given by way of illustration and example only and that the invention is not limited to them. The spirit and scope of the present invention are limited only by the terms of the appended claims. The following is a brief description of the drawings, in which:

Fig. 1 is a diagram explaining a multimedia data structure recorded on a storage medium according to an embodiment of the present invention;

Fig. 2 illustrates an example data structure of the clip AV stream shown in Fig. 1 and a text-based subtitle stream according to an embodiment of the present invention;

Fig. 3 is a diagram explaining an example data structure of a text-based subtitle stream according to an embodiment of the present invention;

Fig. 4 illustrates a text-based subtitle stream having the data structure shown in Fig. 3 according to an embodiment of the present invention;

Fig. 5 illustrates the dialog style unit shown in Fig. 3 according to an embodiment of the present invention;

Fig. 6 is a diagram explaining an example data structure of a dialog style unit according to an embodiment of the present invention;

Fig. 7 is a diagram explaining an example data structure of a dialog style unit according to another embodiment of the present invention;

Fig. 8 illustrates the exemplary dialog style unit shown in Fig. 6 or Fig. 7 according to an embodiment of the present invention;

Figs. 9A and 9B illustrate exemplary clip information files including a plurality of font sets referenced by font information according to an embodiment of the present invention;

Fig. 10 is a diagram showing the locations of a plurality of font files referenced by the font file information shown in Figs. 9A and 9B;

Fig. 11 is a diagram explaining an example data structure of the dialog presentation unit shown in Fig. 3 according to an embodiment of the present invention;

Figs. 12A and 12B are diagrams explaining example data structures of the dialog presentation unit shown in Fig. 3 according to an embodiment of the present invention;

Fig. 13 illustrates the dialog presentation unit shown in Figs. 11 through 12B according to an embodiment of the present invention;

Fig. 14 is a diagram explaining an example data structure of the dialog text information shown in Fig. 13;

Fig. 15 illustrates the dialog text information shown in Fig. 13 according to an embodiment of the present invention;

Fig. 16 is a diagram explaining constraints on continuously reproducing consecutive dialog presentation units (DPUs);

Fig. 17 is a diagram explaining an exemplary reproducing apparatus for reproducing a text-based subtitle stream according to an embodiment of the present invention;

Fig. 18 is a diagram explaining the preloading process of a text-based subtitle stream in an exemplary reproducing apparatus according to an embodiment of the present invention;

Fig. 19 is a diagram explaining the reproduction process of a dialog presentation unit (DPU) in an exemplary reproducing apparatus according to an embodiment of the present invention;

Fig. 20 is a diagram explaining the process in which a text-based subtitle stream is synchronized with moving image data and output in an exemplary reproducing apparatus according to an embodiment of the present invention;

Fig. 21 is a diagram explaining the process in which a text-based subtitle stream is output to a screen in an exemplary reproducing apparatus according to an embodiment of the present invention;

Fig. 22 is a diagram explaining the process of rendering a text-based subtitle stream in an exemplary reproducing apparatus according to an embodiment of the present invention;

Fig. 23 illustrates an exemplary status register disposed in an exemplary reproducing apparatus for reproducing a text-based subtitle data stream according to an embodiment of the present invention; and

Fig. 24 is a flowchart of a method of reproducing a text-based subtitle stream according to an embodiment of the present invention.

Preferred forms

According to an aspect of the present invention, an apparatus for reproducing image data and text-based subtitle data recorded on a storage medium, to display a caption on an image based on the image data, comprises: a video decoder that decodes the image data; and a subtitle decoder that converts presentation information items into bitmap images based on style information and controls the converted presentation information items to be output in synchronization with the decoded image data. The text-based subtitle data includes the presentation information items, which are the units in which the caption is displayed, and the style information, which specifies the output style of the caption.

The subtitle decoder may decode the text-based subtitle data recorded separately from the image data and output the subtitle data overlaid on the decoded image data. The style information and the presentation information may be formed in units of packetized elementary streams (PESs), and the subtitle decoder may parse and process the style information and the presentation information in units of PESs.

The style information may be formed as one PES and recorded at the front of the subtitle data, the plurality of presentation information items may be recorded in units of PESs after the style information, and the subtitle decoder may apply the one style information item to the plurality of presentation information items.

In addition, the presentation information may include text information indicating the content of the caption, and composition information controlling the output of the bitmap image obtained by converting the text information. The subtitle decoder may control the time at which the converted text information is output by referring to the composition information.

The presentation information may specify one or more window regions in which the caption is to be output on the screen, and the subtitle decoder may output the converted text information to the one or more window regions at the same time.

The output start time and output end time of the presentation information in the composition information may be defined as time information on the global time axis used in a playlist, which is the reproduction unit of the image data, and the subtitle decoder may synchronize the output of the converted text information with the output of the decoded image data by referring to the output start time and the output end time.

If the output end time of the presentation information item currently being reproduced is the same as the output start time of the next presentation information item, the subtitle decoder may reproduce the two presentation information items continuously.

If the next presentation information item does not need to be reproduced continuously, the subtitle decoder may reset an internal buffer between the output start time and the output end time; if the next presentation information item must be reproduced continuously, the buffer may be maintained without being reset.
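The continuity rule described above can be illustrated with a minimal sketch. It assumes integer times on the playlist's global time axis and interprets the reset as happening once an item has been shown; the names PresentationItem, is_continuous, and buffer_reset_plan are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PresentationItem:
    # Times are positions on the playlist's global time axis; the integer
    # unit used here is an assumption of this sketch.
    start: int
    end: int
    text: str


def is_continuous(current: PresentationItem, following: PresentationItem) -> bool:
    """True when the current item ends exactly where the next one starts."""
    return current.end == following.start


def buffer_reset_plan(items: List[PresentationItem]) -> List[bool]:
    """For each item, report whether the buffer is reset once it has been shown.

    The buffer is kept only when the next item continues seamlessly.
    """
    plan = []
    for index, item in enumerate(items):
        has_next = index + 1 < len(items)
        keep = has_next and is_continuous(item, items[index + 1])
        plan.append(not keep)
    return plan


items = [
    PresentationItem(0, 100, "Hello."),
    PresentationItem(100, 200, "How are you?"),  # starts where the first ends
    PresentationItem(250, 300, "Fine."),         # gap, so not continuous
]
plan = buffer_reset_plan(items)
```

In this example only the first item is followed by a seamless successor, so only its buffer contents are kept.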

The style information may be a set of output styles that is predefined by the producer of the storage medium and applied to the presentation information, and the subtitle decoder may convert the plurality of presentation information items recorded after the style information into bitmap images based on the style information.

In addition, the text information in the presentation information may include the text to be converted into a bitmap image and in-line style information to be applied to only part of the text. By applying the in-line style information to only that part of the text, to which the style information predefined by the producer has already been applied, the subtitle decoder may emphasize the specified part of the text.

As the in-line style information, the subtitle decoder may apply to the part of the text a relative value of predetermined font information, or a predetermined absolute value, included in the style information predefined by the producer.

In addition, the style information may further include user changeable style information. After receiving from the user selection information on one of the user changeable style information items, the subtitle decoder may apply to the text the style information predefined by the producer, then the in-line style information, and finally the user changeable style information item corresponding to the selection information.

As the user changeable style information, the subtitle decoder may apply to the text a relative value of predetermined font information among the style information items predefined by the producer.

If the storage medium permits predefined style information defined in the reproducing apparatus in addition to the style information predefined by the producer, the subtitle decoder may apply that predefined style information to the text.

In addition, the style information may include a set of color palettes to be applied to the presentation information, and the subtitle decoder may convert all presentation information items after the style information into bitmap images based on the colors defined in the color palettes.

Separately from the set of color palettes included in the style information, the presentation information may also include a set of color palettes and a color update flag. If the color update flag is set to '1', the subtitle decoder may apply the set of color palettes included in the presentation information; if the color update flag is set to '0', the subtitle decoder may apply the original set of color palettes included in the style information.
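The palette selection described above can be sketched as follows. This is an illustration only: the function name select_palette, the palette layout (index mapped to red, green, blue, and alpha values), and the fallback when an item carries no palette are assumptions of this sketch.

```python
def select_palette(style_palette, item_palette, color_update_flag):
    """Return the palette set a decoder would use for one presentation item.

    style_palette: palette set carried in the style information.
    item_palette: palette set carried in the presentation item, or None.
    color_update_flag: 1 selects the item's palette, 0 the original one.
    """
    if color_update_flag == 1 and item_palette is not None:
        return item_palette
    # Flag is 0, or (an assumption of this sketch) the item has no palette.
    return style_palette


# Hypothetical palettes: index -> (red, green, blue, alpha).
STYLE_PALETTE = {0: (0, 0, 0, 0), 1: (255, 255, 255, 255)}
ITEM_PALETTE = {0: (0, 0, 0, 0), 1: (255, 255, 0, 255)}

with_update = select_palette(STYLE_PALETTE, ITEM_PALETTE, 1)
without_update = select_palette(STYLE_PALETTE, ITEM_PALETTE, 0)
```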

By setting the color update flag to '1' and gradually changing the transparency values of the color palettes included in a plurality of consecutive presentation information items, the subtitle decoder may implement a fade-in/fade-out effect. When the fade-in/fade-out effect is finished, the color look-up table (CLUT) in the subtitle decoder may be reset based on the original set of color palettes included in the style information.
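A fade-in of this kind can be sketched under stated assumptions: each palette entry is taken to be a (red, green, blue, alpha) tuple, one palette is generated per consecutive presentation information item, and the Clut class is a hypothetical stand-in for the decoder's color look-up table.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int, int]  # (red, green, blue, alpha); layout assumed
Palette = Dict[int, Color]


def fade_in_palettes(original: Palette, steps: int) -> List[Palette]:
    """Build one palette per consecutive item, with alpha rising to its full value."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    palettes = []
    for step in range(1, steps + 1):
        palettes.append({
            index: (r, g, b, alpha * step // steps)
            for index, (r, g, b, alpha) in original.items()
        })
    return palettes


class Clut:
    """Hypothetical color look-up table held by the subtitle decoder."""

    def __init__(self, original: Palette) -> None:
        self.original = dict(original)
        self.active = dict(original)

    def update(self, palette: Palette) -> None:
        # Corresponds to an item whose color update flag is 1.
        self.active = dict(palette)

    def reset(self) -> None:
        # Restore the original palette set once the effect is finished.
        self.active = dict(self.original)


original = {1: (255, 255, 255, 255)}
clut = Clut(original)
alphas = []
for palette in fade_in_palettes(original, 4):
    clut.update(palette)
    alphas.append(clut.active[1][3])
clut.reset()
```

A fade-out would use the same palettes with the transparency decreasing instead of increasing.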

In addition, the style information may include region information indicating the position of the window region in which the converted presentation information is to be output on the image, and font information required to convert the presentation information into a bitmap image. The subtitle decoder may convert the presentation information into a bitmap image by using the region information and the font information.

The font information may include at least one of an output start position, an output direction, an alignment, a line spacing, a font identifier, a font style, a font size, or a color of the converted presentation information, and the subtitle decoder may convert the presentation information into a bitmap image based on the font information.

As the font identifier, the subtitle decoder may refer to indication information on a font file included in a clip information file, which stores attribute information of a recording unit of the image data.

In addition, before the image data is reproduced, the subtitle decoder may buffer the subtitle data and the font file referenced by the subtitle data.

In addition, if a plurality of subtitle data items supporting a plurality of languages are recorded on the storage medium, the subtitle decoder may receive selection information on a desired language from the user and reproduce the subtitle data item corresponding to the selection information among the plurality of subtitle data items.

According to another aspect of the present invention, a method of reproducing data from a storage medium storing image data and text-based subtitle data, to display a caption on an image based on the image data, comprises: decoding the image data; reading style information and presentation information items; converting the presentation information items into bitmap images based on the style information; and controlling the converted presentation information to be output in synchronization with the decoded image data. The text-based subtitle data includes the presentation information, which is the unit in which the caption is displayed, and the style information, which specifies the output style of the caption.

According to another aspect of the present invention, there is provided a storage medium storing: image data; and text-based subtitle data for displaying a caption on an image based on the image data, wherein the subtitle data includes one style information item specifying the output style of the caption and a plurality of presentation information items that are the display units of the caption, and the subtitle data is separated from the image data and recorded separately.

Additional aspects and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Embodiments of the present invention

The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.

Referring to Fig. 1, a storage medium according to an exemplary embodiment of the present invention (for example, the medium 230 shown in Fig. 2) is formed of a plurality of layers in order to manage the multimedia data structure 100 of the multimedia image stream recorded on it. The multimedia data structure 100 includes clips 110, which are the recording units of a multimedia image; playlists 120, which are the reproduction units of a multimedia image; movie objects 130, which include navigation commands used to reproduce a multimedia image; and an index table 140, which specifies the movie object to be reproduced first and the titles of the movie objects 130.

A clip 110 is implemented as one object that includes a clip AV stream 112, which carries the audio-visual (AV) data stream of a high-picture-quality movie, and clip information 114, which holds the attributes of that AV data stream. For example, the AV data stream may be compressed according to a standard such as that of the Moving Picture Experts Group (MPEG). However, in all aspects of the present invention, the clip 110 does not require the AV data stream 112 to be compressed. In addition, the clip information 114 may include the audio/video attributes of the AV data stream 112, an entry point map in which information on the positions of random-access entry points is recorded in units of predetermined sections, and the like.

A playlist 120 is a set of reproduction time intervals of these clips 110, and each reproduction time interval is called a play item 122. A movie object 130 is formed of a navigation command program, and these navigation commands start reproduction of a playlist 120, switch between movie objects 130, or manage reproduction of a playlist 120 according to the user's preference.

The index table 140 is a table at the top layer of the storage medium that defines a plurality of titles and menus. It includes start position information for all titles and menus, so that a title or menu selected by a user operation such as a title search or a menu call can be reproduced. The index table 140 also includes start position information for the title and menu that are reproduced first, automatically, when the storage medium is placed in the reproducing apparatus.
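The layered structure of Fig. 1 can be pictured with the following minimal sketch. The class and field names are hypothetical, navigation commands are reduced to plain strings, and the stream name and time values are invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Clip:
    av_stream: str              # stands in for the clip AV stream 112
    clip_info: Dict[str, str]   # stands in for the clip information 114


@dataclass
class PlayItem:
    clip: Clip
    in_time: int    # start of the reproduction interval (unit assumed)
    out_time: int   # end of the reproduction interval


@dataclass
class Playlist:
    play_items: List[PlayItem] = field(default_factory=list)

    def duration(self) -> int:
        """Total length of all reproduction intervals in the playlist."""
        return sum(item.out_time - item.in_time for item in self.play_items)


@dataclass
class MovieObject:
    commands: List[str]  # navigation commands; plain strings in this sketch


@dataclass
class IndexTable:
    first_playback: MovieObject
    titles: Dict[str, MovieObject]


clip = Clip("clip-0001", {"video": "example"})
playlist = Playlist([PlayItem(clip, 0, 100), PlayItem(clip, 200, 250)])
movie = MovieObject(["play playlist"])
index_table = IndexTable(first_playback=movie, titles={"Title 1": movie})
```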

Of these items, the structure of the clip AV stream, in which a multimedia image is compression-coded, will now be explained with reference to Fig. 2. Fig. 2 illustrates an example data structure of the AV data stream 210 shown in Fig. 1 and a text-based subtitle stream 220 according to an embodiment of the present invention.

Referring to Fig. 2, in order to solve the problems of the bitmap-based caption data described above, a text-based subtitle data stream 220 according to an embodiment of the present invention is provided separately from the clip AV data stream 210 recorded on a storage medium 230 such as a digital versatile disc (DVD). The AV data stream 210 includes a video stream 202, an audio stream 204, a presentation graphics stream 206 for providing subtitle data, and an interactive graphics stream 208 for providing buttons and menus for user interaction, all of which are multiplexed into a moving-image main stream, called an audio-visual (AV) data stream, and recorded on the storage medium 230.

The text-based subtitle data 220 according to an embodiment of the present invention is data for providing subtitles or captions for the multimedia image recorded on the storage medium 230, and can be implemented by using a markup language such as the extensible markup language (XML). Here, however, the subtitles or captions of the multimedia image are provided using binary data. Hereinafter, the text-based subtitle data 220 that provides the captions of the multimedia image using binary data will be referred to simply as the "text-based subtitle stream". The presentation graphics stream 206 for providing subtitle or caption data also provides bitmap-based subtitle data in order to display subtitles (or captions) on the screen.

Because the text-based subtitle data stream 220 is recorded separately from the AV data stream 210 and is not multiplexed with it, the size of the text-based subtitle data stream 220 is not limited. As a result, subtitles or captions can be provided in a plurality of languages. Moreover, the text-based subtitle data stream 220 can be produced and edited conveniently and without difficulty.

The text-based subtitle stream 220 is then converted into a bitmap graphic image and output on the screen, overlaid on the multimedia image. The process of converting text-based data into a graphic-based bitmap image in this way is called rendering. The text-based subtitle stream 220 includes the information required to render the caption text.

The structure of the text-based subtitle stream 220, which includes the rendering information, will now be explained with reference to Fig. 3. Fig. 3 is a diagram explaining an example data structure of the text-based subtitle stream 220 according to an embodiment of the present invention.

Referring to Fig. 3, the text-based subtitle stream 220 according to an embodiment of the present invention includes a dialog style unit (DSU) 310 and a plurality of dialog presentation units (DPUs) 320 through 340. The DSU 310 and the DPUs 320 through 340 are also called dialog units. Each of the dialog units 310 through 340 forming the text-based subtitle stream 220 is recorded in the form of a packetized elementary stream (PES), or simply a PES packet 350. In addition, the PES of the text-based subtitle stream 220 is recorded and transmitted in units of transport packets (TPs) 362. A series of TPs is called a transport stream (TS).

However, as shown in Fig. 2, the text-based subtitle stream 220 according to an embodiment of the present invention is not multiplexed with the AV data stream 210 and is recorded on the storage medium 230 as a separate TS.

Referring again to Fig. 3, one dialog unit is recorded in each PES packet 350 included in the text-based subtitle stream 220. The text-based subtitle stream 220 includes a DSU 310 positioned at the front and a plurality of DPUs 320 through 340 following the DSU 310. The DSU 310 includes information specifying the output style of a dialog in the caption displayed on the screen on which the multimedia image is reproduced. Meanwhile, the plurality of DPUs 320 through 340 include text information items on the dialog contents to be displayed and information on their respective output times.

Fig. 4 illustrates the text-based subtitle stream 220 having the data structure shown in Fig. 3 according to an embodiment of the present invention.

Referring to Fig. 4, the text-based subtitle stream 220 includes a DSU 410 and a plurality of DPUs 420.

In an exemplary embodiment of the present invention, the number of DPUs is defined by num_of_dialog_presentation_units. However, the number of DPUs need not be specified separately. An example is the use of a while statement such as while (processed_length < end_of_file).
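The while-loop alternative can be illustrated with a toy container. The byte layout used here (a one-byte unit type followed by a two-byte big-endian payload length) is purely hypothetical and is not the PES syntax of the specification; only the loop condition mirrors the text above.

```python
import struct

# Hypothetical unit type tags for this sketch only.
DSU_TAG = 0x01
DPU_TAG = 0x02
HEADER = ">BH"  # 1-byte type, 2-byte big-endian payload length (assumed layout)
HEADER_SIZE = struct.calcsize(HEADER)


def pack_unit(tag: int, payload: bytes) -> bytes:
    """Serialize one dialog unit in the toy layout."""
    return struct.pack(HEADER, tag, len(payload)) + payload


def read_units(stream: bytes):
    """Read dialog units until the end of the stream, without a unit count."""
    units = []
    processed_length = 0
    end_of_file = len(stream)
    while processed_length < end_of_file:
        tag, size = struct.unpack_from(HEADER, stream, processed_length)
        processed_length += HEADER_SIZE
        units.append((tag, stream[processed_length:processed_length + size]))
        processed_length += size
    # The style unit is expected at the front of the stream.
    if not units or units[0][0] != DSU_TAG:
        raise ValueError("stream does not start with a dialog style unit")
    return units


stream = (
    pack_unit(DSU_TAG, b"style")
    + pack_unit(DPU_TAG, b"Hello.")
    + pack_unit(DPU_TAG, b"Goodbye.")
)
units = read_units(stream)
```

The reader never consults a unit count; it stops when the processed length reaches the end of the data.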

The data structures of the DSU and the DPU will now be explained in detail with reference to Fig. 5. Fig. 5 illustrates the dialog style unit shown in Fig. 3 according to an embodiment of the present invention.

Referring to Fig. 5, a set of dialog style information items, dialog_styleset() 510, is defined in the DSU 310; it collects the output style information items of the dialogs to be displayed as captions. The DSU 310 includes information on the position of the region in which a dialog is displayed in the caption, information required to render the dialog, information on styles that the user can control, and so on. The detailed contents of the data are explained below.

Fig. 6 is a diagram explaining an example data structure of the dialog style unit (DSU) according to an embodiment of the present invention.

Referring to Fig. 6, the DSU 310 includes a palette set 610 and a region style set 620. The palette set 610 is a set of a plurality of color palettes defining the colors to be used in the caption. The color combinations or color information, such as transparency, included in the palette set 610 can be applied to all of the DPUs positioned after the DSU.

The region style set 620 is a set of output style information items for the respective dialogs forming the caption. Each region style includes region information 622 indicating the position at which a dialog is displayed on the screen; text style information 624 indicating the output style to be applied to the text of each dialog; and a user changeable style set 626 indicating styles, applied to the text of each dialog, that the user can change arbitrarily.

Fig. 7 is a diagram explaining an example data structure of the dialog style unit according to another embodiment of the present invention.

Referring to Fig. 7, unlike in Fig. 6, the palette set 610 is not included. That is, the palette set is not defined in the DSU 310 but in the DPUs, which are explained with reference to Figs. 12A and 12B. The data structure of each region style 710 is the same as that described above with reference to Fig. 6.

Fig. 8 illustrates the dialog style unit shown in Fig. 6 or Fig. 7 according to an embodiment of the present invention.

Referring to Figs. 8 and 6, the DSU 310 includes a palette set 860 and 610 and a plurality of region styles 820 and 620. As described above, the palette set 610 is a set of a plurality of color palettes defining the colors to be used in the caption. The color combinations or color information, such as transparency, included in the palette set 610 can be applied to all of the DPUs positioned after the DSU.

Meanwhile, each region style 820 and 620 includes region information 830 and 622 indicating information on the window region in which the caption is to be displayed on the screen. The region information 830 and 622 includes information on the X and Y coordinates, the width, the height, the background color, and so on of the window region in which the caption is to be displayed on the screen.

In addition, each region style 820 and 620 includes text style information 840 and 624 indicating the output style to be applied to the text of each dialog. That is, it may include the X and Y coordinates of the position at which the text of the dialog is to be displayed within the window region described above; the output direction of the text, such as left to right or top to bottom; the alignment; the line spacing; the identifier of the font to be referenced; the font style, such as bold or italic; the font size; information on the font color; and so on.

Furthermore, each region style 820 and 620 may also include a user changeable style set 850 and 626 indicating styles that the user can change arbitrarily. The user changeable style set 850 and 626 is, however, optional. It may include change information on the position of the window region, the output position of the text, the font size, and the line spacing among the text output style information items 840 and 624. Each change information item can be expressed as a value that is increased or decreased relative to the information on the output style 840 and 624 to be applied to the text of each dialog.
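The relative nature of the user changeable style can be sketched as follows. The classes RegionInfo, TextStyle, and UserChange and their field names are hypothetical; they cover only the items named above (window position, text position, font size, and line spacing), and pixel units are assumed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RegionInfo:
    x: int
    y: int
    width: int
    height: int
    background_color: int


@dataclass(frozen=True)
class TextStyle:
    text_x: int
    text_y: int
    font_size: int
    line_space: int


@dataclass(frozen=True)
class UserChange:
    # Every field is a signed offset relative to the producer-defined value.
    region_dx: int = 0
    region_dy: int = 0
    text_dx: int = 0
    text_dy: int = 0
    font_size_delta: int = 0
    line_space_delta: int = 0


def apply_user_change(region: RegionInfo, text: TextStyle, change: UserChange):
    """Return new region and text styles with the user's offsets applied."""
    new_region = replace(
        region, x=region.x + change.region_dx, y=region.y + change.region_dy
    )
    new_text = replace(
        text,
        text_x=text.text_x + change.text_dx,
        text_y=text.text_y + change.text_dy,
        font_size=text.font_size + change.font_size_delta,
        line_space=text.line_space + change.line_space_delta,
    )
    return new_region, new_text


region = RegionInfo(x=100, y=800, width=1720, height=200, background_color=0)
text = TextStyle(text_x=10, text_y=10, font_size=32, line_space=40)
larger = UserChange(region_dy=-20, font_size_delta=4, line_space_delta=5)
new_region, new_text = apply_user_change(region, text, larger)
```

Because the change items are offsets rather than absolute values, the producer-defined styles remain the reference and are left unmodified.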

To summarize, there are three types of style-related information: the style information (region_style) 620 defined in the region styles 820 and 620; the in-line style information (inline_style) 1510, explained later, which emphasizes part of a caption; and the user changeable style information (user_changeable_style) 850. These information items are applied in the following order:

1) Basically, the region style information 620 defined in the region style is applied.

2) If there is in-line style information, the in-line style information 1510 is applied on top of the portion to which the region style information has been applied, and emphasizes part of the caption text.

3) If there is user changeable style information 850, this information is applied last. The presence of user changeable style information is optional.
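The three-step order above can be sketched for a single span of caption text. This is a simplification under stated assumptions: styles are plain dictionaries, the keys font_size, line_space, and font_style are hypothetical, the in-line style is treated as absolute overrides, and the user change is treated as relative offsets.

```python
def resolve_style(region_style, inline_style=None, user_change=None):
    """Layer the three kinds of style information for one span of text."""
    style = dict(region_style)            # 1) the region style is the base
    if inline_style:
        style.update(inline_style)        # 2) the in-line style overrides it for this span
    if user_change:
        for key, delta in user_change.items():
            style[key] = style[key] + delta   # 3) the user change is applied last
    return style


region_style = {"font_size": 32, "line_space": 40, "font_style": "normal"}

plain = resolve_style(region_style)
emphasized = resolve_style(
    region_style,
    inline_style={"font_style": "bold"},
    user_change={"font_size": 4},
)
```

Steps 2 and 3 are skipped when the corresponding information is absent, which matches the optional nature of both.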

Meanwhile, among the text style information items 840 and 624 to be applied to the text of each dialog, the font file information referenced by the font identifier (font_id) 842 can be defined as follows.

Fig. 9A illustrates an exemplary clip information file 910 including a plurality of font sets referenced by the font information 842 shown in Fig. 8 according to an embodiment of the present invention.

Referring to Figs. 9A, 8, 2, and 1, StreamCodingInfo() 930, a stream coding information structure included in the clip information file 910 and 110, contains information on the various streams recorded on the storage medium according to the present invention. That is, it contains information on the video stream 202, audio streams, presentation graphics streams, interactive graphics streams, text-based subtitle streams, and so on. In particular, for the text-based subtitle stream 220, information (textST_language_code) 932 on the language in which the caption is displayed can be included. In addition, the font name 936 and the file name 938 of the file storing the font information corresponding to font_id 842 and 934, the identifier of the font to be referenced shown in Fig. 8, can be defined. A method of finding the font file corresponding to the identifier of the font to be referenced, as defined here, is explained later with reference to Fig. 10.

Fig. 9B illustrates an exemplary clip information file 940 including a plurality of font sets referenced by the font information 842 shown in Fig. 8 according to another embodiment of the present invention.

Referring to Fig. 9B, a structure ClipInfo() can be defined in the clip information file 910 and 110. In this structure, a plurality of font sets referenced by the font information 842 shown in Fig. 8 can be defined. That is, a font file name 952 is specified that corresponds to font_id 842, the identifier of the font to be referenced shown in Fig. 8. A method of finding the font file corresponding to the font identifier defined here is explained next.

Figure 10 is a diagram showing the locations of the plurality of font files referenced by the font file names 938 and 952 shown in Fig. 9A and Fig. 9B.

Referring to Figure 10, a directory structure of the files related to a multimedia image recorded on a storage medium according to an embodiment of the invention is shown. In particular, by using this directory structure, the location of a font file such as 11111.font 1010 or 99999.font 1020, stored in the auxiliary data (AUXDATA) directory, can be found easily.
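A minimal sketch of such a lookup follows. It assumes that the clip information file has already been parsed into a table mapping each font_id to a font file name, and that the AUXDATA directory sits directly under a known root directory; the font_id values, the root path and the function name font_path are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical table built from the clip information file:
# font_id -> font file name (as in file name 938 or 952).
FONT_TABLE = {1: "11111.font", 2: "99999.font"}


def font_path(root, font_id, table=FONT_TABLE):
    """Return the assumed location of the font file for a given font_id."""
    # Assumption: AUXDATA lies directly under the given root directory.
    return PurePosixPath(root) / "AUXDATA" / table[font_id]


print(font_path("/disc", 1))
```

An unknown font_id raises KeyError here; a real player would more likely fall back to a default font.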

Meanwhile, the structure of the DPU that forms a dialog unit will now be explained in more detail with reference to Figure 11.

Figure 11 is a diagram explaining an exemplary data structure of the DPU 320 shown in Fig. 3, according to another embodiment of the invention.

Referring to Figure 11 and Fig. 3, the DPU 320, which contains text information about the dialog content to be output and information about the presentation time, includes: time information 1110 indicating the time at which the dialog is to be output on the screen; palette reference information 1120 specifying the palette to be referenced; and dialog region information 1130 for the dialog to be output on the screen. In particular, the dialog region information 1130 includes: style reference information 1132 specifying the output style to be applied to the dialog; and dialog text information 1134 specifying the text of the dialog actually output on the screen. In this case, it is assumed that the palette set specified by the palette reference information 1120 is defined in the DSU (see 610 in Fig. 6).

Meanwhile, Figure 12A is a diagram explaining an exemplary data structure of the DPU 320 shown in Fig. 3, according to an embodiment of the invention.

Referring to Figure 12A and Fig. 3, the DPU 320 includes: time information 1210 indicating the time at which the dialog is to be output on the screen; a palette set 1220 defining a set of palettes; and dialog region information 1230 for the dialog to be output on the screen. In this case, the palette set 1220 is not defined in the DSU, but is defined directly in the DPU 320.

Meanwhile, Figure 12B is a diagram explaining an exemplary data structure of the DPU 320 shown in Fig. 3, according to an embodiment of the invention.

Referring to Figure 12B, the DPU 320 includes: time information 1250 indicating the time at which the dialog is to be output on the screen; a color update flag 1260; a palette set 1270 used when the color update flag is set to 1; and dialog region information 1280 for the dialog to be output on the screen. In this case, the palette set 1270 is defined in the DSU, as in Figure 11, and is also stored in the DPU 320. Specifically, in order to express a fade-in/fade-out through continuous reproduction, a palette set 1270 for expressing the fade-in/fade-out is defined in the DPU 320 in addition to the basic palette set defined in the DSU, and the color update flag 1260 can be set to 1. This is explained in more detail with reference to Figure 19.

Figure 13 illustrates the DPU 320 shown in Figures 11 to 12B, according to an embodiment of the invention.

Referring to Figure 13, Figure 11, Figure 12A and Figure 12B, the DPU includes dialog start time information (dialog_start_PTS) and dialog end time information (dialog_end_PTS) 1310 as the time information 1110 indicating the time at which the dialog is to be output on the screen. In addition, a dialog palette identifier (dialog_palette_id) is included as the palette reference information 1120. In the case of Figure 12A, the palette set 1220 can be included instead of the palette reference information 1120. Dialog text information (region_subtitle) 1334 is included as the dialog region information 1230 for the dialog to be output, and a region style identifier (region_style_id) 1332 can be included to specify the output style to be applied to it. The example shown in Figure 13 is only one embodiment of the DPU, and a DPU having the data structures shown in Figures 11 to 12B can be implemented with modifications in a variety of ways.
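The fields named above can be gathered into a small data structure. The following Python sketch is illustrative only: the field names are taken from the description, but the types, the optional fields used to cover the variants of Figures 11, 12A and 12B in one class, and the four-component palette entries are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DialogRegion:
    region_style_id: int          # selects a region style defined in the DSU
    region_subtitle: str          # dialog text, possibly with inline styles


@dataclass
class DPU:
    dialog_start_PTS: int         # time at which the dialog appears
    dialog_end_PTS: int           # time at which reproduction of the DPU ends
    region: DialogRegion
    # Figure 11 variant: reference to a palette defined in the DSU.
    dialog_palette_id: Optional[int] = None
    # Figure 12A/12B variants: palette set carried in the DPU itself
    # (entry layout is an assumption).
    palette_set: Optional[List[Tuple[int, int, int, int]]] = None
    # Figure 12B variant: 1 means "use palette_set instead of the DSU palette".
    color_update_flag: int = 0


dpu = DPU(
    dialog_start_PTS=90_000,
    dialog_end_PTS=180_000,
    region=DialogRegion(region_style_id=0, region_subtitle="Hello"),
    dialog_palette_id=3,
)
print(dpu.dialog_end_PTS - dpu.dialog_start_PTS)
```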

Figure 14 is a diagram explaining the data structure of the dialog text information (region_subtitle) shown in Figure 13.

Referring to Figure 14, the dialog text information 1134 shown in Figure 11, the dialog text information 1234 shown in Figure 12A, the dialog text information 1284 shown in Figure 12B and the dialog text information 1334 shown in Figure 13 each include inline style information 1410, as an output style emphasizing part of the dialog, and dialog text 1420.

Figure 15 illustrates the dialog text information 1334 shown in Figure 13, according to an embodiment of the invention. As shown in Figure 15, the dialog text information 1334 is implemented by inline style information (inline_style) 1510 and dialog text (text_string) 1520. In addition, information indicating the end of an inline style is preferably included in the embodiment shown in Figure 15. Unless the end of an inline style is defined, an inline style, once specified, would continue to be applied thereafter, contrary to the producer's intention.

Meanwhile, Figure 16 is a diagram explaining the constraints on reproducing consecutive DPUs continuously.

Referring to Figure 16 and Figure 13, when the plurality of DPUs described above need to be reproduced continuously, the following constraints apply.

1) The dialog start time information (dialog_start_PTS) 1310 defined in a DPU indicates the time at which the dialog object begins to be output on the graphics plane (GP), which is explained later with reference to Figure 17.

2) The dialog end time information (dialog_end_PTS) 1310 defined in a DPU indicates the time at which the text-based subtitle decoder that processes the text-based subtitles is reset. The text-based subtitle decoder is explained later with reference to Figure 17.

3) When the plurality of DPUs described above need to be reproduced continuously, the dialog end time information (dialog_end_PTS) of the current DPU should be identical to the dialog start time information (dialog_start_PTS) of the DPU to be reproduced next. That is, in Figure 16, in order to reproduce DPU #2 and DPU #3 continuously, the dialog end time information included in DPU #2 should be identical to the dialog start time information included in DPU #3.
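Constraint 3) amounts to a simple equality test between neighbouring DPUs. The sketch below, with hypothetical PTS values and helper names, groups a list of DPUs into runs that would be reproduced continuously, that is, without a decoder reset between their members.

```python
def is_continuous(current, nxt):
    """True if nxt is to be reproduced continuously after current."""
    return current["dialog_end_PTS"] == nxt["dialog_start_PTS"]


def continuous_runs(dpus):
    """Group DPUs (in presentation order) into continuously reproduced runs."""
    runs = []
    for dpu in dpus:
        if runs and is_continuous(runs[-1][-1], dpu):
            runs[-1].append(dpu)     # same run: no reset in between
        else:
            runs.append([dpu])       # gap: a new run starts here
    return runs


# Hypothetical timing: only DPU #2 and DPU #3 share a boundary.
dpus = [
    {"id": 1, "dialog_start_PTS": 0, "dialog_end_PTS": 100},
    {"id": 2, "dialog_start_PTS": 200, "dialog_end_PTS": 300},
    {"id": 3, "dialog_start_PTS": 300, "dialog_end_PTS": 400},
    {"id": 4, "dialog_start_PTS": 500, "dialog_end_PTS": 600},
]
print([[d["id"] for d in run] for run in continuous_runs(dpus)])
```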

Meanwhile, a DSU according to the present invention preferably satisfies the following constraints.

1) The text-based subtitle stream 220 includes one DSU.

2) The number of user-changeable style information items (user_control_style) included in every region style (region_style) should be identical.

Meanwhile, a DPU according to the present invention preferably satisfies the following constraint.

1) A window region for at least two captions should be defined.

The structure of an exemplary reproducing apparatus based on the data structure of the text-based subtitle stream 220 recorded on a storage medium, according to an embodiment of the invention, will now be explained with reference to Figure 17.

Figure 17 is a diagram explaining the structure of an exemplary reproducing apparatus for reproducing a text-based subtitle stream according to an embodiment of the invention.

Referring to Figure 17, the reproducing apparatus 1700, also called a playback device, includes: a buffer unit comprising a font preloading buffer (FPB) 1712 for storing font files and a subtitle preloading buffer (SPB) 1710 for storing text-based subtitle files; and a text-based subtitle decoder 1730, which decodes and reproduces the text-based subtitle stream previously recorded on the storage medium and outputs it through a graphics plane (GP) 1750 and a color look-up table (CLUT) 1760.

In particular, the buffer unit includes the subtitle preloading buffer (SPB) 1710, into which the text-based subtitle data stream 220 is preloaded, and the font preloading buffer (FPB) 1712, into which font information is preloaded.

The subtitle decoder 1730 includes a text subtitle processor 1732, a dialog composition buffer (DCB) 1734, a dialog buffer (DB) 1736, a text subtitle renderer 1738, a dialog presentation controller 1740 and a bitmap object buffer (BOB) 1742.

The text subtitle processor 1732 receives the text-based subtitle data stream 220 from the text subtitle preloading buffer (SPB) 1710, transfers the style-related information included in the DSU and the dialog output time information included in the DPU, both described above, to the dialog composition buffer (DCB) 1734, and transfers the dialog text information included in the DPU to the dialog buffer (DB) 1736.

The presentation controller 1740 controls the text renderer 1738 by using the style-related information held in the dialog composition buffer (DCB) 1734 and, by using the dialog output time information, controls the time at which the bitmap image rendered in the bitmap object buffer (BOB) 1742 is output to the graphics plane (GP) 1750.

Under the control of the presentation controller 1740, the text subtitle renderer 1738 converts the dialog text information into a bitmap image, that is, performs rendering, by applying to the dialog text information the font information item that corresponds to the dialog text information stored in the dialog buffer (DB) 1736, selected from among the font information items preloaded in the font preloading buffer (FPB) 1712. The rendered bitmap image is stored in the bitmap object buffer (BOB) 1742 and, under the control of the presentation controller 1740, is output to the graphics plane (GP) 1750. At this time, the color specified in the DSU is applied by referring to the color look-up table (CLUT) 1760.

As the style-related information to be applied to the dialog text, the information defined in the DSU by the producer can be used, and style-related information predefined by the user can also be applied. As shown in Figure 17, the reproducing apparatus 1700 applies the style information defined by the user in preference to the style-related information defined by the producer.

As described with reference to Fig. 8, as the style-related information to be applied to the dialog text, the region style information (region_style) defined in the DSU by the producer is basically applied, and if an inline style is included in the DPU containing the dialog text to which the region style information is applied, the inline style information (inline_style) is applied to the corresponding part. In addition, if the producer additionally defines user-changeable styles in the DSU and the user selects one of them, the region style and/or inline style is applied first and the user-changeable style is applied last. Furthermore, as described with reference to Figure 15, information indicating the end of the application of an inline style is preferably included in the content of the inline style.

Moreover, the producer can specify whether to permit or prohibit the use of style-related information defined in the reproducing apparatus itself, separately from the style-related information defined by the producer and recorded on the storage medium.

Figure 18 is a diagram explaining the process of preloading the text-based subtitle data stream 220 in a reproducing apparatus such as the reproducing apparatus 1700 shown in Figure 17, according to an embodiment of the invention.

Referring to Figure 18, the text-based subtitle data stream 220 shown in Fig. 2 is defined in a subpath of the playlist described above. In the subpath, a plurality of text-based subtitle data streams 220 supporting a plurality of languages can be defined. In addition, the font files to be applied to the text-based subtitles can be defined in the clip information file 910 or 940 described above with reference to Fig. 9A and Fig. 9B. Up to 255 text-based subtitle data streams 220 that can be included in one storage medium can be defined in each playlist. Likewise, up to 255 font files that can be included in one storage medium can be defined. However, in order to guarantee seamless presentation, the size of a text-based subtitle data stream 220 should be less than or equal to the size of the preloading buffer 1710 of the reproducing apparatus 1700 shown in Figure 17.
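These limits lend themselves to a simple pre-flight check. The sketch below is an assumption about how an authoring tool might validate a playlist; the function name, its arguments and the buffer size used in the example are hypothetical, and only the limit of 255 and the size comparison come from the description.

```python
MAX_SUBTITLE_STREAMS = 255   # text-based subtitle streams per playlist
MAX_FONT_FILES = 255         # font files that can be defined


def can_preload(stream_sizes, font_count, spb_size):
    """Check the preloading limits for one playlist.

    stream_sizes: size in bytes of each text-based subtitle stream
    font_count:   number of font files defined
    spb_size:     size in bytes of the subtitle preloading buffer (SPB)
    """
    if len(stream_sizes) > MAX_SUBTITLE_STREAMS:
        return False
    if font_count > MAX_FONT_FILES:
        return False
    # Each stream must fit in the preloading buffer for seamless presentation.
    return all(size <= spb_size for size in stream_sizes)


print(can_preload([1000, 2000], font_count=3, spb_size=4096))
```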

Figure 19 is a diagram explaining the process of reproducing a DPU in a reproducing apparatus according to the present invention.

Referring to Figure 19, Figure 13 and Figure 17, the process of reproducing a DPU is shown. The presentation controller 1740 shown in Figure 17 controls the time at which the rendered dialog is output to the graphics plane (GP) 1750 by using the dialog start time information (dialog_start_PTS) and the dialog end time information (dialog_end_PTS), which specify the output time 1310 of the dialog included in the DPU. Here, the dialog start time information specifies the time at which the transfer of the rendered dialog bitmap image, stored in the bitmap object buffer (BOB) 1742 included in the text-based subtitle decoder 1730, to the graphics plane (GP) 1750 is completed. That is, at the dialog start time defined in the DPU, the bitmap information required to construct the dialog should already have been transferred to the graphics plane (GP) 1750 and be ready for use. In addition, the dialog end time information specifies the time at which reproduction of the DPU ends. At this time, both the subtitle decoder 1730 and the graphics plane (GP) 1750 are reset. Preferably, a buffer in the subtitle decoder 1730, such as the bitmap object buffer (BOB) 1742, is reset between the start time and the end time of a DPU, regardless of continuous reproduction.

However, when continuous reproduction of a plurality of DPUs is required, the subtitle decoder 1730 and the graphics plane 1750 are not reset, and the contents stored in each buffer, such as the dialog composition buffer (DCB) 1734, the dialog buffer (DB) 1736 and the bitmap object buffer (BOB) 1742, should be retained. That is, when the dialog end time information of the DPU currently being reproduced is identical to the dialog start time information of the DPU to be reproduced next, the contents of each buffer are retained rather than reset.

In particular, the fade-in/fade-out effect is one example of applying continuous reproduction of a plurality of DPUs. The fade-in/fade-out effect can be implemented by changing the color look-up table (CLUT) 1760 of the bitmap object transferred to the graphics plane (GP) 1750. That is, a first DPU includes composition information such as color, style and output time, and the plurality of DPUs that follow it have the same composition information as the first DPU but update only the palette information. In this case, the fade-in/fade-out effect can be implemented by gradually changing the transparency in the color items from 0% to 100%.

In particular, when the data structure of the DPU shown in Figure 12B is used, the fade-in/fade-out effect can be implemented efficiently by using the color update flag 1260. That is, if the dialog presentation controller 1740 checks the color update flag 1260 included in the DPU and determines that it is set to '0', which is the general case in which no fade-in/fade-out effect is needed, the color information included in the DSU shown in Fig. 6 is basically used. However, if the presentation controller 1740 determines that the color update flag 1260 is set to '1', that is, if the fade-in/fade-out effect is needed, the effect is implemented by using the color information 1270 included in the DPU instead of the color information 610 included in the DSU shown in Fig. 6. In this case, the fade-in/fade-out effect can be implemented simply by adjusting the transparency of the color information 1270 included in the DPU.
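The palette selection and the transparency ramp can be sketched as follows. The representation of a palette as a list of dictionaries with a transparency percentage is an assumption made for illustration, as are the function names; the description only specifies that the DPU palette is used when the color update flag is 1 and that transparency changes gradually between 0% and 100%.

```python
def active_palette(dsu_palette, dpu):
    """Choose the palette for a DPU according to its color update flag."""
    if dpu.get("color_update_flag", 0) == 1:
        return dpu["palette_set"]    # fade in/out: palette carried in the DPU
    return dsu_palette               # general case: palette defined in the DSU


def fade_steps(base_palette, steps):
    """Build one palette per consecutive DPU, ramping transparency 0% -> 100%."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    palettes = []
    for i in range(steps):
        transparency = round(100 * i / (steps - 1))
        palettes.append([dict(entry, transparency=transparency)
                         for entry in base_palette])
    return palettes


base = [{"color": "white", "transparency": 0}]
for palette in fade_steps(base, 5):
    print(palette[0]["transparency"])
```

Each palette produced by fade_steps would be stored in one of the consecutive DPUs with its color update flag set to 1; reversing the list gives the opposite direction of the effect.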

Accordingly, after the fade-in/fade-out effect has been displayed, the color look-up table (CLUT) 1760 is preferably updated to the original color information included in the DSU. Otherwise, the color information, once specified, would continue to be applied thereafter, contrary to the producer's intention.

Figure 20 is a diagram explaining the process in which a text-based subtitle stream is synchronized with moving picture data and output in a reproducing apparatus, according to an embodiment of the invention.

Referring to Figure 20, the dialog start time information and the dialog end time information included in a DPU of the text-based subtitle data stream 220 should be defined as time points on the global time axis used in the playlist, so that they are synchronized with the output time of the AV data stream 210 of the multimedia image. Accordingly, a discontinuity between the system time clock (STC) of the AV data stream and the dialog output time (PTS) of the text-based subtitle data stream 220 can be prevented.

Figure 21 is a diagram explaining the process in which a text-based subtitle data stream is output to the screen in a reproducing apparatus, according to an embodiment of the invention.

Referring to Figure 21, a process is shown in which dialog text information 2104 is converted into a bitmap image 2106 by applying rendering information 2102 that includes style-related information, and the converted bitmap image is output to the corresponding location on the graphics plane (GP) 1750 based on the output location information (such as region_horizontal_position and region_vertical_position) included in composition information 2108.

The rendering information 2102 represents style information such as the region width, height, foreground color, background color, text type, font name, font style and font size. As described above, the rendering information 2102 is defined in the region style set in the DSU. Meanwhile, the composition information 2108 indicates the start time and end time of the presentation, the horizontal and vertical location information of the window region in which the caption is output on the graphics plane (GP) 1750, and so on. This information is defined in the DPU.

Figure 22 is a diagram explaining the process of rendering the text-based subtitle data stream 220 in the reproducing apparatus 1700 shown in Figure 17, according to an embodiment of the invention.

Referring to Figure 22, Figure 21 and Fig. 8, the window region specified by region_horizontal_position, region_vertical_position, region_width and region_height, which are the location information 830 of the window region for a caption defined in the DSU, is designated as the region in which the caption is displayed on the graphics plane (GP) 1750. The bitmap image of the rendered dialog is displayed starting from the point specified by text_horizontal_position and text_vertical_position, which are the output location 840 of the dialog within the window region.
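The placement described above can be illustrated with a short calculation. This sketch assumes that text_horizontal_position and text_vertical_position are offsets measured from the top-left corner of the window region, and that the graphics plane is 1920 by 1080 pixels; neither assumption is stated in the description, and the helper names are hypothetical.

```python
def text_origin(region, text_pos):
    """Absolute start point of the rendered dialog on the graphics plane.

    Assumes the text position is an offset from the region's top-left corner.
    """
    x = region["region_horizontal_position"] + text_pos["text_horizontal_position"]
    y = region["region_vertical_position"] + text_pos["text_vertical_position"]
    return x, y


def region_fits(region, plane_width, plane_height):
    """True if the whole window region lies inside the graphics plane."""
    right = region["region_horizontal_position"] + region["region_width"]
    bottom = region["region_vertical_position"] + region["region_height"]
    return (region["region_horizontal_position"] >= 0
            and region["region_vertical_position"] >= 0
            and right <= plane_width
            and bottom <= plane_height)


region = {
    "region_horizontal_position": 100,
    "region_vertical_position": 800,
    "region_width": 1720,
    "region_height": 200,
}
text_pos = {"text_horizontal_position": 20, "text_vertical_position": 10}

print(text_origin(region, text_pos), region_fits(region, 1920, 1080))
```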

Meanwhile, the reproducing apparatus according to the present invention stores the style information selected by the user (style_id) in a system register area. Figure 23 illustrates an exemplary status register provided in a reproducing apparatus for reproducing a text-based subtitle data stream, according to an embodiment of the invention.

Referring to Figure 23, the status registers (player status registers, hereinafter referred to as PSRs) store the style information selected by the user (selected style 2310) in the 12th register. Accordingly, for example, even after the reproducing apparatus 1700 shown in Figure 17 performs a menu call or another operation, if the user presses a style information change button, the style information previously selected by the user can be applied first by referring to PSR 12. The register that stores this information can be changed.

A reproducing method based on the storage medium on which the text-based subtitle data stream 220 is recorded and on the structure of the reproducing apparatus that reproduces the subtitle data stream 220, both described above, will now be explained with reference to Figure 24. Figure 24 is a flowchart of the operations of a method of reproducing the text-based subtitle data stream 220 according to an embodiment of the invention.

In operation 2410, the text-based subtitle data stream 220 containing DSU information and DPU information is read from a storage medium, for example the storage medium 230 shown in Fig. 2, and in operation 2420, the caption text included in the DPU information is converted into a bitmap image based on the rendering information included in the DSU information. In operation 2430, the converted bitmap image is output to the screen according to the time information and the location information that form the composition information included in the DPU information.

As described above, the present invention advantageously provides a storage medium that stores a text-based subtitle data stream separately from image data, and a reproducing apparatus and a reproducing method for reproducing this text-based subtitle data stream, so that producing and editing subtitle data becomes simpler. In addition, captions can be provided in a plurality of languages regardless of the number of subtitle data items.

In addition, because the subtitle data is formed of one style information item (DSU) and a plurality of presentation information items (DPUs), the output style to be applied to the entire presentation data can be defined in advance and changed in a variety of ways, and an inline style that emphasizes part of a caption and a user-changeable style can also be defined.

Moreover, by using a plurality of adjacent presentation information items, continuous reproduction of captions becomes possible, and fade-in/fade-out and other effects can easily be implemented in this way.

Exemplary embodiments of the present invention can also be written as computer programs and can be implemented in general-purpose digital computers that execute the programs using a computer-readable recording medium. Examples of computer-readable media include magnetic storage media (for example, ROM, floppy disks, hard disks and the like), optical recording media (for example, CD-ROMs, DVDs and the like) and storage media such as carrier waves (for example, transmission over the Internet). The computer-readable medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion.

Although exemplary embodiments of the present invention have been illustrated and described, those skilled in the art will understand that, as technology develops, various changes and modifications can be made, and equivalents can be substituted for elements thereof, without departing from the spirit and scope of the present invention. Many modifications can be made to adapt the teachings of the present invention to a particular situation without departing from its scope. For example, any computer-readable medium or data storage device can be used, as long as the text-based subtitle data and the AV data are recorded separately on it. In addition, the text-based subtitle data can also be configured differently from what is shown in Fig. 3 or Fig. 4. Moreover, the reproducing apparatus shown in Figure 17 can also be implemented as part of a recording apparatus, or alternatively as a single apparatus that performs recording and/or reproducing functions on a storage medium. Similarly, the CPU can be implemented as a chipset having firmware, or alternatively as a general-purpose or special-purpose computer programmed to perform the method described, for example, with reference to Figure 24. Accordingly, it is intended that the present invention not be limited to the exemplary embodiments disclosed, but that it include all embodiments falling within the scope of the appended claims.

Industrial applicability

The present invention applies to a storage medium on which a text-based subtitle stream is recorded, and to a reproducing apparatus and method for reproducing the text-based subtitle data recorded on this storage medium.

The present invention advantageously provides a storage medium that stores a text-based subtitle data stream separately from image data, and a reproducing apparatus and a reproducing method for reproducing this text-based subtitle data stream, so that producing subtitle data and editing the produced subtitle data become simpler. In addition, captions can be provided in a plurality of languages regardless of the number of subtitle data items.

Claims

1. A method of reproducing data from a storage medium storing image data and text-based subtitle data, to display a dialog on an image based on the image data, the method comprising:

decoding the image data;

receiving the text-based subtitle data comprising a dialog presentation unit and a dialog style unit, and converting text for the dialog contained in the dialog presentation unit into a bitmap image based on the dialog style unit; and

outputting the converted text for the dialog in synchronization with the decoded image data,

wherein the dialog presentation unit includes the text for the dialog and output time information indicating the time at which the dialog is to be output on the screen, and the dialog style unit includes text style information specifying the output style to be applied to the text of the corresponding dialog and a palette set defining a group of a plurality of color palettes to be applied to the text of the corresponding dialog.

2. The method of claim 1, wherein the dialog presentation unit and the dialog style unit are formed in units of packetized elementary streams, and are parsed and processed in units of packetized elementary streams.

3. The method of claim 1, wherein, if a plurality of subtitle data items supporting a plurality of languages are recorded on the storage medium, selection information about a desired language is received from a user, and the subtitle data item corresponding to the selection information among the plurality of subtitle data items is reproduced.