CN1936903A - Data processing method and storage medium, and program for causing computer to execute the data processing method - Google Patents

Data processing method and storage medium, and program for causing computer to execute the data processing method

Info

Publication number
CN1936903A
CN1936903A, CN 200610141614, CN200610141614A, CN100474308C
Authority
CN
China
Prior art keywords
data
section
media content
scene
relevant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200610141614
Other languages
Chinese (zh)
Other versions
CN100474308C (en)
Inventor
宗续敏彦
荣藤稔
荒木昭一
江村恒一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Intellectual Property Corp of America
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd filed Critical Matsushita Electric Industrial Co Ltd
Publication of CN1936903A publication Critical patent/CN1936903A/en
Application granted granted Critical
Publication of CN100474308C publication Critical patent/CN100474308C/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A context of media content is represented by context description data having a hierarchical structure. The context description data has a highest hierarchical layer, a lowest hierarchical layer, and other hierarchical layers. The highest hierarchical layer is formed from a single element representing the content. The lowest hierarchical layer is formed from elements representing segments of the media content, each of which corresponds to a change between scenes of video data or a change in audible tones. In the selection step of a data processing method, one or a plurality of scenes of the media content is or are selected on the basis of the scores in the context description data. Further, in the extraction step of the data processing method, only data pertaining to the scenes selected in the selection step are extracted.

Description

Data processing method, storage medium, and program for causing a computer to execute the data processing method
This application is a divisional of the following patent application. Application number: 01111292.1; filing date: March 16, 2001; title of invention: Data processing method, storage medium, and program for causing a computer to execute the data processing method.
Technical field
The present invention relates to a data processing method for media content, a storage medium, a program, and a summary generating apparatus and method, all of which concern the viewing, playback, and transmission of continuous audio-visual data (media content) such as moving pictures, video programs, or audio programs, in which a summary of the highlight scenes of the media content, or only the scenes desired by the viewer, is played back and transmitted.
Background Art
Conventionally, media content has been played back, transmitted, or stored on the basis of a single file storing the media content.
Japanese Unexamined Patent Publication No. Hei-10-111872 describes a method of extracting specific scenes from a moving picture by detecting changes between the scenes of the moving picture (hereinafter referred to as "scene cuts"). Additional data, such as the time code of the start frame, the time code of the end frame, and keywords for the scene, is attached to each scene cut.
As an alternative approach, Carnegie Mellon University (CMU) has attempted to summarize a moving picture through scene-cut detection, detection of human faces and captions, and indexing of key phrases detected by speech recognition [Michael A. Smith and Takeo Kanade, "Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques," CMU-CS-97-111, February 3, 1997].
When a moving picture is played back on a per-file basis, it is impossible to review a summary of the moving picture. Moreover, even when extracting a highlight scene or the scenes desired by the user, the scene or scenes must be searched for from the beginning of the media content. Furthermore, when a moving picture is transmitted, the entire data set of the file must be transmitted, which requires a long transmission time.
According to the method described in Japanese Unexamined Patent Publication No. Hei-10-111872, scenes can be extracted by means of a keyword, which helps in extracting the scenes desired by the user. However, the additional data does not include the relationships or connections between the scenes. For this reason, the method encounters difficulty in extracting, for example, a sub-plot of a story. Moreover, when scenes are extracted solely on the basis of a keyword, the user has difficulty obtaining the awareness of scene context that is essential for understanding the scenes. Preparing a summary or a set of highlight scenes therefore becomes very difficult.
The method developed by CMU can summarize a moving picture, but the summarization yields only a single, fixed-pattern summary. For this reason, it is difficult to summarize a moving picture into summaries of different playback times, for example a three-minute summary and a five-minute summary. It is also difficult to summarize a moving picture in the manner desired by the user, for example by selecting only scenes containing a specific character.
Summary of the invention
An object of the present invention is to provide a method capable of selecting, playing back, and transmitting only a summary, highlight scenes, or the scenes desired by the viewer from the media content.
Another object of the present invention is to provide a summary generating apparatus and method.
Another object of the present invention is to provide a method capable of playing back a summary, highlight scenes, or the scenes desired by the viewer within a time period desired by the user, at the time the summary, highlight scenes, or desired scenes are selected.
A further object of the present invention is to provide a method of transmitting, at the user's request and within a time period desired by the user, only the summary, the set of highlight scenes, or the scenes desired by the user when the media content is transmitted.
A further object of the present invention is to provide a method of controlling the amount of data to be transmitted in accordance with the traffic on the communication line established with the server.
To solve the problems of the prior art, according to one aspect of the present invention, there is provided a data processing method comprising the steps of: inputting context description data described in a hierarchical structure, wherein the hierarchical structure comprises a highest hierarchical layer formed from a single element representing the context of time-varying media content; a lowest hierarchical layer formed from elements representing media segments obtained by dividing the media content, to each of which time information and a score relating to the respective media segment are attached as attributes; and other hierarchical layers comprising elements which are directly or indirectly related to at least one media segment and which represent a scene or a group of scenes; and selecting at least one segment from the media content on the basis of the scores assigned to the context description data.
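As an illustrative sketch only (not the claimed implementation; the class, field names, and scores below are invented for this example), the hierarchical context description data and the score-based selection step might look like this in Python:

```python
# Hypothetical sketch: the root element represents the whole content
# (highest hierarchical layer), intermediate elements represent scenes,
# and leaf elements represent media segments (lowest hierarchical layer)
# carrying time information and a score as attributes.

from dataclasses import dataclass, field

@dataclass
class Element:
    score: int = 0                    # contextual importance of this element
    start: float = 0.0                # segment start time (leaves only)
    end: float = 0.0                  # segment end time (leaves only)
    children: list = field(default_factory=list)

def select_segments(node, threshold):
    """Collect (start, end) of every leaf segment whose score meets the
    threshold; intermediate layers are traversed recursively."""
    if not node.children:             # lowest hierarchical layer: a segment
        return [(node.start, node.end)] if node.score >= threshold else []
    selected = []
    for child in node.children:       # other layers: scenes / groups of scenes
        selected += select_segments(child, threshold)
    return selected

# Content -> scenes -> segments, with scores assigned as attributes.
content = Element(children=[
    Element(children=[Element(score=5, start=0.0, end=10.0),
                      Element(score=2, start=10.0, end=20.0)]),
    Element(children=[Element(score=4, start=20.0, end=30.0)]),
])
print(select_segments(content, threshold=4))  # [(0.0, 10.0), (20.0, 30.0)]
```

The extraction step would then cut only the returned time spans out of the media content.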
The data processing method preferably further comprises an extraction step of extracting, from the media content, only the data corresponding to the segments selected in the selection step.
The data processing method preferably further comprises a playback step of playing back, from the media content, only the data corresponding to the segments selected in the selection step.
The score preferably represents the contextual importance of the media content.
The score preferably represents the contextual importance of a scene of interest from the viewpoint of a keyword, and in the selection step a scene is selected by using the score from at least one viewpoint.
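A minimal sketch of such viewpoint-based selection, assuming an invented per-keyword score layout (the keywords, spans, and scores are not from the patent):

```python
# Each segment carries a score per viewpoint keyword; selection evaluates
# the score from one chosen viewpoint only.

segments = [
    {"span": (0, 30),  "scores": {"goal": 5, "playerA": 1}},
    {"span": (30, 60), "scores": {"goal": 1, "playerA": 4}},
    {"span": (60, 90), "scores": {"goal": 3, "playerA": 5}},
]

def select_by_viewpoint(segments, keyword, threshold):
    # Keep only the segments that are important from the given viewpoint.
    return [s["span"] for s in segments
            if s["scores"].get(keyword, 0) >= threshold]

print(select_by_viewpoint(segments, "playerA", 4))  # [(30, 60), (60, 90)]
```

With such scores, the same content yields different selections depending on which keyword (for example, a character or an event) the user designates.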
The media content preferably corresponds to video data or audio data.
The media content preferably corresponds to data comprising mutually synchronized video data and audio data.
The context description data preferably describes the structure of the video data or the audio data.
The context description data preferably describes the structure of each of the video data set and the audio data set.
In the selection step, a scene is preferably selected by reference to the context description data relating to the video data or the audio data.
The selection step preferably includes a video selection step of selecting a scene of the video data by reference to the context description data of the video data, or an audio selection step of selecting a scene of the audio data by reference to the context description data of the audio data.
The selection step preferably includes a video selection step of selecting a scene of the video data by reference to the context description data of the video data, and an audio selection step of selecting a scene of the audio data by reference to the context description data of the audio data.
The data to be extracted in the extraction step preferably correspond to video data or audio data.
The data to be extracted in the extraction step preferably correspond to data comprising mutually synchronized video data and audio data.
The media content preferably includes a plurality of different media data sets within a single time period. Moreover, the data processing method preferably includes a determination step of receiving structure description data in which the data structure of the media content is described, and determining, on the basis of a condition for determining data as a selection target, which of the media data sets is to be taken as the selection target. Moreover, in the selection step, data are selected only from the data sets determined as the selection target in the determination step, by reference to the structure description data.
The data processing method preferably further comprises a determination step of receiving structure description data in which the data structure of the media content is described, and determining, on the basis of a condition for determining data as a selection target, whether only the video data, only the audio data, or both the video data and the audio data are to be taken as the selection target. Moreover, in the selection step, data are selected only from the data sets determined as the selection target in the determination step, by reference to the structure description data.
The media content preferably includes a plurality of different media data sets within a single time period. Preferably, in the determination step, structure description data in which the data structure of the media content is described are received, and it is determined which of the video data sets and/or the audio data sets is to be taken as the selection target. Moreover, in the selection step, data are selected only from the data sets determined as the selection target in the determination step, by reference to the structure description data.
Representative data relating to the respective media segments are preferably attached as attributes to the elements of the context description data in the lowest hierarchical layer, and in the selection step, the entire data relating to a media segment and/or the representative data relating to the respective media segment are selected.
The entire data relating to a media segment preferably correspond to media data; the media content preferably includes a plurality of different media data sets within a single time period; and the data processing method preferably further comprises a determination step of receiving structure description data in which the data structure of the media content is described, and determining which of the media data sets and/or the representative data is to be taken as the selection target. Moreover, in the selection step, data are selected only from the data sets determined as the selection target in the determination step, by reference to the structure description data.
The data processing method preferably includes a determination step of receiving structure description data in which the data structure of the media content is described, and determining, on the basis of a condition for determining data as a selection target, whether only the entire data relating to the respective media segments, only the representative data relating to the respective media segments, or both the entire data and the representative data relating to the respective media segments are to be taken as the selection target. Moreover, in the selection step, data are selected only from the data sets determined as the selection target in the determination step, by reference to the structure description data.
The determination condition preferably includes at least one of the capability of the receiving terminal, the traffic on the transmission line, the user's request, and the user's preference, or a combination thereof.
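The determination step can be sketched as follows; the thresholds, argument names, and fallback policy are assumptions for illustration, not part of the claims:

```python
# Hypothetical determining step: given the receiving terminal's
# capabilities and the current line traffic, decide which media data
# sets are taken as the selection target.

def determine_target(can_display_video, can_play_audio, traffic_ratio):
    """Return the media data sets to use as the selection target.
    traffic_ratio: line congestion, 0.0 (idle) .. 1.0 (saturated)."""
    if can_display_video and traffic_ratio < 0.5:
        targets = ["video", "audio"] if can_play_audio else ["video"]
    elif can_play_audio:
        targets = ["audio"]           # congested line: audio only
    else:
        targets = ["representative"]  # fall back to representative data
    return targets

print(determine_target(True, True, 0.2))   # ['video', 'audio']
print(determine_target(True, True, 0.8))   # ['audio']
```

The selection step then considers only the data sets returned here, which is what shortens the selection time mentioned below.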
The data processing method preferably further comprises a formation step of forming a data stream of the media content from the data extracted in the extraction step.
The data processing method preferably further comprises a transmission step of transmitting, over a line, the data stream formed in the formation step.
The data processing method preferably further comprises a recording step of recording the data stream formed in the formation step onto a data recording medium.
The data processing method preferably further comprises a recording-medium management step of reorganizing the media content that has already been recorded and/or the media content to be newly recorded, in accordance with the available disk space of the data recording medium.
The data processing method preferably further comprises a stored-content management step of reorganizing the media content stored on the data recording medium in accordance with the storage period of the media content.
According to a further aspect of the present invention, there is provided a computer-readable recording medium on which the foregoing data processing method is recorded in the form of a program executable by a computer.
According to a further aspect of the present invention, there is provided a program for causing a computer to execute the foregoing data processing method.
In the data processing method, recording medium, and program of the present invention, the selection means (corresponding to the selection step) selects at least one segment from the media content on the basis of the scores attached as attributes to the lowest hierarchical layer or the other hierarchical layers of the context description data, by using the context description data obtained by the input means (corresponding to the input step), which has the highest hierarchical layer, the lowest hierarchical layer, and the other hierarchical layers.
The extraction means (corresponding to the extraction step) preferably extracts only the data relating to the segments selected by the selection means (corresponding to the selection step).
The playback means (corresponding to the playback step) preferably plays back only the data relating to the segments selected by the selection means (corresponding to the selection step).
Accordingly, the more important scenes can be selected arbitrarily from the media content, and the important segments thus selected can be extracted or played back. Moreover, since the context description data is hierarchical, comprising the highest hierarchical layer, the lowest hierarchical layer, and the other hierarchical layers, scenes can be selected in arbitrary units, for example on a per-chapter or per-section basis. Various forms of selection become possible, such as selecting certain chapters and deleting unnecessary paragraphs from those chapters.
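Selection in arbitrary units, for example on a per-chapter basis, can be sketched as follows (the structure, spans, and scores are invented for this example):

```python
# A score on an intermediate layer (a chapter) lets all of its sections
# be kept or dropped together, rather than scoring leaves individually.

chapters = [
    {"score": 5, "sections": [{"span": (0, 60)}, {"span": (60, 120)}]},
    {"score": 2, "sections": [{"span": (120, 180)}]},
]

def select_chapters(chapters, threshold):
    # Keep every section of each chapter whose chapter-level score passes.
    spans = []
    for ch in chapters:
        if ch["score"] >= threshold:
            spans += [sec["span"] for sec in ch["sections"]]
    return spans

print(select_chapters(chapters, 3))  # [(0, 60), (60, 120)]
```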
In the data processing method, recording medium, and program of the present invention, a score represents the contextual importance of the media content. By setting this score so as to select important scenes, a collection of the important scenes of, for example, a program can be prepared easily.
Moreover, by setting the score to represent importance from the viewpoint of a keyword determined for a scene of interest, a plurality of segments can be selected with great flexibility. For example, once a keyword is determined from a specific viewpoint, such as a character or an event, only the scenes desired by the user can be selected.
In the data processing method, recording medium, and program of the present invention, the media content corresponds to video data and/or audio data, and the context description data describes the structure of each of the video data sets and/or audio data sets. The video selection means (corresponding to the video selection step) selects a scene by reference to the context description data relating to the video data. The audio selection means (corresponding to the audio selection step) selects a scene by reference to the context description data relating to the audio data.
Moreover, the extraction means (corresponding to the extraction step) extracts video data and/or audio data.
An important segment can thus be selected from the video data and/or the audio data, and the video data and/or audio data relating to the segment thus selected can be extracted.
In the data processing method, recording medium, and program of the present invention, when the media content includes a plurality of different data sets within a single time period, the determination means (corresponding to the determination step) determines, on the basis of the condition for determining data as the selection target, which media data set is to be taken as the selection target. The selection means (corresponding to the selection step) selects data only from the data sets determined by the determination means (corresponding to the determination step).
The determination condition includes at least one of the capability of the receiving terminal, the traffic on the transmission line, the user's request, and the user's preference, or a combination thereof. For example, the capability of the receiving terminal corresponds to its video display capability, its audio playback capability, or its decompression speed for compressed data. The traffic on the transmission line corresponds to the degree of congestion of the line.
When the media content is divided into, for example, a plurality of channels and a plurality of layers, and different media data sets are assigned to the channels and layers, the determination means (corresponding to the determination step) can determine the media data relating to the most suitable segment in accordance with the determination condition. Accordingly, the selection means (corresponding to the selection step) can select an appropriate amount of media data. As an example of using a plurality of channels and layers, video data of standard resolution may be assigned to channel-1/layer-1 for transmitting moving pictures, and video data of high resolution may be assigned to channel-1/layer-2. Likewise, stereo data may be assigned to channel-1 for transmitting audio data, and monaural data may be assigned to channel-2.
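A hedged sketch of how the determination means might map a determination condition (here reduced to an available-bitrate budget) onto such a channel/layer table; the table values and the selection rule are illustrative assumptions:

```python
# Invented channel/layer table mirroring the example above:
# standard-resolution and high-resolution video, stereo and mono audio.

STREAMS = [
    {"channel": 1, "layer": 1,    "kind": "video", "bitrate": 1500},  # standard res
    {"channel": 1, "layer": 2,    "kind": "video", "bitrate": 6000},  # high res
    {"channel": 1, "layer": None, "kind": "audio", "bitrate": 256},   # stereo
    {"channel": 2, "layer": None, "kind": "audio", "bitrate": 128},   # mono
]

def best_stream(kind, budget_kbps):
    # Pick the richest stream of the given kind that fits the budget.
    fitting = [s for s in STREAMS
               if s["kind"] == kind and s["bitrate"] <= budget_kbps]
    return max(fitting, key=lambda s: s["bitrate"]) if fitting else None

print(best_stream("video", 2000)["layer"])   # 1   (standard resolution fits)
print(best_stream("audio", 200)["bitrate"])  # 128 (mono only)
```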
In the data processing method, recording medium, and program of the present invention, the determination means (corresponding to the determination step) determines, on the basis of the determination condition, whether only the video data, only the audio data, or both the video data and the audio data are to be taken as the selection target.
Before the selection means (corresponding to the selection step) selects a segment, the determination means (corresponding to the determination step) determines which media data set is to be taken as the selection target, or whether only the video data, only the audio data, or both the video data and the audio data are to be taken as the selection target. As a result, the time required for the selection means (corresponding to the selection step) to select a segment can be shortened.
In the data processing method, recording medium, and program of the present invention, representative data are attached as attributes to the elements of the context description data in the lowest hierarchical layer, and the selection means selects the entire data relating to a media segment and/or the representative data relating to the respective media segment.
In the data processing method, recording medium, and program of the present invention, the entire data relating to a media segment correspond to media data, and the media content includes a plurality of different media data sets within a single time period. The determination means (corresponding to the determination step) determines, on the basis of the structure description data and the determination condition, which of the media data sets and/or the representative data is to be taken as the selection target.
The media content is divided into, for example, a plurality of channels and a plurality of layers, and different media data sets are assigned to the channels and layers. The determination means can determine the media data relating to the most suitable segment (channel or layer) in accordance with these determination conditions.
In the data processing method, recording medium, and program of the present invention, the determination means (corresponding to the determination step) determines, on the basis of the determination condition, whether only the entire data relating to the respective media segments, only the representative data relating to the respective media segments, or both the entire data and the representative data relating to the respective media segments are to be taken as the selection target.
Before the selection means (corresponding to the selection step) selects a segment, the determination means (corresponding to the determination step) determines which media data set is to be taken as the selection target, or whether only the entire data, only the representative data, or both the entire data and the representative data are to be taken as the selection target. As a result, the time required for the selection means (corresponding to the selection step) to select a segment can be shortened.
In the data processing method, recording medium, and program of the present invention, the formation means (corresponding to the formation step) forms a data stream of the media content from the data extracted by the extraction means (corresponding to the extraction step). Accordingly, a data stream or file describing the content of the segments thus selected can be prepared.
In the data processing method, recording medium, and program of the present invention, the transmission means (corresponding to the transmission step) transmits, over a line, the data stream formed by the formation means (corresponding to the formation step). Accordingly, only the data relating to the important segments can be transmitted to the user.
In the data processing method, recording medium, and program of the present invention, the recording-medium management means (corresponding to the recording-medium management step) reorganizes the media content stored so far and/or the media content to be newly stored, in accordance with the available disk space of the recording medium. In particular, the stored-content management means (corresponding to the stored-content management step) reorganizes the media content stored on the recording medium in accordance with the storage period of the content. Accordingly, a large amount of media content can be stored on the recording medium.
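One way to picture the recording-medium management step (the sizes, ages, and the reduce-to-digest policy are assumptions for illustration, not the claimed method):

```python
# Hypothetical reorganization: when stored content exceeds capacity,
# the oldest recordings are reduced to a digest of their selected
# highlight scenes (here modeled as keeping 20% of each item's size).

def reorganize(stored, capacity):
    """stored: list of dicts with 'size' and 'age' (days).
    Shrinks the oldest items until the total fits; returns the new total."""
    total = sum(item["size"] for item in stored)
    for item in sorted(stored, key=lambda i: -i["age"]):
        if total <= capacity:
            break
        freed = item["size"] * 0.8   # keep a 20% digest of old content
        item["size"] -= freed
        total -= freed
    return total

stored = [{"size": 100, "age": 30}, {"size": 100, "age": 1}]
print(reorganize(stored, 150))  # 120.0
```

Because the context description data already identifies the important segments, such a digest can be produced without re-analyzing the content.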
The summary generating apparatus of the present invention comprises: input means for inputting description data, the description data having a data structure part describing a plurality of segments representing the scenes of media content composed of a plurality of scenes, and an attribute part including, as attribute information of the media content, time information describing the division of the scenes, a viewpoint represented by at least one keyword expressing the scene content, a score representing the contextual importance of each segment based on the viewpoint, and link information representing linkage with at least one related segment; selection means for selecting segments from the data structure part based on the score and the time information in the attribute part; a content input section for inputting the corresponding media content; and an extraction unit for extracting, from the media content, the data relating to the time information of the selected segments.
The summary generating method of the present invention comprises the steps of: an input step of inputting description data having both a data structure part and an attribute part, the data structure part describing a plurality of segments representing the scenes of media content composed of a plurality of scenes, and the attribute part including, as attribute information of the media content, time information describing the division of the scenes, a viewpoint represented by at least one keyword expressing the scene content, a score representing the contextual importance of each segment based on the viewpoint, and link information representing linkage with at least one related segment; a selection step of selecting segments from the data structure part based on the score and the time information in the attribute part; a content input step of inputting the corresponding media content; and an extraction step of extracting, from the media content, the data relating to the time information of the selected segments.
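Putting the steps together, a minimal end-to-end sketch of the summary generating method, using an invented XML vocabulary in place of the actual description data format described in the embodiments:

```python
# Parse description data (invented <contents>/<segment> vocabulary),
# select segments by score, and return the time spans to be cut from
# the media content by the extraction step.

import xml.etree.ElementTree as ET

DESCRIPTION = """
<contents>
  <segment start="0" end="10" score="5"/>
  <segment start="10" end="25" score="1"/>
  <segment start="25" end="40" score="4"/>
</contents>
"""

def generate_summary(xml_text, threshold):
    root = ET.fromstring(xml_text)
    return [(float(s.get("start")), float(s.get("end")))
            for s in root.iter("segment")
            if int(s.get("score")) >= threshold]

print(generate_summary(DESCRIPTION, 4))  # [(0.0, 10.0), (25.0, 40.0)]
```

Raising or lowering the threshold yields summaries of different playback times from the same description data, which is the flexibility the fixed-pattern CMU summary lacks.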
Description of drawings
The block diagram of Fig. 1 shows the data processing method according to first embodiment of the invention;
Fig. 2 shows the structure according to the description data of described first embodiment;
Fig. 3 shows according to described first embodiment and uses XML to describe the part of example of the DTD (Document Type Definition) (DTD) of description data in computing machine, and a part of using the example of the description data that DTD describes according to described first embodiment;
Fig. 4-9 shows the part that continues of the description data of example shown in Figure 3;
Figure 10 shows the part by the example of the XML file that forms to the additional representative data of description data shown in Fig. 3-9, and a part that is used for describing at computing machine the example of DTD description data, that describe with Extensible Markup Language (XML);
Figure 11-21 shows the part that continues of description data shown in Figure 10;
Figure 22 is used to describe the method for specifying significance level according to described first embodiment;
The process flow diagram of Figure 23 shows according to first embodiment and the relevant processing of described selection step;
The block diagram of Figure 24 shows the formation according to the extraction step of first embodiment;
The flow process of Figure 25 show according to first embodiment in described extraction step by the processing of going multiplex machine to carry out;
The flow process of Figure 26 shows the processing of being carried out by the video clipping device according to first embodiment in described extraction step;
Figure 27 shows the structure of MPEG-1 video data stream;
Fig. 28 is a flowchart showing the processing performed by the audio clipping device in the extraction step according to the first embodiment;
Fig. 29 shows the structure of an AAU of an MPEG-1 audio stream;
Fig. 30 is a block diagram showing an application of the media processing method according to the first embodiment;
Fig. 31 shows processing of significance levels according to a second embodiment of the invention;
Fig. 32 is a flowchart showing the processing relevant to the selection step according to the second embodiment;
Fig. 33 is a flowchart showing the processing relevant to the selection step according to a third embodiment of the invention;
Fig. 34 is a diagram for describing a method of assigning significance levels according to a fourth embodiment of the invention;
Fig. 35 is a flowchart showing the processing relevant to the selection step according to the fourth embodiment;
Fig. 36 is a block diagram showing a media processing method according to a fifth embodiment of the invention;
Fig. 37 shows the structure of structure description data according to the fifth embodiment;
Fig. 38 shows the structure of description data according to the fifth embodiment;
Fig. 39 shows part of an example of a document type definition (DTD) for describing structure description data in XML in a computer according to the fifth embodiment, and an example of an XML file according to the fifth embodiment;
Fig. 40 shows part of an example of a DTD for describing the description data in XML in a computer according to the fifth embodiment, and the first half of an example XML file according to the fifth embodiment;
Figs. 41-45 show continuations of the description data shown in Fig. 40;
Fig. 46 shows an example of output of the selection step according to the fifth embodiment;
Fig. 47 is a block diagram showing the extraction step according to the fifth embodiment;
Fig. 48 is a flowchart showing the processing performed by an interface device in the extraction step according to the fifth embodiment;
Fig. 49 shows an example of results produced when the interface device provided in the extraction step changes the output of the selection step according to the fifth embodiment;
Fig. 50 is a flowchart showing the processing performed by the demultiplexer in the extraction step according to the fifth embodiment;
Fig. 51 is a flowchart showing the processing performed by the video clipping device in the extraction step according to the fifth embodiment;
Fig. 52 is a flowchart showing the processing performed by the audio clipping device in the extraction step according to the fifth embodiment;
Fig. 53 is another flowchart showing the processing performed by the video clipping device in the extraction step according to the fifth embodiment;
Fig. 54 is a block diagram showing a data processing method according to a sixth embodiment of the invention;
Fig. 55 is a block diagram showing a formation step and a transfer step according to the sixth embodiment;
Fig. 56 is a block diagram showing a media processing method according to a seventh embodiment of the invention;
Fig. 57 shows the structure of description data according to the fifth embodiment;
Fig. 58 shows part of an example of a DTD for describing description data in XML in a computer according to the seventh embodiment, and part of an example of description data described in XML according to the seventh embodiment;
Figs. 59-66 show continuations of the description data shown in Fig. 58;
Fig. 67 shows part of an example of an XML file formed by appending representative data to the description data shown in Figs. 58-66, and part of an example of a DTD for describing the description data in XML in a computer;
Figs. 68-80 show continuations of the description data shown in Fig. 67;
Fig. 81 is a flowchart showing the processing relevant to the selection step according to the seventh embodiment;
Fig. 82 is a block diagram showing an application of the media processing method according to the seventh embodiment;
Fig. 83 is a flowchart showing the processing relevant to the selection step according to an eighth embodiment of the invention;
Fig. 84 is a flowchart showing the processing relevant to the selection step according to a ninth embodiment of the invention;
Fig. 85 is a flowchart showing the processing relevant to the selection step according to a tenth embodiment of the invention;
Fig. 86 is a block diagram showing a data processing method according to a twelfth embodiment of the invention;
Fig. 87 shows the structure of description data according to the twelfth embodiment of the invention;
Fig. 88 shows part of an example of a DTD for describing description data in XML in a computer according to the fifth embodiment, and part of an example XML file according to the fifth embodiment;
Figs. 89-96 show continuations of the description data shown in Fig. 88;
Fig. 97 is a block diagram showing a data processing method according to a thirteenth embodiment of the invention;
Fig. 98 is a block diagram showing a data processing method according to a fourteenth embodiment of the invention;
Fig. 99 is a block diagram showing a data processing method according to a fifteenth embodiment of the invention;
Fig. 100 is a block diagram showing a data processing method according to a sixteenth embodiment of the invention;
Fig. 101 is a block diagram showing a data processing method according to a seventeenth embodiment of the invention;
Fig. 102 shows a plurality of channels and a plurality of layers;
Fig. 103 shows part of an example of a DTD for describing structure description data in XML, and part of an example of structure description data described in the DTD;
Fig. 104 shows a continuation of the structure description data shown in Fig. 103;
Fig. 105 is a flowchart showing the processing relevant to the determination step in example 1 according to the seventeenth embodiment of the invention;
Fig. 106 is a flowchart showing the determination processing performed in response to a user request in the determination step of example 1 according to the seventeenth embodiment;
Fig. 107 is a flowchart showing the determination processing relevant to video data in the determination step of example 1 according to the seventeenth embodiment;
Fig. 108 is a flowchart showing the processing relevant to audio data in the determination step of example 1 according to the seventeenth embodiment;
Fig. 109 is a flowchart showing the first half of the processing relevant to the determination step in example 2 according to the seventeenth embodiment of the invention;
Fig. 110 is a flowchart showing the second half of the processing relevant to the determination step in example 2 according to the seventeenth embodiment of the invention;
Fig. 111 is a flowchart showing the processing relevant to the determination step in example 3 according to the seventeenth embodiment of the invention;
Fig. 112 is a flowchart showing the determination processing relevant to video data in the determination step of example 3 according to the seventeenth embodiment;
Fig. 113 is a flowchart showing the determination processing relevant to audio data in the determination step of example 3 according to the seventeenth embodiment;
Fig. 114 is a flowchart showing the first half of the processing relevant to the determination step in example 4 according to the seventeenth embodiment of the invention;
Fig. 115 is a flowchart showing the second half of the processing relevant to the determination step in example 4 according to the seventeenth embodiment of the invention;
Fig. 116 is a flowchart showing the determination processing performed in response to a user request in the determination step of example 4 according to the seventeenth embodiment;
Fig. 117 is a flowchart showing the determination processing relevant to video data in the determination step of example 4 according to the seventeenth embodiment;
Fig. 118 is a flowchart showing the determination processing relevant to audio data in the determination step of example 4 according to the seventeenth embodiment;
Fig. 119 is a flowchart showing the first half of the processing relevant to the determination step in example 5 according to the seventeenth embodiment;
Fig. 120 is a flowchart showing the second half of the processing relevant to the determination step in example 5 according to the seventeenth embodiment;
Fig. 121 is a flowchart showing the determination processing performed in response to a user request in the determination step of example 5 according to the seventeenth embodiment;
Fig. 122 is a block diagram showing a data processing method according to an eighteenth embodiment of the invention;
Fig. 123 is a block diagram showing a data processing method according to a nineteenth embodiment of the invention;
Fig. 124 is a block diagram showing a data processing method according to a twentieth embodiment of the invention;
Fig. 125 is a block diagram showing a data processing method according to a twenty-first embodiment of the invention;
Fig. 126 is a block diagram showing a data processing method according to a twenty-second embodiment of the invention;
Fig. 127 shows an example of a DTD into which the context description data and the structure description data are merged, and an example of an XML file;
Figs. 128-132 show continuations of the XML file shown in Fig. 127;
Fig. 133 shows the structure of description data according to an eleventh embodiment of the invention;
Fig. 134 shows a viewpoint (point of view) used in the eleventh embodiment;
Fig. 135 shows significance levels according to the eleventh embodiment;
Fig. 136 shows an example of a DTD for describing the description data of the eleventh embodiment by use of XML for expressing description data in a computer, and an example of part of the description data described in XML;
Figs. 137-163 show continuations of the description data shown in Fig. 136;
Fig. 164 shows another example of a DTD for describing the description data of the eleventh embodiment by use of XML for expressing context description data in a computer, and an example of part of the description data described in XML;
Figs. 165-196 show continuations of the description data shown in Fig. 164;
Fig. 197 shows another structure of description data according to the eleventh embodiment of the invention;
Fig. 198 shows an example of a DTD for describing the description data of the eleventh embodiment (corresponding to Fig. 197) by use of XML for expressing the description data in a computer, and an example of part of the description data described in XML;
Figs. 199-222 show continuations of the description data shown in Fig. 164;
Fig. 223 shows an example of a DTD for describing the description data of the eleventh embodiment (corresponding to Fig. 197) by use of XML for expressing the description data in a computer, and an example of part of the description data described in XML;
Figs. 224-252 show continuations of the description data shown in Fig. 164;
Fig. 253 shows links of viewpoints in a program expressed by the description data;
Figs. 254-256 show links between a viewpoint table and viewpoints in a program expressed by the description data;
Figs. 257-260 show the data structure of description data formed from a data structure part and an attribute part;
Figs. 261-263 show the data structure of second description data formed by a first embodiment of the description data conversion method;
Fig. 264 shows an example of original context description data and of <section> and <keyword, priority>;
Fig. 265 shows the data structure of second description data formed by converting the original context description data shown in Fig. 264 according to the first embodiment of the description data conversion method;
Fig. 266 shows the data structure of second description data formed by a second embodiment of the description data conversion method;
Fig. 267 shows the data structure of second description data formed by converting the original context description data shown in Fig. 264 according to the second embodiment of the description data conversion method;
Fig. 268 shows the data structure of second description data formed by a third embodiment of the description data conversion method;
Fig. 269 shows another data structure of second description data formed by the third embodiment of the description data conversion method; and
Fig. 270 shows the data structure of second description data formed by converting the original context description data shown in Fig. 264 according to the third embodiment of the description data conversion method.
Embodiments
Embodiments of the present invention are described below with reference to the accompanying drawings.
First Embodiment
A first embodiment of the present invention will now be described. In this embodiment, a moving picture of an MPEG-1 system stream is used as the media content. In this case, a media segment corresponds to a single scene cut, and a score represents the objective degree of contextual importance of the scene of interest.
Fig. 1 is a block diagram showing a data processing method according to the first embodiment of the present invention. In Fig. 1, reference numeral 101 designates a selection step, and reference numeral 102 designates an extraction step. In the selection step 101, a scene of the media content is selected from the description data, and the start time and end time of the scene are output. In the extraction step 102, data pertaining to the segment of the media content specified by the start time and end time output in the selection step 101 are extracted.
Fig. 2 shows the structure of the description data according to the first embodiment. In this embodiment, the context is described in a tree structure. Elements in the tree structure are arranged from left to right in chronological order. In Fig. 2, the root of the tree, designated <contents>, represents the content of a single program, and the title of the content is assigned to the root as an attribute.
A child level of <contents> is designated by use of the element <section>. A priority representing the significance level of the context of the scene of interest is appended to the element <section> as an attribute. The significance level assumes an integer value from 1 to 5, where 1 represents the lowest significance level and 5 represents the highest.
A child level of <section> is designated by use of <section> or <segment>. Here, an element <section> can serve as a child <section> of another <section>. However, a single element <section> cannot have a mixture of child <section> and child <segment> elements.
An element <segment> represents a single scene cut, and the priority assigned to it is identical to the priority assigned to its parent <section>. The attributes appended to <segment> are "start," representing the start time, and "end," representing the end time. Scenes can be cut by use of commercially available software or software available over a network; alternatively, scenes may be cut manually. Although in the present embodiment time information is expressed in terms of the start time and end time of a scene cut, a similar result can be achieved when time information is expressed in terms of the start time of the scene of interest and the duration of that scene. In this case, the end time of the scene of interest is obtained by adding the duration to the start time.
In the case of, for example, a movie, the chapters, sections, and paragraphs of the story can be described on the basis of the description data by use of the elements <section> in multiple hierarchical layers. In another example, when a baseball game is described, the elements <section> in the highest layer can be used to describe innings, and their child <section> elements can be used to describe half-innings. Further, the second-generation <section> elements can be used to describe the at-bat of each player, and the third-generation <section> elements can be used to describe each pitch, the interval between pitches, and the batting result.
Description data having this structure can be expressed in a computer by use of, for example, Extensible Markup Language (XML). XML is a data description language whose standardization is pursued by the World Wide Web Consortium; version 1.0 was recommended on February 10, 1998. The specification of XML version 1.0 is available from http://www.w3.org/TR/1998/REC-xml-19980210. Figs. 3 to 9 show an example of a DTD for describing the description data of the present invention in XML and an example of description data described by use of the DTD. Figs. 10 to 19 show an example of description data prepared by appending, to the description data shown in Figs. 3 to 9, representative data of a media segment, such as a representative image (video data) and keywords (audio data), and an example of a DTD for describing that description data in XML.
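As a concrete illustration of the tree structure described above, the following sketch parses a small, hypothetical fragment of description data with Python's standard xml.etree.ElementTree module. The element and attribute names (<contents>, <section>, <segment>, priority, start, end) follow the structure of this embodiment, but the fragment itself and its values are invented for illustration and are not taken from the figures.

```python
import xml.etree.ElementTree as ET

# A hypothetical fragment of description data: a <contents> root, nested
# <section> elements carrying a "priority" attribute, and leaf <segment>
# elements carrying "start" and "end" times.
DESCRIPTION_DATA = """
<contents title="baseball game">
  <section priority="4">
    <section priority="5">
      <segment start="00:00:00" end="00:01:30"/>
      <segment start="00:01:30" end="00:02:10"/>
    </section>
    <section priority="2">
      <segment start="00:02:10" end="00:05:00"/>
    </section>
  </section>
</contents>
"""

root = ET.fromstring(DESCRIPTION_DATA)
# Collect the priority of every <section> that has child <segment> elements,
# i.e. the elements on which the selection step of this embodiment operates.
leaf_priorities = [int(s.get("priority"))
                   for s in root.iter("section")
                   if s.find("segment") is not None]
print(leaf_priorities)  # [5, 2]
```

Note that `find("segment")` examines only direct children, so a <section> whose children are all <section> elements is excluded, matching the rule that a single <section> cannot mix child <section> and child <segment> elements.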
Processing relevant to the selection step 101 will now be described. The processing relevant to the selection step 101 depends in particular on the form of the description data and on the method of assigning a score to the context of each scene. In the present embodiment, the processing relevant to the selection step 101 is performed only on elements <section> having child <segment> elements, as shown in Fig. 22 (steps S1, S4, and S5 in Fig. 23). An element <section> whose priority exceeds a certain threshold is selected (step S2 in Fig. 23), and the start time and end time of the element <section> thus selected are output (step S3 in Fig. 23). The priority assigned to an element <section> having child <segment> elements corresponds to a significance level shared among all such elements <section> in the content, i.e., among the elements <section> each of which has child <segment> elements. Specifically, in Fig. 22, the significance level shared among the elements <section> enclosed by the dotted line is set as the priority. The priorities assigned to elements <section> and <segment> other than the foregoing elements <section> can be set arbitrarily. Accordingly, the significance level does not need to assume a unique value, and the same significance level can be assigned to different elements. Fig. 23 is a flowchart showing the processing relevant to the selection step 101 according to the first embodiment. With regard to an element <section> thus selected, the start time and end time of the scene represented by that element <section> can be determined from the child elements <segment> of the selected element <section>. The start time and end time thus determined are output.
Although in the present embodiment the selection is performed on elements <section> each of which has child <segment> elements, the selection may also be performed on elements <segment>. In this case, the priority corresponds to a significance level shared among all the elements <segment> in the content. Further, the selection may also be performed on elements <section> of the same hierarchical layer chosen from among the elements <section> in layers higher than those having child <segment> elements. Specifically, the selection can be performed on elements <section> located at the same number of steps along a path counted from the parent <contents> or from a child <segment>.
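The selection logic described above can be sketched as follows, under the same assumptions: only <section> elements having child <segment> elements are examined, their priority attribute is compared with a threshold, and the start and end times of the scene are taken from the first and last child <segment>. The XML fragment and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical description data; times are given in seconds for simplicity.
DESCRIPTION_DATA = """
<contents title="example">
  <section priority="3">
    <section priority="5">
      <segment start="0" end="90"/>
      <segment start="90" end="130"/>
    </section>
    <section priority="2">
      <segment start="130" end="300"/>
    </section>
  </section>
</contents>
"""

def select_scenes(xml_text, threshold):
    """Select every <section> with child <segment>s whose priority exceeds
    the threshold, and output (start, end) for the scene it represents."""
    root = ET.fromstring(xml_text)
    selected = []
    for sec in root.iter("section"):
        segments = sec.findall("segment")
        if segments and int(sec.get("priority")) > threshold:
            # start of the first segment, end of the last segment
            selected.append((int(segments[0].get("start")),
                             int(segments[-1].get("end"))))
    return selected

print(select_scenes(DESCRIPTION_DATA, threshold=4))  # [(0, 130)]
```

Lowering the threshold to 1 would additionally select the second inner <section>, yielding [(0, 130), (130, 300)].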
The processing relevant to the extraction step 102 will now be described with reference to Fig. 24. Fig. 24 is a block diagram showing the extraction step 102 according to the first embodiment. As shown in Fig. 24, the extraction step 102 according to the first embodiment is implemented by a demultiplexer 601, a video clipping device 602, and an audio clipping device 603. In the present embodiment, an MPEG-1 system stream is taken as the media content. An MPEG-1 system stream is formed by multiplexing a video stream and an audio stream into a single stream. The demultiplexer 601 separates the video stream and the audio stream from the multiplexed system stream. The video clipping device 602 receives the video stream thus separated and a segment selected in the selection step 101, and outputs from the received video stream only the data pertaining to the segment thus selected. The audio clipping device 603 receives the separated audio stream and the segment selected in the selection step 101, and outputs from the received audio stream only the data pertaining to the selected segment.
Processing performed by the demultiplexer 601 will now be described with reference to the drawings. Fig. 25 is a flowchart showing the processing performed by the demultiplexer 601. The method of multiplexing an MPEG-1 system stream is standardized by the international standard ISO/IEC IS 11172-1. The video and audio streams are multiplexed on a per-packet basis by dividing them into streams of suitable length, called packets, and appending additional information, such as a header, to each packet. A plurality of video streams and a plurality of audio streams can also be multiplexed into a single signal in the same manner. In the header of each packet there are described a stream ID, used for identifying the packet as belonging to a video stream or an audio stream, and a time stamp, used for synchronizing the video data with the audio data. The stream ID is not limited to identifying a packet as belonging to a video stream or an audio stream. When a plurality of video streams are multiplexed, the stream ID can be used to identify, from among the plurality of video streams, the video stream to which the packet of interest belongs. Similarly, when a plurality of audio streams are multiplexed, the stream ID can be used to identify, from among the plurality of audio streams, the audio stream to which the packet of interest belongs. In the MPEG-1 system, a plurality of packets are processed into a single pack, and a multiplexing rate serving as the reference time for synchronized playback is appended to the pack as a header together with additional information. Further, additional information relating to the number of multiplexed video and audio streams is appended to the first pack as a system header. The demultiplexer 601 reads the number of multiplexed video and audio streams from the system header of the first pack (S1 and S2) and reserves data areas for storing the data of each stream (S3 and S4). Then, the demultiplexer 601 checks the stream ID of each packet and writes the data included in the packet into the data area reserved for the stream identified by the stream ID (S5 and S6). The above processing is performed on all the packets (S8, S9, and S10). After all the packets have been processed, each video stream is output on a per-stream basis to the video clipping device 602, and each audio stream is output in the same manner to the audio clipping device 603 (S11).
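The packet-dispatching loop of steps S5 and S6 can be sketched as follows. This is a highly simplified model, with packets given directly as (stream ID, payload) pairs; pack headers, the system header, and time stamps are omitted entirely, so it illustrates only the per-stream grouping, not real MPEG-1 parsing.

```python
def demultiplex(packets):
    """Concatenate packet payloads per stream ID (steps S5-S6 of Fig. 25)."""
    streams = {}
    for stream_id, payload in packets:
        streams.setdefault(stream_id, bytearray()).extend(payload)
    return {sid: bytes(data) for sid, data in streams.items()}

# In MPEG-1, stream IDs 0xE0-0xEF denote video and 0xC0-0xDF denote audio.
packets = [(0xE0, b"vid0"), (0xC0, b"aud0"), (0xE0, b"vid1")]
print(demultiplex(packets))  # {224: b'vid0vid1', 192: b'aud0'}
```

In the patent's flow, the resulting per-stream buffers correspond to the data areas reserved in steps S3 and S4, which are then handed to the video and audio clipping devices in step S11.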
The operation of the video clipping device 602 will now be described. Fig. 26 is a flowchart showing the processing performed by the video clipping device 602. An MPEG-1 video stream is standardized by the international standard ISO/IEC IS 11172-2. As shown in Fig. 27, the video stream comprises a sequence layer, a group-of-pictures (GOP) layer, a picture layer, a slice layer, a macroblock layer, and a block layer. Random access is performed on the basis of the GOP layer, which is the minimum unit for random access, and each layer included in the picture layer corresponds to a single frame. The video clipping device 602 processes data on a per-GOP basis. A counter C used for counting the number of output frames is initialized to 0 (S3). First, the video clipping device 602 determines that the header of the video stream corresponds to the header of the sequence layer (S2 and S4) and stores the data included in the header (S5). The video clipping device then outputs the data. The header of the sequence layer may appear again during subsequent processing. Unless the values relate to a quantization matrix, the values of the header are not allowed to change. Therefore, when a sequence header is input, the values of the input header are compared with the values of the stored header (S8 and S14). If the input header differs from the stored header in any value other than those relating to the quantization matrix, the input header is deemed erroneous (S15). Next, the video clipping device 602 detects the header of the GOP layer in the input data (S9). In the header of the GOP layer there are described data relating to a time code (S10), which describes the time period that has elapsed since the start of the sequence header. The video clipping device 602 compares the time code with the segment output in the selection step 101 (S11). If the time code is determined not to be included in the segment, the video clipping device 602 discards all the data sets appearing before the next GOP layer or sequence layer. On the contrary, if the time code is included in the selected segment, the video clipping device 602 outputs all the data sets appearing before the next GOP layer or sequence layer (S13). To ensure continuity between the data sets already output and the data set currently being output, the time code of the GOP layer must be changed (S12). The value to which the time code of the GOP layer is to be changed is calculated by use of the counter C, which holds the number of frames already output. According to equation 1, the display time Tv of the leading frame of the GOP layer currently being output is calculated from the counter C and the picture rate Pr, which is described in the sequence header and represents the number of frames to be displayed per second.
Tv = C / Pr        (1)
Tv assumes a value in units of seconds; the value of Tv is then converted in accordance with the time code format of MPEG-1. The value thus converted is set in the time code of the GOP layer to be output at this time. When the data pertaining to the GOP layer are output, the number of picture layers output is added to the value of the counter C. The foregoing processing is repeated until the video stream ends (S7 and S16). In the case where the demultiplexer 601 outputs a plurality of video streams, each of the video streams is subjected to the above processing.
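The time-code recalculation of equation (1) can be sketched as follows. The conversion into an hours/minutes/seconds/frames tuple is an assumption made for illustration; the actual MPEG-1 time-code field also carries a drop-frame flag and fixed bit widths, which are ignored here.

```python
def gop_time_code(frames_output, picture_rate):
    """Compute the rewritten GOP time code from the output-frame counter C
    and the picture rate Pr (equation (1): Tv = C / Pr, in seconds)."""
    tv = frames_output / picture_rate
    total = int(tv)                        # whole seconds elapsed
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    frames = frames_output - total * picture_rate  # frames past the last second
    return hours, minutes, seconds, frames

print(gop_time_code(frames_output=4530, picture_rate=30))  # (0, 2, 31, 0)
```

Because C holds only the number of frames already output, the rewritten time codes run contiguously even when non-selected GOPs have been discarded, which is exactly the continuity requirement of step S12.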
The processing of the audio clipping device 603 will now be described. Fig. 28 is a flowchart relating to the processing performed by the audio clipping device 603. An MPEG-1 audio stream is standardized by the international standard ISO/IEC IS 11172-3. The audio stream is formed from a series of frames called audio access units (AAUs). Fig. 29 shows the structure of an AAU. The AAU is the minimum unit for which audio data can be decoded independently, and it comprises a given number of sampled data sets Sn. The playback time of a single AAU can be calculated from the bit rate br representing the transfer rate, the sampling frequency Fs, and the number of bits L of the AAU. First, the header of an AAU included in the audio stream is detected (S2 and S5), whereby the number of bits L of a single AAU is obtained. Further, the bit rate br and the sampling frequency Fs are described in the header of the AAU. The number of samples Sn of a single AAU is calculated according to equation 2.
Sn = (L × Fs) / br        (2)
The playback time of a single AAU is calculated according to equation 3.
Tu = Sn / Fs = L / br        (3)
Once the value of Tu has been calculated, the time that has elapsed since the start of the stream is obtained by counting the AAUs. The audio clipping device 603 counts the number of AAUs that have appeared and calculates the time elapsed since the start of the stream (S7). The time thus calculated is compared with the segment output in the selection step (S8). If the time at which the AAU appears is included in the selected segment, the audio clipping device 603 outputs all the data sets pertaining to that AAU (S9). On the contrary, if the time at which the AAU appears is not included in the selected segment, the audio clipping device 603 discards the data sets pertaining to the AAU. The foregoing processing is repeated until the audio stream ends (S6 and S11). When the demultiplexer 601 outputs a plurality of audio streams, each of the audio streams is subjected to the foregoing processing.
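Equations (2) and (3), together with the per-AAU keep/discard decision, can be sketched as follows; the numeric values (a 2304-bit AAU at a 32 kHz sampling frequency and a 64 kbit/s bit rate) are illustrative only.

```python
def aau_duration(bits_per_aau, sampling_freq, bit_rate):
    """Playback time of one AAU: Sn = L*Fs/br (eq. 2), Tu = Sn/Fs = L/br (eq. 3)."""
    samples = bits_per_aau * sampling_freq / bit_rate   # equation (2)
    return samples / sampling_freq                      # equation (3)

def kept_aau_indices(num_aaus, tu, segment_start, segment_end):
    """Keep the i-th AAU if its start time, i * Tu, lies in the segment."""
    return [i for i in range(num_aaus)
            if segment_start <= i * tu < segment_end]

tu = aau_duration(bits_per_aau=2304, sampling_freq=32000, bit_rate=64000)
print(tu)                                   # 0.036 seconds per AAU
print(kept_aau_indices(10, tu, 0.1, 0.2))   # [3, 4, 5]
```

As in step S7 of Fig. 28, no per-AAU time stamps are needed: the elapsed time follows entirely from the AAU count and the fixed duration Tu.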
As shown in Fig. 30, the video stream output in the extraction step 102 is input to a video playback device, and the audio stream output in the extraction step 102 is input to an audio playback device. The video stream and the audio stream are played back synchronously, whereby a summary or highlight scenes of the media content can be played. Further, by multiplexing the video and audio streams thus produced, an MPEG-1 system stream pertaining to a summary of the media content or to a collection of highlight scenes of the media content can be prepared.
Second Embodiment
A second embodiment of the present invention will now be described. The second embodiment differs from the first embodiment only in the processing relevant to the selection step.
Processing relevant to the selection step 101 according to the second embodiment will now be described with reference to the drawings. In the selection step 101 according to the second embodiment, use is made of the priority values assigned to all the elements ranging from the highest-layer <section> to the lowest <segment>. The priority assigned to each element <section> and <segment> represents the objective degree of contextual importance. The processing relevant to the selection step 101 will be described with reference to Fig. 31. In Fig. 31, reference numeral 1301 designates one of the elements <section> included in the highest layer of the description data; 1302 designates a child element <section> of the element <section> 1301; 1303 designates a child element <section> of the element <section> 1302; and 1304 designates a child element <segment> of the child element <section> 1303. In the selection step 101 according to the second embodiment, the arithmetic mean of all the priority values assigned along the path extending from the leaf <segment> to its ancestor <section> in the highest layer is calculated. When the arithmetic mean for the path exceeds a threshold, the element <segment> is selected. In the example shown in Fig. 31, the arithmetic mean pa of the attributes of the elements <segment> 1304, <section> 1303, <section> 1302, and <section> 1301, i.e., of their attribute priority values p4, p3, p2, and p1, is calculated. The arithmetic mean pa is calculated according to equation 4.
pa = (p1 + p2 + p3 + p4) / 4        (4)
The pa thus calculated is compared with the threshold (S1 and S2). If pa exceeds the threshold, the <segment> 1304 is selected (S3), and the attribute values "start" and "end" of the <segment> 1304 are output as the start time and end time of the selected scene (S4). The foregoing processing is performed on all the elements <segment> (S1 and S6). Fig. 32 is a flowchart showing the processing relevant to the selection step 101 according to the second embodiment.
In the second embodiment, the arithmetic mean of the priority values ranging from the priority value assigned to the <segment> in the lowest layer up to the priority value assigned to the ancestor <section> in the highest layer is calculated, and the leaf <segment> is selected on the basis of the arithmetic mean thus calculated. Alternatively, the arithmetic mean of the priority values ranging from the priority value assigned to an element <section> having child <segment> elements up to the priority value assigned to the ancestor <section> in the highest layer may be calculated; by comparing the arithmetic mean thus calculated with the threshold, the element <section> having the child <segment> elements can be selected. Similarly, for another hierarchical layer, the arithmetic mean of the priority values ranging from the priority value assigned to an element <section> up to the priority value assigned to its ancestor <section> in the highest layer can be calculated; the arithmetic mean thus calculated is compared with the threshold, whereby an element <section> in that hierarchical layer can be selected.
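The path-mean selection of equation (4) can be sketched as follows. The XML fragment and the per-<segment> priority attributes are hypothetical; the recursion simply accumulates the priorities along the path from the highest-layer <section> down to each leaf <segment>.

```python
import xml.etree.ElementTree as ET

# Hypothetical description data in which <segment> elements also carry a
# priority attribute, as the second embodiment requires.
DESCRIPTION_DATA = """
<contents title="example">
  <section priority="4">
    <section priority="3">
      <segment priority="5" start="0" end="60"/>
      <segment priority="1" start="60" end="90"/>
    </section>
  </section>
</contents>
"""

def select_by_path_mean(xml_text, threshold):
    """Select each <segment> whose path arithmetic mean (eq. 4) exceeds the
    threshold, and return its (start, end) attribute values."""
    root = ET.fromstring(xml_text)
    selected = []

    def walk(elem, path_priorities):
        for child in elem:
            prios = path_priorities + [int(child.get("priority"))]
            if child.tag == "segment":
                if sum(prios) / len(prios) > threshold:   # equation (4)
                    selected.append((child.get("start"), child.get("end")))
            else:
                walk(child, prios)

    walk(root, [])
    return selected

print(select_by_path_mean(DESCRIPTION_DATA, threshold=3.5))  # [('0', '60')]
```

Here the first segment's path priorities are (4, 3, 5), giving a mean of 4.0, while the second segment's (4, 3, 1) gives roughly 2.67, so only the first scene is selected.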
Third Embodiment
A third embodiment in accordance with the present invention will now be described. The third embodiment differs from the first embodiment only in the processing pertaining to the selection step.
The processing pertaining to selection step 101 according to the third embodiment will be described below with reference to the accompanying drawings. As in the processing described in connection with the first embodiment, in the selection step 101 according to the third embodiment the selection is performed only on elements <section> each of which has child <segment> elements. In the third embodiment, a threshold value is set in consideration of the sum of the durations of all the scenes to be selected. Specifically, the elements <section> are selected in descending order of priority such that the sum of the durations of the elements <section> selected so far remains at a maximum while still being smaller than the threshold value. The flowchart of Fig. 33 shows the processing pertaining to selection step 101 according to the third embodiment. The set of elements <section> each having child <segment> elements is taken as a set Ω (S1). The elements <section> of the set Ω are sorted in descending order of the attribute priority (S2). The element <section> having the highest priority value is selected from the set Ω (S4 and S5), and the thus-selected element <section> is deleted from the set Ω. The start time and end time of the thus-selected element <section> are obtained by examining all of its child <segment> elements, and the duration of the element <section> is calculated (S6). The sum of the durations of the elements <section> selected so far is calculated (S7). If the sum exceeds the threshold value, the processing ends (S8). If the sum is below the threshold value, the start time and end time of the element <section> selected at this time are output (S9). The processing then returns to the step of selecting the element <section> having the highest priority value from the set Ω. The above processing is repeated until the sum of the durations of the selected elements <section> exceeds the threshold value or the set Ω becomes empty (S4 and S8).
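The greedy duration-bounded selection of Fig. 33 can be sketched as below. The tuple layout (priority, start time in seconds, end time in seconds) and the function name are invented for illustration; durations are taken as end minus start, per the attributes "start" and "end".

```python
# Hedged sketch of the third embodiment's selection: <section> elements are
# sorted by descending priority (S2) and picked greedily; processing stops
# once adding the next scene would push the running total of durations over
# the threshold (S7/S8). Input shapes are assumptions, not the patent's API.

def select_by_duration(sections, max_total):
    """sections: list of (priority, start, end) tuples, times in seconds.
    Returns the (start, end) pairs output in step S9 and the final total."""
    omega = sorted(sections, key=lambda s: s[0], reverse=True)  # S2
    total = 0.0
    picked = []
    for priority, start, end in omega:                          # S4, S5
        duration = end - start                                  # S6
        if total + duration > max_total:
            break                     # threshold exceeded: processing ends (S8)
        total += duration                                       # S7
        picked.append((start, end))   # output start/end times (S9)
    return picked, total

picked, total = select_by_duration(
    [(5, 0, 60), (3, 120, 200), (4, 60, 100)], max_total=110)
```

With the example data the two highest-priority sections (durations 60 s and 40 s) fit under the 110 s budget; the priority-3 section would overshoot it and is skipped.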
In the third embodiment, the selection is performed on elements <section> having child <segment> elements. However, the selection may also be performed on elements <section> and elements <segment>. In this case, a priority value corresponds to the degree of importance shared among all the elements <section> in the media content. Further, the selection may also be performed on elements <section> in the same level that do not have child <segment> elements. Specifically, the selection may be performed on elements <section> located in the same path counted from the ancestor <contents> or from a leaf <segment>.
As in the case of the second embodiment, the priority value assigned to each element <section> and element <segment> may be taken as the objective degree of contextual importance, and the mean value "pa" of all the priorities from the priority assigned to an element up to that assigned to its highest-level ancestor <section> may be calculated. The elements <section> each having child <segment> elements, or the elements <segment>, are then selected in descending order of "pa" until the sum of the durations is at a maximum but smaller than the threshold value. Even in this case, the same advantageous result as that yielded by the second embodiment can be obtained.
Fourth Embodiment
A fourth embodiment of the present invention will now be described. The fourth embodiment differs from the first embodiment only in the processing pertaining to the selection step.
The processing pertaining to selection step 101 according to the fourth embodiment will be described below with reference to the accompanying drawings. As in the selection performed in selection step 101 of the first embodiment, the selection pertaining to selection step 101 in the fourth embodiment is performed on elements <segment> and on elements <section> having child <segment> elements. As in the case of the third embodiment, a threshold value is set in the present embodiment in consideration of the sum of the durations of all the scenes to be selected. As in the case of the first embodiment, the priority value assigned to an element <section> having child <segment> elements corresponds to the degree of importance shared among all the elements <section> having child <segment> elements in the media content. Specifically, the priority value is taken as the degree of importance shared among the elements <section> enclosed by the dashed-dotted line in Fig. 34. Further, the priority value assigned to an element <segment> corresponds to the degree of importance shared among the elements <segment> having the same parent element <section>; that is, the degree of importance shared among the elements <segment> enclosed by a dotted line in Fig. 34.
The flowchart of Fig. 35 shows the processing pertaining to the selection step according to the fourth embodiment. The set of elements <section> each having child <segment> elements is taken as a set Ω (S1). The elements <section> in the set Ω are sorted in descending order of priority (S2). The element <section> having the highest priority value in the set Ω is then selected (S3, S4, and S5). If a plurality of elements <section> have the highest priority value, all of these elements are selected. The elements <section> thus selected are taken as elements of another set Ω′ and are deleted from the set Ω. The start time, end time, and duration of the scene represented by the thus-selected element <section> are obtained beforehand by examining its child <segment> elements, and are stored (S6). If a plurality of elements <section> are selected, the start time, end time, and duration of each of the scenes represented by the respective elements are obtained and stored beforehand. The sum of the durations of the elements <section> of the set Ω′ is obtained (S7 and S8) and is compared with a threshold value (S9). If the sum of the durations equals the threshold value, all the data sets pertaining to the start times and end times stored so far are output, and the processing then ends (S10). In contrast, if the sum of the durations is below the threshold value, the processing returns to the step of selecting another element <section> from the set Ω (S4 and S5). If the set Ω is empty, all the data sets pertaining to the stored start times and end times are output, and the processing then ends (S4). If the sum of the durations has exceeded the threshold value, the following processing is performed. Specifically, the element <section> having the lowest priority is selected from the set Ω′ (S11). At this time, if a plurality of elements <section> have the lowest priority, all of these elements are selected. Among the child <segment> elements of the thus-selected element <section>, the child <segment> having the lowest priority is deleted (S12). The start time, end time, and duration of the element <section> corresponding to the thus-deleted child <segment> are changed (S13). As a result of the deletion of the element <segment>, the scene may become discontinuous. In this case, the start time, end time, and duration are stored for each of the resulting discontinuous scenes. Further, if, as a result of the deletion of child <segment> elements, all the child <segment> elements of an element <section> have been deleted, that element <section> is deleted from the set Ω′. If a plurality of elements <section> have been selected, all of these elements are subjected to the above processing. As a result of the deletion of a child <segment>, the duration of the element <section> from which the child <segment> has been deleted becomes shorter, thereby reducing the sum of the durations. This deletion processing is repeated until the sum of the durations of the elements of the set Ω′ becomes lower than the threshold value. When the sum of the durations of the elements of the set Ω′ has become lower than the threshold value (S14), all the data sets pertaining to the stored start times and end times are output, and the processing then ends (S15).
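The pruning phase of Fig. 35 can be sketched in simplified form. This is a loose illustration under stated assumptions: the sections are assumed to be already gathered into the working set Ω′, each as a dict with a "priority" and a list of (priority, start, end) segments; tie handling and the discontinuous-scene bookkeeping described above are omitted for brevity.

```python
# Simplified sketch of the fourth embodiment's prune-to-threshold loop:
# while the total duration of the selected scenes exceeds the threshold,
# the lowest-priority child <segment> of the lowest-priority selected
# <section> is deleted (S11-S13); a <section> whose children are all
# deleted is dropped from the set (cf. Fig. 35). Data shapes are assumed.

def prune_to_threshold(sections, max_total):
    """sections: list of dicts {'priority': int, 'segments': [(prio, start,
    end), ...]}. Returns the surviving (start, end) pairs, in priority order."""
    chosen = sorted(sections, key=lambda s: s["priority"], reverse=True)

    def total(sel):
        return sum(end - start for s in sel for _, start, end in s["segments"])

    while chosen and total(chosen) > max_total:
        victim = min(chosen, key=lambda s: s["priority"])          # S11
        worst = min(victim["segments"], key=lambda seg: seg[0])    # S12
        victim["segments"].remove(worst)       # shortens that section (S13)
        if not victim["segments"]:
            chosen.remove(victim)  # all children deleted: drop the <section>
    return [(start, end) for s in chosen for _, start, end in s["segments"]]

clips = prune_to_threshold(
    [{"priority": 5, "segments": [(4, 0, 30), (2, 30, 60)]},
     {"priority": 2, "segments": [(3, 100, 150)]}],
    max_total=80)
```

In the example, the initial total (110 s) exceeds the 80 s budget, so the only segment of the priority-2 section is deleted and that section is dropped, leaving the two scenes of the priority-5 section.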
Although in the fourth embodiment the selection is performed on elements <section> each having child <segment> elements, or on the child <segment> elements, the selection may also be performed on an element <section> and its child <section> elements, or on an element <section> and its child <segment> elements. Even in this case, the same advantageous result as that yielded by the fourth embodiment can be obtained.
With regard to the deletion of elements <segment> performed when the sum of the durations exceeds the threshold value, in the present embodiment the elements are deleted in ascending order starting from the lowest priority. Alternatively, a threshold value may be set for the priorities of the elements <section>, and the child <segment> having the lowest priority may be deleted from each of the elements <section> whose priority is below that threshold value. Further, another threshold value may be set for the priorities of the elements <segment>, and any element <segment> whose priority is below that threshold value may be deleted.
Fifth Embodiment
A fifth embodiment of the present invention will now be described with reference to the accompanying drawings. In the present embodiment, a moving picture of the MPEG-1 format is taken as the media content. In this case, a media segment corresponds to a single scene cut. The score corresponds to the objective degree of contextual importance of the scene of interest.
The block diagram of Fig. 36 shows a media processing method according to the fifth embodiment of the present invention. In Fig. 36, reference numeral 1801 designates a selection step; 1802 designates an extraction step; 1803 designates a formation step; 1804 designates a delivery step; and 1805 designates a database. In the selection step 1801, a scene of the media content is selected from the context description data, and data pertaining to the start time and end time of the thus-selected scene, as well as data representing the file in which the data are stored, are output. In the extraction step 1802, the data sets representing the start time and end time of the scene and the data representing the file output in the selection step 1801 are received. Further, in the extraction step 1802, by reference to the structure description data, data pertaining to the segment defined by the start time and end time output in the selection step 1801 are extracted from the file of the media content. In the formation step 1803, the data output in the extraction step 1802 are multiplexed, thus constituting a system stream of the MPEG-1 format. In the delivery step 1804, the system stream of the MPEG-1 format prepared in the formation step 1803 is transmitted over a line. Reference numeral 1805 designates a database for storing the media content, its structure description data, and the context description data.
Fig. 37 shows the structure of the structure description data according to the fifth embodiment. In the present embodiment, the physical content of the data is described in a tree structure. In consideration of the manner in which the media content is stored in the database 1805, a single piece of media content is not necessarily stored in the form of a single file; in some cases, a single piece of media content is stored in a plurality of separate files. The root of the tree structure of the structure description data is designated <contents> and represents a single piece of content. The title of the corresponding piece of content is appended to the element <contents> as an attribute. A child of the element <contents> is an element <mediaobject> representing a file in which the media content is stored. A link "locator" to the file storing the media content and an identifier "id" representing a link to the context description data are appended to the child <mediaobject> as attributes. In a case where the media content is made up of a plurality of files, an attribute "seq" is appended to the element <mediaobject> for representing the order of the file of interest within the media content.
Fig. 38 shows the structure of the context description data according to the fifth embodiment. The context description data of the present embodiment correspond to the context description data of the first embodiment with links to the elements <mediaobject> of the structure description data appended thereto. Specifically, the root <contents> of the context description data has a child <mediaobject>, and the element <mediaobject> has a child <section>. The elements <section> and <segment> are identical with the elements <section> and <segment> used in the first embodiment. An element <mediaobject> of the structure description data is related to the element <mediaobject> of the context description data by means of the attribute "id": a scene of the media content described by the children of an element <mediaobject> of the context description data is stored in the file designated by the element <mediaobject> of the structure description data having the same attribute value "id." Further, the time information "start" and "end" assigned to an element <segment> is established as the time elapsed from the head of each file. Specifically, in a case where a single piece of media content comprises a plurality of files, the time at the head of each file corresponds to 0, and the start time of each scene is represented by the time elapsed from the head of the file to the scene of interest.
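The "id"-based linkage between the two data sets can be illustrated with a small parsing sketch. The XML fragments below are invented examples following the element and attribute names in the text (<contents>, <mediaobject>, <section>, <segment>, "id", "seq", "locator", "start", "end"); they are not the patent's actual DTD instances from Figs. 39-45.

```python
# Illustrative sketch: a <mediaobject> of the structure description data
# names a file through "locator", and the context description data refers
# to that file through a <mediaobject> carrying the same "id" value, with
# each <segment>'s "start"/"end" counted from the head of that file.
import xml.etree.ElementTree as ET

structure = ET.fromstring(
    '<contents title="sample">'
    '<mediaobject id="m1" seq="1" locator="file1.mpg"/>'
    '</contents>')
context = ET.fromstring(
    '<contents title="sample">'
    '<mediaobject id="m1">'
    '<section priority="4">'
    '<segment priority="5" start="00:00:00" end="00:01:00"/>'
    '</section>'
    '</mediaobject>'
    '</contents>')

# Resolve which file stores the scenes described under the context-side
# <mediaobject> by matching the "id" attributes of the two data sets.
locators = {m.get("id"): m.get("locator") for m in structure.iter("mediaobject")}
scene_file = locators[context.find("mediaobject").get("id")]
```

Here `scene_file` resolves to "file1.mpg", the file whose head defines time 0 for the segment's "start" and "end" values.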
The structure description data and the context description data can be expressed in a computer by use of, for example, Extensible Markup Language (XML). Fig. 39 shows an example of a document type definition (DTD) for describing the structure description data shown in Fig. 37 in XML, together with an example of structure description data described by use of the DTD. Figs. 40 to 45 show an example of a DTD for describing the context description data shown in Fig. 38 in XML, together with an example of context description data described by use of the DTD.
The processing pertaining to the selection step 1801 will now be described. In the selection step 1801, any of the methods described in connection with the first through fourth embodiments can be used as the method of selecting a scene. The ID of the link to the element <mediaobject> of the structure description data is output in synchronism with the output of the start time and end time of the selected scene. Fig. 46 shows an example of the data output from the selection step 1801 in a case where the structure description data are described in the form of an XML file by use of the DTD shown in Fig. 39 and the context description data are described in the form of an XML file by use of the DTDs shown in Figs. 40 to 45. In Fig. 46, "id" is followed by the ID of the element <mediaobject> of the structure description data; "start" is followed by the start time; and "end" is followed by the end time.
The processing pertaining to the extraction step 1802 will now be described. The block diagram of Fig. 47 shows the extraction step 1802 according to the fifth embodiment. In Fig. 47, the extraction step 1802 according to the fifth embodiment is embodied by an interface device 2401, a demultiplexer 2402, a video clipping device 2403, and an audio clipping device 2404. The interface device 2401 receives the structure description data and the segments output in the selection step 1801, extracts a media content file from the database 1805, outputs the thus-extracted file to the demultiplexer 2402, and outputs the start times and end times of the segments output in the selection step 1801 to the video clipping device 2403 and the audio clipping device 2404. The media content of the present embodiment corresponds to a system stream of the MPEG-1 format into which a video stream and an audio stream are multiplexed. Accordingly, the demultiplexer 2402 separates the system stream of the MPEG-1 format into the video stream and the audio stream. The thus-separated video stream and the segments output from the interface device 2401 are input to the video clipping device 2403. From the input video stream, the video clipping device 2403 outputs only the data pertaining to the selected segments. Similarly, the audio stream and the segments output from the interface device 2401 are input to the audio clipping device 2404. From the input audio stream, the audio clipping device 2404 outputs only the data pertaining to the selected segments.
The processing pertaining to the interface device 2401 will now be described. The flowchart of Fig. 48 shows the processing performed by the interface device 2401. The structure description data and the segments output in the selection step 1801, whose contents are as shown in Fig. 46, are input to the interface device 2401. The sequence of the files is obtained from the attributes assigned to the elements <mediaobject> of the structure description data; accordingly, the segments output in the selection step 1801 are sorted by "id" in accordance with that sequence (S1). Further, the segments are converted into data such as those shown in Fig. 49: the segments belonging to an identical file are consolidated and arranged in order of start time. The interface device 2401 then performs the following processing on the data sets shown in Fig. 49 in order from top to bottom. First, the interface device 2401 refers to the element <mediaobject> of the structure description data by use of "id," and reads a file name on the basis of the attribute "locator" of that element <mediaobject>. The data pertaining to the file corresponding to the file name are read from the database, and the thus-read data are output to the demultiplexer 2402 (S2 and S3). The start times and end times of the segments of the selected file described after "id" are output to the video clipping device 2403 and the audio clipping device 2404 (S4). After all the data sets have been subjected to the above processing, the processing ends (S5). If some data sets still remain unprocessed, the above processing is repeated after completion of the processing performed by the demultiplexer 2402, the processing performed by the video clipping device 2403, and the processing performed by the audio clipping device 2404 (S6 and S7).
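The sorting-and-consolidation bookkeeping of steps S1 and the Fig. 49 conversion can be sketched as follows. The input tuple shape and the `seq_of_id` mapping (file order taken from the "seq" attributes) are assumptions made for illustration.

```python
# Hedged sketch of the interface device's ordering step (Figs. 48/49):
# segments output by the selection step are grouped by file "id", the files
# are ordered by their "seq" values, and each file's segments are sorted by
# start time before the file is handed to the demultiplexer.

def arrange_segments(segments, seq_of_id):
    """segments: list of (id, start, end); seq_of_id: file order per 'seq'.
    Returns [(id, [(start, end), ...]), ...] in file order, each file's
    segments sorted by start time."""
    grouped = {}
    for fid, start, end in segments:
        grouped.setdefault(fid, []).append((start, end))
    ordered = []
    for fid in sorted(grouped, key=lambda f: seq_of_id[f]):  # sort by seq (S1)
        ordered.append((fid, sorted(grouped[fid])))          # by start time
    return ordered

plan = arrange_segments(
    [("m2", 30, 60), ("m1", 10, 20), ("m2", 0, 15)],
    seq_of_id={"m1": 1, "m2": 2})
```

The resulting list mirrors Fig. 49: one entry per file, files in "seq" order, segments within a file ordered by start time.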
The processing pertaining to the demultiplexer 2402 will now be described. The flowchart of Fig. 50 shows the processing performed by the demultiplexer 2402. The demultiplexer 2402 receives from the interface device 2401 the system stream of the MPEG-1 format corresponding to the media content, and separates the thus-received system stream of the MPEG-1 format into a video stream and an audio stream. The video stream is output to the video clipping device 2403, and the audio stream is output to the audio clipping device 2404 (S1 to S10). After completion of the output of the video stream and the audio stream (S9 and S11), completion of the processing performed by the demultiplexer 2402 is reported to the interface device 2401 (S12). As the flowchart shown in Fig. 50 indicates, except for the transmission of the report that the processing has been completed, the processing performed by the demultiplexer 2402 is identical with the processing performed by the demultiplexer of the first embodiment.
The processing performed by the video clipping device 2403 will now be described. The flowchart of Fig. 51 shows the processing performed by the video clipping device 2403. As the flowchart shown in Fig. 51 indicates, except for the transmission of a report to the interface device 2401 that the processing has been completed (S15 and S17), the processing performed by the video clipping device 2403 is identical with the processing performed by the video clipping device according to the first embodiment.
The processing performed by the audio clipping device 2404 will now be described. The flowchart of Fig. 52 shows the processing performed by the audio clipping device 2404. As the flowchart shown in Fig. 52 indicates, except for the transmission of a report to the interface device 2401 that the processing has been completed (S11 and S12), the processing performed by the audio clipping device 2404 is identical with the processing performed by the audio clipping device described in connection with the first embodiment.
In the formation step 1803, the video data and the audio data output in the extraction step 1802 are time-division multiplexed by means of the multiplexing method standardized as MPEG-1 under international standard ISO/IEC IS 11172-1. In a case where the media content is stored in a plurality of separate files, the video stream and the audio stream output in the extraction step 1802 are multiplexed for each of the files.
In the delivery step 1804, the system stream of the MPEG-1 format multiplexed in the formation step 1803 is transmitted over the line. When a plurality of system streams of the MPEG-1 format are output in the formation step 1803, all the system streams are transmitted in the order of their output.
In the present embodiment, in a case where the media content is stored in a plurality of separate files, each of the files is processed in the extraction step 1802. In the formation step 1803, all the related video streams and audio streams of the plurality of files of the media content may be concatenated and output as a single video stream and a single audio stream; even when the video and audio streams are thus multiplexed into a single system stream of the MPEG-1 format in the formation step 1803, the same advantageous result can be achieved. In this case, the time codes must be changed by the video clipping device 2403, such that the counter C used for counting the number of output frames is incremented cumulatively over the plurality of video streams. The counter C is initialized only at the beginning of the first file (S3 shown in Fig. 51, and S18). The processing performed by the video clipping device 2403 in this case is provided in the flowchart shown in Fig. 53. Although in the fifth embodiment the context description data and the structure description data describing the physical content are described separately from each other, these data sets may also be combined into a single data set by appending the attributes "seq" and "locator" to the element <mediaobject> of the context description data.
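The cumulative behavior of the counter C when files are concatenated can be sketched as below. The per-file frame counts are invented example values; the function only illustrates why initializing C once keeps the rewritten time codes monotonic across file boundaries.

```python
# Sketch of the frame counter C used when plural files are concatenated
# into one output stream (cf. Fig. 53): instead of resetting at every file
# head, C is initialized once and keeps counting across files, so the time
# codes assigned to later files continue where the previous file ended.

def renumber_frames(files):
    """files: list of output-frame counts per clipped file. Returns the
    first output frame number assigned to each file, and the final value
    of C, when C is initialized only at the first file."""
    c = 0                       # initialized only once (cf. S3 in Fig. 51)
    firsts = []
    for n_frames in files:
        firsts.append(c)        # first frame number used for this file
        c += n_frames           # C keeps increasing across file boundaries
    return firsts, c

firsts, total_frames = renumber_frames([90, 120, 60])
```

With three clipped files of 90, 120, and 60 frames, the second file starts numbering at frame 90 and the third at frame 210, yielding a gap-free concatenated stream.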
Sixth Embodiment
A sixth embodiment of the present invention will now be described with reference to the accompanying drawings. In the present embodiment, a moving picture of the MPEG-1 format is taken as the media content. In this case, a media segment corresponds to a single scene cut. Further, the score corresponds to the objective degree of contextual importance of the scene of interest.
The block diagram of Fig. 54 shows a media processing method according to the sixth embodiment of the present invention. In Fig. 54, reference numeral 3101 designates a selection step; 3102 designates an extraction step; 3103 designates a formation step; 3104 designates a delivery step; and 3105 designates a database. In the selection step 3101, a scene of the media content is selected from the context description data, and data pertaining to the start time and end time of the thus-selected scene, as well as data representing a file in which the data are stored, are output. Thus, the processing pertaining to the selection step 3101 is identical with the processing performed in the selection step of the fifth embodiment. In the extraction step 3102, the data sets representing the start time and end time of the scene and the data representing the file output in the selection step 3101 are received. Further, by reference to the structure description data, data pertaining to the segment defined by the start time and end time output in the selection step 3101 are extracted from the file of the media content. The processing pertaining to the extraction step 3102 is identical with the processing performed in the extraction step described in connection with the fifth embodiment. In the formation step 3103, in accordance with the traffic determined in the delivery step 3104, some or all of the streams output in the extraction step 3102 are multiplexed, whereby a system stream of the MPEG-1 format is constituted. In the delivery step 3104, the traffic on the line used for transmitting the system stream of the MPEG-1 format is determined, and the determination result is used in the formation step 3103. Further, in the delivery step 3104, the system stream of the MPEG-1 format prepared in the formation step 3103 is transmitted over the line. Reference numeral 3105 designates a database for storing the media content, its structure description data, and the context description data.
The block diagram of Fig. 55 shows the processing performed in the formation step 3103 and the delivery step 3104 according to the sixth embodiment. In Fig. 55, the formation step 3103 is embodied by a stream selection device 3201 and a multiplexer 3202. The delivery step 3104 is embodied by a traffic determination device 3203 and a delivery device 3204. The stream selection device 3201 receives the video and audio streams output in the extraction step 3102 and the traffic determined by the traffic determination device 3203. If the traffic on the line is low enough to permit transmission of all the data sets, all the streams are output to the multiplexer 3202. If transmission of all the data sets would require a long time because the line is busy or carries heavy traffic, only some of the plurality of video and audio streams are selected and output to the multiplexer 3202. In this case, the selection can be performed in various ways; namely, selecting only the base layer of the video stream; selecting only the monaural sound of the audio stream; selecting only the left stereo channel of the audio stream; selecting only the right stereo channel of the audio stream; or combinations thereof. Here, if there are only a single video stream and a single audio stream, the streams are output regardless of the traffic. The multiplexer 3202 time-division multiplexes the video and audio streams output from the stream selection device 3201 by means of the multiplexing method standardized as the MPEG-1 format under international standard ISO/IEC IS 11172-1. The traffic determination device 3203 checks the current state and traffic of the line over which the streams are transmitted, and outputs the check result to the stream selection device 3201. The delivery device 3204 transmits the system stream of the MPEG-1 format multiplexed by the multiplexer 3202 over the line.
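The traffic-dependent behavior of the stream selection device 3201 can be sketched as below. The stream labels and the specific "busy" policy (keep the video base layer plus one stereo channel) are illustrative assumptions; the embodiment lists several alternative subsets and combinations.

```python
# Hedged sketch of the stream selection device 3201: when the line is free
# every elementary stream goes to the multiplexer; when the line is busy
# only a reduced subset is forwarded (here, the video base layer and the
# left audio channel, one of the combinations mentioned in the text).

def choose_streams(streams, line_busy):
    """streams: labels such as 'video-base', 'video-enh', 'audio-left',
    'audio-right'. Returns the labels to hand to the multiplexer 3202."""
    if len(streams) <= 2 or not line_busy:
        return list(streams)      # single video+audio, or low traffic: send all
    keep = []
    for s in streams:
        if s in ("video-base", "audio-left"):
            keep.append(s)        # example reduced subset for heavy traffic
    return keep

sent = choose_streams(["video-base", "video-enh", "audio-left", "audio-right"],
                      line_busy=True)
```

With a busy line and four elementary streams, only the base-layer video and the left stereo channel are passed on to the multiplexer in this sketch.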
In the present embodiment, in a case where only a single video stream exists, the stream selection device 3201 outputs the video stream regardless of the traffic. However, if all the data sets pertaining to the video stream would require a long time for transmission over the line, only representative images of the video stream may be selected and transmitted. When the representative images are selected, the time codes of the representative images are described in the context description data. Alternatively, only intra-coded frames, referred to as I-pictures, each of which can be decoded independently as a single frame, may be selected from among the frames.
Seventh Embodiment
A seventh embodiment of the present invention will now be described with reference to the accompanying drawings. In the present embodiment, a moving picture forming a system stream of the MPEG-1 format is taken as the media content. In this case, a media segment corresponds to a single scene cut. Further, in the present embodiment, the score corresponds to the objective degree of contextual importance of the scene of interest from the viewpoint of keywords pertaining to a character or an event selected by the user.
The block diagram of Fig. 56 shows a processing method according to the seventh embodiment of the present invention. In Fig. 56, reference numeral 3301 designates a selection step, and 3302 designates an extraction step. In the selection step 3301, a scene of the media content is selected from the context description data by means of the keywords appended to the context description data and their scores. Data pertaining to the start time and end time of the thus-selected scene are output. In the extraction step 3302, data pertaining to the segment defined by the start time and end time output in the selection step 3301 are extracted and output.
Fig. 57 shows the structure of the context description data according to the seventh embodiment. In the present embodiment, the context is described in a tree structure. The elements in the tree structure are arranged from left to right in chronological order. In Fig. 57, the root of the tree, designated <contents>, represents the content of a single piece of media content, and the title of the content is assigned to the root as an attribute.
A child of the element <contents> is designated <section>. Keywords representing the content or characters of a scene, together with priorities representing the degrees of importance of the keywords, are appended to the element <section> as attributes in the form of keyword-priority pairs. The priority is assumed to be an integer ranging from 1 to 5, where 1 designates the minimum degree of importance and 5 designates the maximum degree of importance. The pairs (each comprising a keyword and a priority) are set so as to serve as indices for retrieving a specific scene or character desired by the user. To this end, a plurality of pairs (each comprising a keyword and a priority) may be appended to a single element <section>. For instance, in a case where characters are described, a number of pairs equal to the number of characters appearing in the scene of interest are appended to a single element <section>. The priority values appended to a scene are set so as to become greater as a larger number of characters appear in the scene of interest.
A child of the element <section> is designated <section> or <segment>. Here, an element <section> may itself serve as a child of another element <section>. However, a single element <section> cannot have a mixture of child <section> and child <segment> elements.
An element <segment> represents a single scene cut. Pairs (each comprising a keyword and a priority) similar to those appended to the element <section>, together with time information pertaining to the scene of interest, namely "start" representing the start time and "end" representing the end time, are appended to the element <segment> as attributes. Scene cutting may be performed by use of commercially available software or software available over a network, or may be performed manually. The attribute representing the start time of a scene may also specify the start frame of the scene of interest. Although in the present embodiment the time information is expressed by the start time and end time of a scene cut, a similar result can be achieved when the time information is expressed by the start time of the scene of interest and the duration of the scene of interest. In this case, the end time of the scene of interest is obtained by adding the duration to the start time.
In the case of, for example, the story or the characters of a film, chapters, sections, and paragraphs can be described by elements <section> on the basis of the context description data. In another example, when baseball is described, the highest-level elements <section> can be used to describe innings, and their child elements <section> can be used to describe half-innings. Further, second-generation elements <section> can be used to describe the at-bat of each player, and third-generation elements <section> can be used to describe each pitch, the time interval between two pitches, and the batting result.
Context description data having such a structure can be expressed in a computer by use of, for example, Extensible Markup Language (XML). XML is a data description language whose standardization is pursued by the World Wide Web Consortium; version 1.0 was recommended on February 10, 1998. The specification of XML version 1.0 is available at http://www.w3.org/TR/1998/REC-xml-19980210. Figures 58 to 66 show an example of a Document Type Definition (DTD) for describing the context description data of the present embodiment in XML, together with an example of context description data described by use of that DTD. Figures 67 to 80 show an example of description data obtained by appending to the context description data shown in Figs. 58 to 66 representative data about media segments, for example a representative image (video data) and a keyword (audio data), together with a DTD for describing that description data in XML.
The processing relating to the selection step S3301 will now be described. In the present embodiment, the processing relating to the selection step S3301 is performed on elements <segment> and on elements <section> having child elements <segment>. Figure 81 is a flowchart showing the processing relating to the selection step 3301 of the seventh embodiment. In this selection step 3301, a keyword serving as an index for selecting scenes and a threshold for its priority are entered. From the elements <section> of the description data, those elements whose keyword is identical with the entered index and whose priority exceeds the threshold are selected (S2 and S3). From the child elements <segment> of the elements <section> so selected, only those whose keyword is identical with the index and whose priority exceeds the threshold are then selected (S5 and S6). The start time and end time of each selected scene are determined from the attributes "start" and "end" of the elements <segment> selected through the above processing, and the start time and end time are output (S7, S8, S9, S10, S11, S1, and S4).
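The two-stage selection just described, first over elements <section> and then over their child elements <segment>, can be sketched as follows. Plain dictionaries stand in for the XML elements, and all names are assumptions made for illustration only.

```python
def select_scenes(sections, index_keyword, threshold):
    """Select <section>s whose (keyword, priority) pair matches the search
    index with priority above the threshold, then keep only the matching
    child <segment>s; return their (start, end) time pairs."""
    result = []
    for section in sections:
        # first stage: filter <section> elements
        if any(kw == index_keyword and pr > threshold
               for kw, pr in section["pairs"]):
            # second stage: filter their child <segment> elements
            for seg in section["segments"]:
                if any(kw == index_keyword and pr > threshold
                       for kw, pr in seg["pairs"]):
                    result.append((seg["start"], seg["end"]))
    return result
```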
Although in the present embodiment selection is performed on elements <segment> and on elements <section> having child elements <segment>, selection may also be performed on other parent-child relationships, for example an element <section> in a certain hierarchical layer and its child elements <section>. Moreover, the parent-child relationship is not limited to two hierarchical layers. The number of hierarchical layers may be increased beyond two, and the same processing may be performed down to the leaves of the tree structure, that is, the elements <segment>. Furthermore, the search index may be set as an index pair comprising a plurality of keywords and a condition determining the relationship among those keywords. The condition determining the relationship among keywords includes, for example, combinations such as "either of the two", "both of the two", or "either or both of the two". A threshold to be used for selection can be determined, and in the case of a plurality of keywords the processing may be performed for each keyword. The keywords serving as the search index may be entered by the user, or may be set automatically by the system in accordance with a user profile.
The processing relating to the extraction step 3302 is identical with the processing performed in the extraction step described in connection with the first embodiment.
As shown in Fig. 82, the advantage of the present embodiment is that only the scenes of a media content in which the viewer is interested can be played back, by inputting the video stream output in the extraction step 3302 to a video playback device, inputting the audio stream output in the same step to an audio playback device, and playing back these video and audio streams in synchronism with each other. Further, by multiplexing the video and audio streams, a system stream of MPEG-1 format relating to the collection of scenes in which the viewer is interested can also be prepared.
Eighth Embodiment
The eighth embodiment of the present invention will now be described. The eighth embodiment differs from the seventh embodiment only in the processing relating to the selection step.
The processing relating to the selection step S3301 will now be described. In the present embodiment, the processing relating to the selection step S3301 is performed only on elements <segment>. Figure 83 is a flowchart showing the processing relating to the selection step S3301 of the eighth embodiment. As shown in Fig. 83, in the selection step 3301 a keyword serving as an index for selecting scenes and a threshold for its priority are entered. From the elements <segment> of the description data, those elements whose keyword is identical with the index and whose priority exceeds the threshold are selected (S1 and S6).
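Because this embodiment operates on elements <segment> alone, its selection step reduces to a single filter. The sketch below uses plain dictionaries in place of the XML elements; all names are assumptions made for illustration.

```python
def select_segments(segments, index_keyword, threshold):
    """Keep only the <segment>s carrying a (keyword, priority) pair whose
    keyword matches the search index and whose priority exceeds the
    threshold; return their (start, end) time pairs."""
    return [(s["start"], s["end"])
            for s in segments
            if any(kw == index_keyword and pr > threshold
                   for kw, pr in s["pairs"])]
```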
Although in the eighth embodiment selection is performed only on elements <segment>, selection may also be performed on elements <section> in a certain hierarchical layer. Further, the search index may be set as an index pair comprising a plurality of keywords and a condition determining the relationship among those keywords. The condition determining the relationship among keywords includes, for example, combinations such as "either of the two", "both of the two", or "either or both of the two". A threshold to be used for selection can be determined, and in the case of a plurality of keywords the processing may be performed for each keyword.
Ninth Embodiment
The ninth embodiment of the present invention will now be described. The ninth embodiment differs from the seventh embodiment only in the processing relating to the selection step.
The processing relating to the selection step S3301 will now be described with reference to the drawings. As in the processing described in connection with the seventh embodiment, in the selection step 3301 of the ninth embodiment selection is performed on elements <segment> and on elements <section> having child elements <segment>. In the present embodiment, a threshold is set with regard to the sum of the duration intervals of all the scenes to be selected; more specifically, selection is performed such that the sum of the duration intervals of the scenes selected so far becomes maximum while still remaining below the threshold. Figure 84 is a flowchart showing the processing relating to the selection step of the ninth embodiment. In the selection step 3301, a single keyword serving as a search index is received. All the elements <section> having child elements <segment> and having a keyword identical with the search index are then extracted, and the elements <section> so extracted are taken as a set Ω (S1 and S2). The elements <section> of the set Ω are sorted in descending order of priority (S3). The element <section> whose keyword, that is, the search index, has the greatest priority value is then selected from the elements of the set Ω thus sorted (S5), and the element <section> so selected is deleted from the set Ω (S6). In this case, if a plurality of elements <section> have the greatest priority value, all of those elements are extracted. From among the child elements <segment> of the elements <section> so selected, only those having the search index are selected, and the elements <segment> so selected are added to another set Ω′ (S7). The initial value of the set Ω′ is empty (S2). The sum of the duration intervals of the scenes belonging to the set Ω′ is obtained (S8), and this sum is compared with the threshold (S9). If the sum of the duration intervals equals the threshold, data relating to all the segments represented by the elements <segment> included in the set Ω′ are output, and the processing ends (S14). If, on the contrary, the sum of the duration intervals is less than the threshold, the processing returns to selecting once more from the set Ω the element <section> whose search index, that is, keyword, has the highest priority (S5), and the above selection processing is repeated. If the set Ω is empty, data relating to all the segments represented by the elements <segment> of the set Ω′ are output, and the processing ends (S4). If the sum of the duration intervals of the scenes belonging to the set Ω′ is greater than the threshold, the following processing is performed. The element <segment> whose search index, that is, keyword, has the lowest priority is deleted from the set Ω′ (S11). At this time, if a plurality of elements <segment> have the lowest priority, all of those elements are deleted. The sum of the duration intervals of the elements <segment> of the set Ω′ is obtained (S12), and this sum is compared with the threshold (S13). If the sum of the duration intervals is greater than the threshold, the processing returns to deleting an element <segment> from the set Ω′, and such deletion processing is repeated. If the set Ω′ becomes empty at this point, the processing ends (S10). If, on the contrary, the sum of the duration intervals is less than the threshold, data relating to all the segments represented by the elements <segment> of the set Ω′ are output, and the processing ends (S14).
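The duration-bounded selection of this embodiment can be sketched roughly as follows. Plain dictionaries stand in for the XML elements, a numeric "dur" field stands in for the duration interval of a segment, and the over-threshold trimming removes the lowest-priority segments one at a time rather than as tied batches; every name here is an assumption made for illustration.

```python
def select_within_duration(sections, keyword, threshold):
    """Accumulate matching child <segment>s of matching <section>s, visited
    in descending priority, until the total duration reaches the threshold;
    on overshoot, drop the lowest-priority segments until the total fits."""
    omega = sorted(
        (s for s in sections
         if any(kw == keyword for kw, _ in s["pairs"])),
        key=lambda s: max(pr for kw, pr in s["pairs"] if kw == keyword),
        reverse=True)
    omega_prime = []
    for section in omega:
        omega_prime += [seg for seg in section["segments"]
                        if any(kw == keyword for kw, _ in seg["pairs"])]
        total = sum(seg["dur"] for seg in omega_prime)
        if total >= threshold:
            # trim lowest-priority segments until the sum fits
            omega_prime.sort(
                key=lambda g: max(pr for kw, pr in g["pairs"] if kw == keyword))
            while omega_prime and total > threshold:
                total -= omega_prime.pop(0)["dur"]
            break
    return [(seg["start"], seg["end"]) for seg in omega_prime]
```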
Although in the present embodiment selection is performed on elements <segment> and on elements <section> having child elements <segment>, selection may also be performed on other parent-child relationships, for example an element <section> and its child elements <segment> in another hierarchical level. Further, the parent-child relationship is not limited to two hierarchical layers; the number of hierarchical layers may be increased. For example, when the elements ranging from the highest-level element <section> down to its descendant elements <segment> are processed, the highest-level element <section> is selected first. The child elements <section> of the element <section> so selected are then selected, and the second-generation elements <section> of the elements so selected are selected in turn. This round of selection is repeated until elements <segment> have been selected. The elements <segment> thus selected compose the set Ω′.
In the present embodiment, the elements are sorted in descending order of the priority of the search index, that is, the keyword. A threshold may instead be set with regard to the priority value, and the elements may be selected in descending order of priority. Separate thresholds may be set for the elements <section> and for the elements <segment>.
In the present embodiment, the search index is defined as a single keyword. However, the search index may be set as an index pair comprising a plurality of keywords and a condition determining the relationship among those keywords. The condition determining the relationship among keywords includes, for example, combinations such as "either of the two", "both of the two", or "either or both of the two". In this case, a rule must be determined for the priorities of the respective keywords used when elements <section> and <segment> are selected or deleted. An example of such a rule is as follows: if the condition is "either of the two", the greatest of the priority values corresponding to the respective keywords is taken as the priority; if the condition is "both of the two", the smallest of the priority values corresponding to the respective keywords is taken as the priority. Even when the condition is "either or both of the two", the priority value can be determined by this rule. Further, when there are a plurality of search indices or keywords, a threshold may be set for the priorities of the keywords serving as search indices, and only those elements whose priority value exceeds the threshold may be processed.
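The stated priority rule for a multi-keyword index can be sketched as a small helper. The function name and the "any"/"both" condition labels are assumptions standing in for the combinations quoted above.

```python
def combined_priority(pairs, keywords, condition):
    """Combine the priorities of several keywords per the stated rule:
    the maximum of the matching priorities under 'any', the minimum
    under 'both'. Returns None when the condition cannot be met."""
    values = [pr for kw, pr in pairs if kw in keywords]
    present = {kw for kw, _ in pairs if kw in keywords}
    if condition == "both" and present != set(keywords):
        return None              # 'both' requires every keyword to appear
    if not values:
        return None
    return max(values) if condition == "any" else min(values)
```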
Tenth Embodiment
The tenth embodiment of the present invention will now be described. The tenth embodiment differs from the seventh embodiment only in the processing relating to the selection step.
The processing relating to the selection step S3301 will now be described with reference to the drawings. As in the processing described in connection with the eighth embodiment, in the selection step 3301 of the tenth embodiment selection is performed only on elements <segment>. Further, as in the ninth embodiment, a threshold is set in the present embodiment with regard to the sum of the duration intervals of all the scenes to be selected; more specifically, elements are selected such that the sum of the duration intervals of the scenes selected so far becomes maximum while still remaining below the threshold. Figure 85 is a flowchart showing the processing relating to the selection step of the tenth embodiment.
In the selection step 3301, a single keyword serving as a search index is received. The set Ω′ is initialized to empty (S2). All the elements <segment> having a keyword identical with the search index are then extracted (S1), and the elements <segment> so extracted are taken as a set Ω. The elements are then sorted in descending order of the priority of the keyword identical with the search index (S3). From the elements of the set Ω thus sorted, the elements <segment> for which the keyword serving as the search index has the greatest priority value are extracted (S5) and deleted from the set Ω. In this case, if a plurality of elements <segment> have the greatest priority value, all of those elements are selected. If the set Ω is empty, data relating to all the elements of the set Ω′ are output, and the processing ends (S4). The sum T1 of the duration intervals of the elements <segment> so extracted is computed (S6), as is the sum T2 of the duration intervals of the scenes of the set Ω′ (S7). The sum of T1 and T2 is compared with the threshold (S8). If the sum of T1 and T2 exceeds the threshold, data relating to all the segments represented by the elements <segment> included in the set Ω′ are output, and the processing ends (S11). If the sum of T1 and T2 equals the threshold, all the extracted elements <segment> are appended to the set Ω′ (S9 and S10), data relating to all the segments represented by the elements <segment> included in the set Ω′ are output, and the processing ends (S11). If, on the contrary, the sum of T1 and T2 is less than the threshold, all the extracted elements <segment> are appended to the set Ω′, and the processing then returns to selecting elements <segment> from the set Ω (S10).
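The flow of Fig. 85 can be sketched roughly as follows, with plain dictionaries standing in for the XML elements and a numeric "dur" field standing in for a segment's duration interval; all names are assumptions made for illustration.

```python
def select_segments_within(segments, keyword, threshold):
    """Take matching <segment>s in descending priority, a whole batch of
    equal-priority segments at a time; stop before any batch that would
    push the running duration total past the threshold."""
    def prio(s):
        return max(pr for kw, pr in s["pairs"] if kw == keyword)
    omega = sorted(
        (s for s in segments if any(kw == keyword for kw, _ in s["pairs"])),
        key=prio, reverse=True)
    omega_prime, t2 = [], 0
    while omega:
        top = prio(omega[0])
        batch = [s for s in omega if prio(s) == top]   # ties taken together
        omega = omega[len(batch):]
        t1 = sum(s["dur"] for s in batch)              # S6
        if t1 + t2 > threshold:
            break                                      # S8 -> S11: stop here
        omega_prime += batch                           # S9/S10: append batch
        t2 += t1
        if t2 == threshold:
            break
    return [(s["start"], s["end"]) for s in omega_prime]
```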
Although in the present embodiment selection is performed on elements <segment>, selection may also be performed on elements <section> in another hierarchical level. In the present embodiment, the elements are sorted in descending order of the priority of the keyword serving as the search index. A threshold may instead be set with regard to the priority value, and elements may be selected in descending order of priority as long as their priority value exceeds the threshold.
Further, in the present embodiment the search index is defined as a single keyword. However, the search index may be set as an index pair comprising a plurality of keywords and a condition determining the relationship among those keywords. The condition determining the relationship among keywords includes, for example, combinations such as "either of the two", "both of the two", or "either or both of the two". In this case, a rule must be determined for the priorities of the respective keywords used when elements <section> and <segment> are selected or deleted. An example of such a rule is as follows: if the condition is "either of the two", the greatest of the priority values corresponding to the respective keywords is taken as the priority; if the condition is "both of the two", the smallest of the priority values corresponding to the respective keywords is taken as the priority. Even when the condition is "either or both of the two", the priority value can be determined by this rule. Further, when there are a plurality of search indices or keywords, a threshold may be set in consideration of the priorities of the search indices or keywords, and only those elements whose priority value exceeds the threshold may be processed.
Eleventh Embodiment
The eleventh embodiment of the present invention will now be described. The description data of the present embodiment differ from those of the seventh through tenth embodiments in the manner of describing the viewpoint, which plays the role of the keyword used for selecting scenes, and the degree of importance of that viewpoint. As shown in Fig. 57, in the seventh through tenth embodiments the viewpoint and its degree of importance are described by assigning to an element <section> or <segment> a combination of a keyword and a degree of importance, that is, a (keyword, priority) pair. In contrast, as shown in Fig. 133, in the eleventh embodiment the viewpoint and its degree of importance are described by assigning an attribute "povlist" to the root <contents> and assigning an attribute "povvalue" to each element <section> or <segment>.
As shown in Fig. 134, the attribute "povlist" corresponds to viewpoints expressed in vector form. As shown in Fig. 135, the attribute "povvalue" corresponds to degrees of importance expressed in vector form. The combinations of a viewpoint and the degree of importance of that viewpoint, in one-to-one correspondence, are arranged in a given sequence, thus forming the attributes "povlist" and "povvalue". For example, as shown in Figs. 134 and 135, viewpoint 1 has a degree of importance of 5; viewpoint 2 has a degree of importance of 0; viewpoint 3 has a degree of importance of 2; and viewpoint "n" ("n" being a positive integer) has a degree of importance of 0. In terms of the seventh embodiment, a degree of importance of 0 for viewpoint 2 signifies that viewpoint 2 is not assigned as a keyword; that is, there is no combination (keyword, priority) for that viewpoint.
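An illustrative sketch of this vector notation: "povlist" fixes an ordering of viewpoints, and each element's "povvalue" lists one degree of importance per viewpoint in that same order, with an importance of 0 playing the role of an absent (keyword, priority) pair. The function and viewpoint names are assumptions.

```python
def importance_of(povlist, povvalue, viewpoint):
    """Look up an element's degree of importance for a named viewpoint,
    using the viewpoint's position in the 'povlist' ordering."""
    return povvalue[povlist.index(viewpoint)]
```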
Figures 136 to 163 and Figs. 164 to 196 show examples of a Document Type Definition (DTD) for describing the description data of the present embodiment in Extensible Markup Language (XML), which is used for expressing description data in a computer, together with examples of description data described by use of the DTD. In the present embodiment, the same processing operations as those described in connection with the seventh through tenth embodiments can also be implemented by use of these description data.
In the present embodiment, the attribute "povlist" is assigned to the root <contents>, and the attribute "povvalue" is appended to each element <section> or <segment>. As shown in Fig. 197, the attribute "povlist" may also be appended to an element <section> or <segment>. For an element <section> or <segment> that has been assigned the attribute "povlist", the attribute "povvalue" corresponds to the "povlist" assigned to that element. For an element <section> or <segment> that has not been assigned the attribute "povlist", the attribute "povvalue" corresponds to the "povlist" assigned to the root <contents>, or to the "povlist" of the nearest ancestor element <section> that has been assigned the attribute "povlist" among the ancestors of the element in question.
Figures 198 to 252 show an example of a DTD, corresponding to the description data shown in Fig. 197, for describing the description data of the present embodiment in XML, together with an example of description data described by use of the DTD. In these examples, the attribute "povvalue" assigned to an element <section> or <segment> corresponds to the attribute "povlist" assigned to the root <contents>.
Twelfth Embodiment
The twelfth embodiment of the present invention will now be described with reference to the drawings. In the present embodiment, a motion picture in the form of a system stream of MPEG-1 format is taken as the media content. In this case, a media segment corresponds to a single scene cut.
Figure 86 is a block diagram showing the media processing method of the twelfth embodiment of the present invention. In Fig. 86, reference numeral 4101 designates a selection step; 4102 designates an extraction step; 4103 designates a formation step; 4104 designates a delivery step; and 4105 designates a database. In the selection step 4101, a scene of the media content is selected on the basis of the context description data, and data relating to the start time and end time of the scene so selected, together with data representing the file in which the data of the scene are stored, are output. In the extraction step 4102, the sets of data representing the start time and end time of the scene and the data representing the file, output in the selection step 4101, are received. With reference to the structure description data, the data relating to the segments determined by the start times and end times received in the selection step 4101 are extracted from the files of the media content. In the formation step 4103, the data output in the extraction step 4102 are multiplexed, thus forming a system stream of MPEG-1 format. In the delivery step 4104, the system stream of MPEG-1 format formed in the formation step 4103 is transmitted over a line. The database designated by reference numeral 4105 stores the media content as well as the structure description data and the context description data.
The structure of the structure description data employed in the twelfth embodiment is identical with that of the fifth embodiment. More specifically, structure description data having the structure shown in Fig. 37 are used.
Figure 87 shows the structure of the description data of the twelfth embodiment. The description data of the present embodiment correspond to the context description data of the seventh embodiment with a link added to the element <mediaobject> of the structure description data. More specifically, the root <contents> of the description data has child elements <mediaobject>, and each element <mediaobject> has child elements <section>. The elements <section> and <segment> are identical with those used in the seventh embodiment. An attribute "id" is appended to the element <mediaobject> of the description data. By means of this attribute "id", the element <mediaobject> of the structure description data is associated with the element <mediaobject> of the description data. The scenes of the media content described by the descendants of an element <mediaobject> of the description data are stored in the file specified by the element <mediaobject> of the structure description data having the same value of the attribute "id". Further, the time information "start" and "end" assigned to an element <segment> is determined as the time elapsed from the head of each file. More specifically, when a single media content comprises a plurality of files, the time at the head of each file corresponds to 0, and the start time of each scene is expressed by the time elapsed from the head of the file until the scene of interest is played.
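When a single media content spans several files and each file restarts its clock at 0, a playback position over the whole content must be resolved into a file and a local offset before it can be compared against the attributes "start" and "end". A small sketch under an assumed seconds-based representation; the function and file names are assumptions, not part of the description data.

```python
def locate(files, global_time):
    """Map an elapsed time over the whole content to (file id, local time),
    where local time 0 corresponds to the head of that file.
    'files' is an ordered list of (file id, duration in seconds)."""
    t = global_time
    for fid, dur in files:
        if t < dur:
            return fid, t
        t -= dur
    raise ValueError("time beyond total content length")
```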
The structure description data and the description data can be expressed in a computer by use of, for example, Extensible Markup Language (XML). Figure 39, used in connection with the fifth embodiment, shows an example of the structure description data. Further, Figs. 88 to 96 show an example of a Document Type Definition (DTD) for describing the description data shown in Fig. 87 in XML, together with an example of description data described by use of that DTD.
The processing relating to the selection step 4101 will now be described. In the selection step 4101, any of the methods described in connection with the seventh through tenth embodiments is adopted as the method of selecting scenes. Finally, the "id" of the corresponding element <mediaobject> of the structure description data is output together with the start time and end time of the selected scene. When the structure description data are described in the form of an XML file by use of the DTD shown in Fig. 39 and the description data are described in the form of an XML file by use of the DTD shown in Figs. 88 to 96, an example of the data output in the selection step 4101 is identical with that shown in Fig. 46 of the fifth embodiment.
The processing relating to the extraction step 4102 is identical with that of the extraction step described in connection with the fifth embodiment. The processing relating to the formation step 4103 is likewise identical with that of the formation step described in connection with the fifth embodiment. Further, the processing relating to the delivery step 4104 is identical with that of the delivery step described in connection with the fifth embodiment.
Thirteenth Embodiment
The thirteenth embodiment of the present invention will now be described with reference to the drawings. In the present embodiment, a motion picture in the form of a system stream of MPEG-1 format is taken as the media content. In this case, a media segment corresponds to a single scene cut.
Figure 97 is a block diagram showing the media processing method of the thirteenth embodiment of the present invention. In Fig. 97, reference numeral 4401 designates a selection step; 4402 designates an extraction step; 4403 designates a formation step; 4404 designates a delivery step; and 4405 designates a database. In the selection step 4401, a scene of the media content is selected on the basis of the context description data, and data relating to the start time and end time of the scene so selected, together with data representing the file in which the data of the scene are stored, are output. The processing relating to the selection step 4401 is identical with that of the selection step described in connection with the twelfth embodiment. In the extraction step 4402, the sets of data representing the start time and end time of the scene and the data representing the file, output in the selection step 4401, are received. With reference to the structure description data, the data relating to the segments determined by the start times and end times received in the selection step 4401 are extracted from the files of the media content. The processing relating to the extraction step 4402 is identical with that of the extraction step described in connection with the twelfth embodiment. In the formation step 4403, part or all of the system streams output in the extraction step 4402 are multiplexed in accordance with the traffic of the line determined in the delivery step 4404, thus forming a system stream of MPEG-1 format. The processing relating to the formation step 4403 is identical with that of the formation step described in connection with the sixth embodiment. In the delivery step 4404, the traffic of the line is determined, and the result of the determination is sent to the formation step 4403. Further, the system stream of MPEG-1 format formed in the formation step 4403 is transmitted over the line. The processing relating to the delivery step 4404 is identical with that of the delivery step described in connection with the sixth embodiment. The database designated by reference numeral 4405 stores the media content as well as the structure description data and the context description data.
Although in the thirteenth embodiment a system stream of MPEG-1 format is taken as the media content, any other format from which a time code of each screen can be obtained may be used, and the same useful results as those obtained with an MPEG-1 system stream can be achieved with such a format.
The following embodiments describe overviews corresponding to the modes of the invention claimed in the appended claims. Hereinafter, "audio data" denotes data relating to sound, where sound includes audible tones, silence, speech, music, quiet, external noise, and the like. "Video data" denotes visible data, for example a motion picture, a still picture, or characters such as a telop. "Score" denotes a score computed in accordance with the content of the audio data, such as an audible tone, silence, speech, music, quiet, or external noise; a score assigned in accordance with the characters appearing in the video data; or a combination of these two kinds of scores. Scores other than the above may also be used.
Fourteenth Embodiment
The fourteenth embodiment of the present invention, which relates to the invention described in claim 2, will now be described. Figure 98 is a block diagram showing the processing relating to the data processing method of the present embodiment. In the figure, reference numeral 501 designates a selection step, and reference numeral 503 designates an extraction step. In the selection step 501, at least one segment or scene of the media content is selected on the basis of the score of the context description data, and the segment or scene so selected is output. The selected segment corresponds to, for example, the start time and end time of a selected segment. In the extraction step 503, only the data relating to the segments of the media content (hereinafter referred to as "media segments") corresponding to the segments selected in the selection step S501, that is, the data relating to the selected segments, are extracted.
In particular, in the invention described in claim 5, the score corresponds to an objective degree of contextual importance of the scene of interest from the viewpoint of a keyword relating to a character or an event selected by the user.
Fifteenth Embodiment
The fifteenth embodiment of the present invention, which relates to the invention described in claim 3, will now be described. Figure 99 is a block diagram showing the processing relating to the data processing method of the present embodiment. In the figure, reference numeral 501 designates a selection step, and reference numeral 505 designates a playback step. In the playback step 505, only the data relating to the segments corresponding to the selected segments output in the selection step S501 are played back. The processing relating to the selection step 501 is identical with that described in connection with the first through thirteenth embodiments and, for the sake of brevity, is not described again here.
Sixteenth Embodiment
The sixteenth embodiment of the present invention, which relates to the invention described in claim 12, will now be described. Figure 100 is a block diagram showing the processing relating to the data processing method of the sixteenth embodiment. In the figure, reference numeral 507 designates a video selection step, and reference numeral 509 designates an audio selection step. The video selection step 507 and the audio selection step 509 are both included in the selection step 501 described in connection with the fourteenth and fifteenth embodiments.
In the video selection step 507, segments or scenes of the video data are selected with reference to the context description data relating to the video data, and the segments so selected are output. In the audio selection step 509, segments of the audio data are selected with reference to the context description data relating to the audio data, and the segments so selected are output. Here, a selected segment corresponds to, for example, the start time and end time of the selected period. In the extraction step 503 described in connection with the fourteenth embodiment, only the data of the video data segments selected in the video selection step 507 are extracted. In the playback step 505, only the data of the audio data segments selected in the audio selection step 509 are played back.
Seventeenth Embodiment
The seventeenth embodiment of the present invention, which relates to the inventions described in claims 15, 16, 17, 18, 19, and 20, will now be described. Figure 101 is a block diagram showing the processing relating to the data processing method of the present embodiment. In the figure, reference numeral 511 designates a determination step; 513 designates a selection step; 503 designates an extraction step; and 505 designates a playback step.
(Example 1)
In the invention described in claim 15, the media content comprises a plurality of different media data sets within a single time period. In the determination step 511, the structure description data describing the data structure of the media content are received. In this step, the data to be taken as selection targets are determined in accordance with determination conditions, for example the capability of the receiving end, the traffic of the transmission line, and a user request. In the selection step 513, the data determined as selection targets in the determination step 511, the structure description data, and the context description data are received. Further, media data sets are selected only from among the data determined as selection targets in the determination step 511. Since the extraction step 503 is identical with the extraction step of the fourteenth embodiment and the playback step 505 is identical with the playback step of the fifteenth embodiment, their descriptions are omitted here. The media data include various data sets, for example video data, audio data, and text data. In the explanation of the following examples, the media data particularly include at least one of video data and audio data.
In the present example, as shown in Figure 102, different video data or audio data within a single time period of the media content are assigned to channels, and the video or audio data are further assigned to hierarchical sets called layers. For example, channel-1/layer-1, which transmits motion pictures, is assigned video data of standard resolution, and channel-1/layer-2 is assigned video data of high resolution. Channel 1, which transmits audio data, is assigned stereo audio data, and channel 2 is assigned monaural audio data. Figures 103 and 104 show an example of a Document Type Definition (DTD) used for describing the structure description data in XML, and an example of structure description data written with that DTD.
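The DTD of Figures 103 and 104 is not reproduced here. As a rough sketch only, the following Python fragment embeds a hypothetical XML structure description with the channel/layer assignments described above and parses it; every element and attribute name in it is invented for illustration and does not reproduce the actual DTD of the figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure description: element and attribute names are
# invented for illustration and do not reproduce the DTD of Figs. 103-104.
STRUCTURE_XML = """
<contents>
  <mediaobject type="video">
    <stream channel="1" layer="1" resolution="standard"/>
    <stream channel="1" layer="2" resolution="high"/>
  </mediaobject>
  <mediaobject type="audio">
    <stream channel="1" mode="stereo"/>
    <stream channel="2" mode="monaural"/>
  </mediaobject>
</contents>
"""

def list_streams(xml_text):
    """Return (media type, attribute dict) pairs for every stream element."""
    root = ET.fromstring(xml_text)
    out = []
    for obj in root.findall("mediaobject"):
        for stream in obj.findall("stream"):
            out.append((obj.get("type"), dict(stream.attrib)))
    return out
```

A determination step could walk such a parsed structure to enumerate the candidate channels and layers before applying the selection conditions described below.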
In a case where the media content is composed of such channels and layers, the processing pertaining to determination step 511 of this example will be described by reference to Figures 105 to 108. As shown in Figure 105, whether a user request exists is determined in step S101. If a user request is determined to exist, the determination processing SR-A shown in Figure 106 is performed for that user request.
If no user request is determined to exist in step S101, processing proceeds to step S103, where it is further determined whether the receivable data are video data, audio data, or both video and audio data. If the receivable data are determined in step S103 to be video data only, the determination processing SR-B relating to video data shown in Figure 107 is performed. If the receivable data are determined to be audio data only, the determination processing SR-C relating to audio data shown in Figure 108 is performed. If both video and audio data are receivable, processing proceeds to step S105. In step S105, it is determined whether the receiving end has the capability to receive video and audio data; for example, video display capability, playback capability, and the speed of decompressing compressed data. If the capability of the receiving end is determined to be high, processing proceeds to step S107. In contrast, if the capability of the receiving end is determined to be low, processing proceeds to step S109. In step S107, the traffic volume of the line over which the video and audio data are to be transmitted is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S109. If the traffic volume of the line is determined to be low, processing proceeds to step S111.
When the capability of the receiving end is low or the traffic volume of the line is high, the processing of step S109 is performed. In this processing, the receiving end receives video data of standard resolution over channel-1/layer-1 and receives audio data over channel 2. When the capability of the receiving end is high and the traffic volume is low, the processing of step S111 is performed. In this processing, the receiving end receives video data of high resolution over channel-1/layer-2 and receives stereo audio data over channel 1.
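The branch taken in steps S105 through S111 of Figure 105 can be sketched as a small decision function. This is an illustrative reading of the flow, not the patent's implementation; the return format and parameter names are invented.

```python
def determine_channels(capability_high, traffic_high):
    """Sketch of steps S105-S111 of Fig. 105: pick channel/layer assignments
    from receiving-end capability and line traffic volume."""
    if capability_high and not traffic_high:
        # S111: high capability and light traffic -> high-resolution/stereo
        return {"video": ("channel-1", "layer-2"), "audio": "channel 1"}
    # S109: low capability or heavy traffic -> standard resolution/channel 2
    return {"video": ("channel-1", "layer-1"), "audio": "channel 2"}
```

For example, a high-capability receiver on a congested line would still fall through to the step S109 assignment.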
The determination processing SR-A pertaining to a user request shown in Figure 106 will now be described. In this example, the user request is assumed to be a selection of a video layer and an audio channel. In step S151, it is determined whether the user requests video data. If the user is determined in step S151 to request video data, processing proceeds to step S153. If the user is determined not to request video data, processing proceeds to step S159. In step S153, it is determined whether the user's request for video data corresponds to selection of layer 2. If "Yes" is selected in step S153, processing proceeds to step S155, where layer 2 is selected as the video data. If "No" is selected in step S153, processing proceeds to step S157, where layer 1 is selected as the video data. In step S159, it is determined whether the user requests audio data. If the user is determined in step S159 to request audio data, processing proceeds to step S161. If the user is determined not to request audio data, processing ends. In step S161, it is determined whether the user's request for audio data corresponds to selection of channel 1. If "Yes" is selected in step S161, processing proceeds to step S162, where channel 1 is selected as the audio data. If "No" is selected in step S161, processing proceeds to step S165, where channel 2 is selected as the audio data.
The determination processing SR-B relating to video data shown in Figure 107 will now be described. In step S171, the capability of the receiving end to receive video data is determined. If the receiving end is determined to have a high capability, processing proceeds to step S173. If the receiving end is determined to have a low capability, processing proceeds to step S175. In step S173, the traffic volume of the line is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S175. In contrast, if the traffic volume of the line is determined to be low, processing proceeds to step S177.
When the capability of the receiving end is low or the traffic volume of the line is high, the processing of step S175 is performed. In this processing, the receiving end receives only video data of standard resolution over channel-1/layer-1. When the capability of the receiving end is high and the traffic volume of the line is low, the processing of step S177 is performed. In this processing, the receiving end receives only video data of high resolution over channel-1/layer-2.
The determination processing SR-C relating to audio data shown in Figure 108 will now be described. In step S181, the capability of the receiving end to receive audio data is determined. If the receiving end is determined to have a high capability, processing proceeds to step S183. If the receiving end is determined to have a low capability, processing proceeds to step S185. In step S183, the traffic volume of the line is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S185. In contrast, if the traffic volume of the line is determined to be low, processing proceeds to step S187.
When the capability of the receiving end is low or the traffic volume of the line is high, the processing of step S185 is performed. In this processing, the receiving end receives only monaural audio data over channel 2. When the capability of the receiving end is high and the traffic volume of the line is low, the processing of step S187 is performed. In this processing, the receiving end receives only stereo audio data over channel 1.
(example 2)
The invention recited in claim 16 differs from the invention described in Example 1 (the invention recited in claim 15) only in the processing pertaining to determination step 511. In determination step 511, structure description data describing the data structure of the media content are received. In this step, whether to select only video data, only audio data, or both video and audio data is determined according to determination conditions; for example, the capability of the receiving end, the traffic volume of the transmission line, and a user request. Since selection step 513, extraction step 503, and playback step 505 are all identical with those described above, their descriptions are omitted here.
The processing pertaining to determination step 511 of this example will now be described by reference to Figures 109 and 110. As shown in Figure 109, whether a user request exists is determined in step S201. If a user request is determined to exist, processing proceeds to step S203; if no user request is determined to exist in step S201, processing proceeds to step S205. In step S203, it is determined whether the user requests only video data. If "Yes" is selected in step S203, processing proceeds to step S253, and only the video data are determined as a selection candidate. If "No" is selected in step S203, processing proceeds to step S207. In step S207, it is determined whether the user requests only audio data. If "Yes" is selected in step S207, processing proceeds to step S255, and only the audio data are determined as a selection candidate. If "No" is selected in step S207, processing proceeds to step S251, and both the video and audio data are determined as selection candidates.
In step S205, to which processing proceeds when no user request exists, it is determined whether only video data, only audio data, or both video and audio data are receivable. If only video data are determined in step S205 to be receivable, processing proceeds to step S253, and only the video data are determined as a selection candidate. If only audio data are determined in step S205 to be receivable, processing proceeds to step S255, and only the audio data are determined as a selection candidate. If both video and audio data are determined in step S205 to be receivable, processing proceeds to step S209.
In step S209, the traffic volume of the line is determined. If the traffic volume of the line is low, processing proceeds to step S251, and both the video and audio data are determined as selection candidates. If the traffic volume of the line is high, processing proceeds to step S211. In step S211, it is determined whether the data to be transmitted over the line are to include audio data. If "Yes" is selected in step S211, processing proceeds to step S255, and the audio data are determined as a selection candidate. If "No" is selected in step S211, processing proceeds to step S253, and the video data are determined as a selection candidate.
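The Example 2 flow of Figure 109 can be summarized as a candidate-determination function. The sketch below is one reading of the flowchart under stated assumptions; the parameter names and the encoding of the user request are invented for illustration.

```python
def determine_candidates(user_request, receivable, traffic_high, line_carries_audio):
    """Sketch of Fig. 109 (steps S201-S255).

    user_request: None (no request), "video", "audio", or "both" (S201-S207)
    receivable:   "video", "audio", or "both" (S205)
    """
    if user_request is not None:
        # S203/S207: honor an explicit user request directly.
        return {"video", "audio"} if user_request == "both" else {user_request}
    if receivable != "both":
        # S205: only one medium is receivable.
        return {receivable}
    if not traffic_high:
        # S209 -> S251: light traffic, keep both media.
        return {"video", "audio"}
    # S211: heavy traffic, keep only the medium the line will carry.
    return {"audio"} if line_carries_audio else {"video"}
```

For instance, with no user request, both media receivable, and a congested line that carries audio, only the audio data would become a selection candidate.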
(example 3)
In the invention according to claim 17, the media content comprises a plurality of different video and/or audio data sets within a single time period. In addition to the determination, performed in determination step 511 of the second example (the invention according to claim 16), of whether to select only video data, only audio data, or both video and audio data, determination step 511 of the third example also determines, according to determination conditions such as the capability of the receiving end, the traffic volume of the transmission line, and a user request, which of these video and audio data sets are to be taken as selection candidates. Since selection step 513, extraction step 503, and playback step 505 are identical with those described above, repeated descriptions are omitted here.
As in Example 1, different video data or audio data within a single time period of the media content are assigned to channels and layers. For example, channel-1/layer-1, which transmits motion pictures, is assigned video data of standard resolution, and channel-1/layer-2 is assigned video data of high resolution. Channel 1, which transmits audio data, is assigned stereo audio data, and channel 2 is assigned monaural audio data. Figures 103 and 104 show an example of the DTD used for describing the structure description data in XML, and an example of structure description data written with that DTD.
The processing pertaining to determination step 511 of the third example will now be described by reference to Figures 111 to 113. As shown in Figure 111, in the present example the data to be taken as selection candidates are determined as in Example 2 (candidate determination processing SR-D). In step S301, the data determined by candidate determination processing SR-D are examined. When only video data have been determined as candidates in step S301, the determination processing SR-E relating to video data shown in Figure 112 is performed. When only audio data have been determined as candidates, the determination processing SR-F relating to audio data shown in Figure 113 is performed. When both video and audio data have been determined as candidates in step S301, processing proceeds to step S303, where the capability of the receiving end to receive video and audio data is determined. If the capability of the receiving end is determined to be high, processing proceeds to step S305. If the capability of the receiving end is determined to be low, processing proceeds to step S307. In step S305, the capability of the line, such as its transfer rate, is determined. If the capability of the line is determined to be high, processing proceeds to step S309. In contrast, if the capability of the line is determined to be low, processing proceeds to step S307. In step S309, the traffic volume of the line is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S307. If the traffic volume of the line is determined to be low, processing proceeds to step S311.
When the capability of the receiving end is low, the capability of the line is low, or the traffic volume of the line is high, the processing of step S307 is performed. In this processing, the receiving end receives video data of standard resolution over channel-1/layer-1 and monaural audio data over channel 2. In contrast, when the capability of the receiving end is high, the capability of the line is high, and the traffic volume of the line is low, the processing of step S311 is performed. In this processing, the receiving end receives video data of high resolution over channel-1/layer-2 and stereo audio data over channel 1.
The determination processing SR-E relating to video data shown in Figure 112 will now be described. In step S351, the capability of the receiving end to receive video data is determined. If the capability of the receiving end is determined to be high, processing proceeds to step S353. If the capability of the receiving end is determined to be low, processing proceeds to step S355. In step S353, the capability of the line is determined. If the capability of the line is determined to be high, processing proceeds to step S357. In contrast, if the capability of the line is determined to be low, processing proceeds to step S355. In step S357, the traffic volume of the line is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S355. In contrast, if the traffic volume of the line is determined to be low, processing proceeds to step S359.
When the capability of the receiving end is low, the capability of the line is low, or the traffic volume of the line is high, the processing of step S355 is performed. In this processing, the receiving end receives only video data of standard resolution over channel-1/layer-1. In contrast, when the capability of the receiving end is high, the capability of the line is high, and the traffic volume of the line is low, the processing of step S359 is performed. In this processing, the receiving end receives only video data of high resolution over channel-1/layer-2.
The determination processing SR-F relating to audio data shown in Figure 113 will now be described. In step S371, the capability of the receiving end to receive audio data is determined. If the capability of the receiving end is determined to be high, processing proceeds to step S373. If the capability of the receiving end is determined to be low, processing proceeds to step S375. In step S373, the capability of the line is determined. If the capability of the line is determined to be high, processing proceeds to step S377. In contrast, if the capability of the line is determined to be low, processing proceeds to step S375. In step S377, the traffic volume of the line is determined. If the traffic volume of the line is determined to be high, processing proceeds to step S375. In contrast, if the traffic volume of the line is determined to be low, processing proceeds to step S379.
When the capability of the receiving end is low, the capability of the line is low, or the traffic volume of the line is high, the processing of step S375 is performed. In this processing, the receiving end receives only monaural audio data over channel 2. In contrast, when the capability of the receiving end is high, the capability of the line is high, and the traffic volume of the line is low, the processing of step S379 is performed. In this processing, the receiving end receives only stereo audio data over channel 1.
(example 4)
In the inventions recited in claims 18 and 19, representative data pertaining to the corresponding media segments are appended, as attributes, to the respective elements of the context description data in the lowest hierarchical layer. The media content comprises a plurality of different media data sets within a single time period. In determination step 511, structure description data describing the data structure of the media content are received. In this step, which of the media data sets and/or representative data sets are to be taken as selection candidates is determined according to determination conditions; for example, the capability of the receiving end, the traffic volume of the transmission line, the capability of the line, and a user request.
Since selection step 513, extraction step 503, and playback step 505 are identical with those described above, repeated descriptions are omitted here. The media data comprise video data, audio data, or text data. In the present example, the media data include at least one of video data and audio data. In a case where the representative data correspond to video data, the representative data comprise, for example, representative image data or low-resolution video data pertaining to each media segment. In a case where the representative data correspond to audio data, the representative data comprise, for example, key-phrase data pertaining to each media segment.
As in Example 3, different video data or audio data within a single time period of the media content are assigned to channels and layers. For example, channel-1/layer-1, which transmits motion pictures, is assigned video data of standard resolution, and channel-1/layer-2 is assigned video data of high resolution. Channel 1, which transmits audio data, is assigned stereo audio data, and channel 2 is assigned monaural audio data.
The processing pertaining to determination step 511 of this example will now be described by reference to Figures 114 to 118. As shown in Figure 114, whether a user request exists is determined in step S401. If a user request is determined to exist in step S401, the determination processing SR-G pertaining to the user request shown in Figure 116 is performed.
If no user request is determined to exist in step S401, processing proceeds to step S403, where it is determined whether only video data, only audio data, or both video and audio data are receivable. If only video data are determined in step S403 to be receivable, the determination processing SR-H relating to video data shown in Figure 117 is performed. In contrast, if only audio data are determined to be receivable, the determination processing SR-I relating to audio data shown in Figure 118 is performed. If both video and audio data are determined to be receivable, processing proceeds to step S405 shown in Figure 115.
In step S405, the capability of the receiving end is determined. After the processing of step S405 has been performed, the processing of step S407, which determines the capability of the line, and the processing of step S409, which determines the traffic volume of the line, are performed in this sequence. On the basis of the results of the processing operations performed in steps S405, S407, and S409, determination step 511 of this example determines the channel or layer of the video data and audio data to be received, or whether representative data are to be received, as shown in Table 1.
Table 1
Receiving-end capability | Line capability | Line traffic high? | Received data
High | High | No  | Video: channel 1, layer 2; audio: channel 1 (S411)
High | High | Yes | Video: channel 1, layer 1; audio: channel 1 (S413)
High | Low  | No  | Video: channel 1, layer 1; audio: channel 2 (S413)
High | Low  | Yes | Video: channel 1, layer 1; audio: channel 2 (S415)
Low  | High | No  | Video: channel 1, layer 1; audio: channel 2 (S415)
Low  | High | Yes | Video: representative data; audio: channel 2 (S417)
Low  | Low  | No  | Video: representative data; audio: channel 2 (S417)
Low  | Low  | Yes | Video: representative data; audio: representative data (S419)
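Since Table 1 is a pure lookup on three binary conditions, it can be encoded directly. The sketch below mirrors the table rows as written; the string format of the results is invented for illustration.

```python
# Table 1 keyed on (receiving-end capability high?, line capability high?,
# line traffic high?).  Step numbers in comments mirror the table.
TABLE_1 = {
    (True,  True,  False): ("video: channel 1, layer 2", "audio: channel 1"),   # S411
    (True,  True,  True):  ("video: channel 1, layer 1", "audio: channel 1"),   # S413
    (True,  False, False): ("video: channel 1, layer 1", "audio: channel 2"),   # S413
    (True,  False, True):  ("video: channel 1, layer 1", "audio: channel 2"),   # S415
    (False, True,  False): ("video: channel 1, layer 1", "audio: channel 2"),   # S415
    (False, True,  True):  ("video: representative data", "audio: channel 2"),  # S417
    (False, False, False): ("video: representative data", "audio: channel 2"),  # S417
    (False, False, True):  ("video: representative data",
                            "audio: representative data"),                      # S419
}

def received_data(cap_high, line_high, traffic_high):
    """Sketch of steps S405-S419: resolve the data to receive from Table 1."""
    return TABLE_1[(cap_high, line_high, traffic_high)]
```

The table degrades gracefully: as capability falls or traffic rises, video drops from layer 2 to layer 1 to representative data before audio does.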
The determination processing SR-G pertaining to the user request shown in Figure 116 will now be described. In step S451, it is determined whether the user requests only video data. If "Yes" is selected in step S451, the determination processing SR-H relating to video data is performed. If "No" is selected in step S451, processing proceeds to step S453. In step S453, it is determined whether the user requests only audio data. If "Yes" is selected in step S453, the determination processing SR-I relating to audio data is performed. If "No" is selected in step S453, processing proceeds to step S405.
The determination processing SR-H relating to video data shown in Figure 117 will now be described. In step S461, the capability of the receiving end is determined. After the processing of step S461 has been performed, the processing of step S463, which determines the capability of the line, and the processing of step S465, which determines the traffic volume of the line, are performed in this sequence. After the processing operations pertaining to steps S461, S463, and S465 have ended, provided that the capability of the receiving end is high, the capability of the line is high, and the traffic volume of the line is low, only video data of high resolution are received over channel-1/layer-2 during the determination processing SR-H relating to the video data of this example (step S471). In contrast, if the capability of the receiving end is low, the capability of the line is low, and the traffic volume of the line is high, only representative video data are received (step S473). In the remaining cases, only video data of standard resolution are received over channel-1/layer-1 (step S475).
The determination processing SR-I relating to audio data shown in Figure 118 will now be described. In step S471, the capability of the receiving end is determined. After the processing of step S471 has been performed, the processing of step S473, which determines the capability of the line, and the processing of step S475, which determines the traffic volume of the line, are performed in this sequence. After the processing operations pertaining to steps S471, S473, and S475 have ended, provided that the capability of the receiving end is high, the capability of the line is high, and the traffic volume of the line is low, only stereo audio data are received over channel 1 during the determination processing SR-I relating to the audio data of this example (step S491). In contrast, if the capability of the receiving end is low, the capability of the line is low, and the traffic volume of the line is high, only representative audio data are received (step S493). In the remaining cases, only monaural audio data are received over channel 2 (step S495).
(example 5)
In the invention recited in claim 20, which of the entirety of the data pertaining to a corresponding media segment, only the representative data pertaining to that media segment, or either the entirety of the data or the representative data is to be taken as a selection candidate is determined according to determination conditions; for example, the capability of the receiving end, the capability of the transmission line, the traffic volume of the line, and a user request.
As in Example 4, representative data pertaining to the corresponding media segments are appended, as attributes, to the respective elements of the context description data in the lowest hierarchical layer. In a case where the representative data correspond to video data, the representative data comprise, for example, representative image data or low-resolution video data pertaining to each media segment. In a case where the representative data correspond to audio data, the representative data comprise, for example, key-phrase data pertaining to each media segment.
The processing pertaining to determination step 511 of this example will now be described by reference to Figures 119 to 121. As shown in Figure 119, whether a user request exists is determined in step S501. If a user request is determined to exist in step S501, the determination processing SR-J pertaining to the user request shown in Figure 121 is performed.
If no user request is determined to exist in step S501, processing proceeds to step S503, where it is determined whether only the representative data pertaining to the media segments, only the entirety of the data pertaining to the media segments, or both the representative data and the entirety of the data are receivable. If only the representative data are determined in step S503 to be receivable, processing proceeds to step S553 shown in Figure 120, and only the representative data are determined as a selection candidate. If only the entirety of the data is receivable, processing proceeds to step S555, and only the entirety of the data is determined as a selection candidate. If both the representative data and the entirety of the data are receivable, processing proceeds to step S505.
In step S505, the capability of the line is determined. If the capability of the line is determined to be high, processing proceeds to step S507. In contrast, if the capability of the line is low, processing proceeds to step S509. In each of steps S507 and S509, the traffic volume of the line is determined. If the traffic volume of the line is determined in step S507 to be low, processing proceeds to step S551, and both the entirety of the data and the representative data are determined as selection candidates. If the traffic volume of the line is determined in step S509 to be high, processing proceeds to step S553, and the representative data are taken as a selection candidate. If the traffic volume of the line is determined in step S507 to be high, or is determined in step S509 to be low, processing proceeds to step S555, and the entirety of the data is taken as a selection candidate.
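The choice between full segment data and representative data in Figure 119 reduces to a few condition checks. The following is a sketch under stated assumptions; the string labels and parameter names are invented, and the mapping of the high-traffic/strong-line and low-traffic/weak-line cases to step S555 follows the reading given above.

```python
def candidate_for_segment(receivable, line_high, traffic_high):
    """Sketch of Fig. 119 (steps S503-S555): choose between a media segment's
    full data and its representative data.

    receivable: "representative", "full", or "both" (S503)
    """
    if receivable != "both":
        # S503: only one form is receivable, so it is the only candidate.
        return {receivable}
    if line_high and not traffic_high:
        # S507 -> S551: strong line, light traffic -> keep both forms.
        return {"full", "representative"}
    if not line_high and traffic_high:
        # S509 -> S553: weak line, heavy traffic -> representative data only.
        return {"representative"}
    # S555: intermediate cases -> the full data only.
    return {"full"}
```

A usage example: on a weak line with heavy traffic, only the key-phrase or low-resolution representative data would be selected.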
In the determination processing SR-J pertaining to the user request, whether the user request corresponds to only the representative data is determined in step S601. If "Yes" is selected in step S601, processing proceeds to step S553, and only the representative data are taken as a selection candidate. If "No" is selected in step S601, processing proceeds to step S603, where it is determined whether the user request corresponds to only the entirety of the data. If "Yes" is selected in step S603, processing proceeds to step S555, and only the entirety of the data is taken as a selection candidate. If "No" is selected in step S603, processing proceeds to step S551, and both the entirety of the data and the representative data corresponding to the media segments are taken as selection candidates.
Eighteenth Embodiment
The eighteenth embodiment of the present invention will now be described. The present embodiment relates to the invention recited in claim 22. Figure 122 is a block diagram showing processing relating to the data processing method of the present embodiment. In the drawing, reference numeral 501 designates a selection step; 503 designates an extraction step; and 515 designates a formation step. Since selection step 501 and extraction step 503 are identical with those of the fourteenth embodiment, repeated descriptions are omitted here.
In formation step 515, a data stream of the media content is formed from the data, extracted in extraction step 503, that pertain to the selected segments. Specifically, in the formation step, the data stream is formed by multiplexing the data output in extraction step 503.
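As an illustration of this multiplexing, the sketch below interleaves extracted video and audio segment data into one time-ordered stream. The segment model (start time plus payload) and the simple merge are invented for illustration and do not represent the patent's actual stream format.

```python
def form_stream(video_segments, audio_segments):
    """Sketch of formation step 515: multiplex extracted video and audio
    segment data into a single stream.  Segments are modeled as
    (start_time, payload) tuples; this time-ordered merge is illustrative."""
    tagged = [("v", s) for s in video_segments] + [("a", s) for s in audio_segments]
    # Order the multiplexed units by their segment start time.
    tagged.sort(key=lambda item: item[1][0])
    return tagged
```

A downstream transfer or recording step could then consume the merged stream unit by unit.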
Nineteenth Embodiment
The nineteenth embodiment of the present invention will now be described. The present embodiment relates to the invention recited in claim 23. Figure 123 is a block diagram showing processing relating to the data processing method of the present embodiment. In the drawing, reference numeral 501 designates a selection step; 503 designates an extraction step; 515 designates a formation step; and 517 designates a transfer step. Since selection step 501 and extraction step 503 are identical with those described in the fourteenth embodiment, repeated descriptions are omitted here. Further, since formation step 515 is identical with the formation step described in the eighteenth embodiment, its description is also omitted.
In transfer step 517, the data stream formed in the formation step is transmitted over a line. Transfer step 517 may include a step of determining the traffic volume of the line, and transfer step 517 and formation step 515 may include a step of adjusting, according to the traffic volume of the line thus determined, the amount of data to be formed into the data stream.
Twentieth Embodiment
The twentieth embodiment of the present invention will now be described. The present embodiment relates to the invention recited in claim 24. Figure 124 is a block diagram showing processing relating to the data processing method of the present embodiment. In the drawing, reference numeral 501 designates a selection step; 503 designates an extraction step; 515 designates a formation step; 519 designates a recording step; and 521 designates a data recording medium. In recording step 519, the data stream formed in formation step 515 is recorded on data recording medium 521. Data recording medium 521 records the media content, the context description data pertaining to the media content, and the structure description data pertaining to the media content. Data recording medium 521 may be, for example, a hard disk, memory, or a DVD-ROM. Since selection step 501 and extraction step 503 are identical with those described in the fourteenth embodiment, repeated descriptions are omitted here. Further, since formation step 515 is identical with the formation step described in the eighteenth embodiment, its description is also omitted.
Twenty-First Embodiment
The twenty-first embodiment of the present invention will now be described. The present embodiment relates to the invention recited in claim 25. Figure 125 is a block diagram showing processing relating to the data processing method of the present embodiment. In the drawing, reference numeral 501 designates a selection step; 503 designates an extraction step; 515 designates a formation step; 519 designates a recording step; 521 designates a data recording medium; and 523 designates a data recording medium management step. In data recording medium management step 523, the media content already stored on data recording medium 521 and the media content to be newly stored are reorganized according to the available disk space of data recording medium 521. Specifically, at least one of the following operations is performed in data recording medium management step 523. When the available disk space of data recording medium 521 is small, the media content to be newly stored is edited before being stored. The context description data and structure description data pertaining to all of the stored media content are transmitted to selection step 501. The media content and the structure description data are transferred to extraction step 503. The media content are reorganized, and the content thus reorganized is recorded on data recording medium 521. Further, the media content that have not been reorganized are deleted.
Since selection step 501 and extraction step 503 are identical with those of the fourteenth embodiment, repeated descriptions are omitted here. Further, since formation step 515 is identical with the formation step described in the eighteenth embodiment, its description is omitted here. Moreover, since recording step 519 and data recording medium 521 are identical with those described in the twentieth embodiment, their descriptions are also omitted here.
Twenty-Second Embodiment
The twenty-second embodiment of the present invention will now be described. The present embodiment relates to the invention recited in claim 26. Figure 126 is a block diagram showing processing relating to the data processing method of the present embodiment. In the drawing, reference numeral 501 designates a selection step; 503 designates an extraction step; 515 designates a formation step; 519 designates a recording step; 521 designates a data recording medium; and 525 designates a stored-content management step. In stored-content management step 525, the media content stored on data recording medium 521 are reorganized according to the storage period of the media content. Specifically, stored-content management step 525 comprises the steps of: managing the media content stored on data recording medium 521; transmitting to selection step 501 the context description data and the structure description data, both pertaining to the media content that have been stored for a predetermined period; transferring the media content and the structure description data to extraction step 503; reorganizing the media content; recording the media content thus reorganized on data recording medium 521; and deleting the media content that have not been reorganized.
Since selection step 501 and extraction step 503 are identical with those described in the fourteenth embodiment, repeated descriptions are omitted here. Further, since formation step 515 is identical with the formation step described in the eighteenth embodiment, its description is omitted here. Moreover, since recording step 519 and data recording medium 521 are identical with those of the twentieth embodiment, their descriptions are also omitted here.
In the fourteenth through twenty-second embodiments described above, selection steps 501 and 513 may be embodied as a selection device; video selection step 507 may be embodied as a video selection device; audio selection step 509 may be embodied as an audio selection device; determination step 511 may be embodied as a determination device; formation step 515 may be embodied as a formation device; transfer step 517 may be embodied as a transfer device; recording step 519 may be embodied as a recording device; data recording medium management step 523 may be embodied as a data recording medium management device; and stored-content management step 525 may be embodied as a stored-content management device. Accordingly, a data processing apparatus comprising some or all of these devices can be embodied.
In each of the foregoing embodiments, the media content may comprise a data stream other than video and audio data, for example text data. Further, each step of the foregoing embodiments may be implemented in software, by means of a program stored on a program storage medium that causes a computer to execute the processing relating to all or some of the steps, or may be implemented by a specially designed hardware circuit embodying the features of those steps.
In the representation of context description data processed by software on a computer, when a viewpoint appended to a <section> or <segment> overlaps a viewpoint of another <section> or <segment>, the viewpoint may, as shown in Figure 253, be appended to only one of the <section> or <segment> elements, and the other <section> or <segment> may be expressed as being linked to the thus-appended viewpoint.
Further, as shown in Figure 254, a viewpoint table formed by gathering all the viewpoints represented in the context description data may be arranged as a child element of <contents>, the root of the data structure of the context description data, such that each <section> or <segment> is appended, by means of a set of links, to the corresponding viewpoint in the viewpoint table together with the score pertaining to the thus-linked viewpoint. With this configuration, a list of the registered viewpoints (hereinafter referred to as a viewpoint list) can easily be presented to the user in advance; hence, the user can know the registered viewpoints before requesting a preferred viewpoint. Thus, when a viewpoint desired by the user exists in the viewpoint list, the user can request that viewpoint from the viewpoint list through the selection process. In this respect, the viewpoint table may be configured not only as a child element of <contents> but also as a child element of a <section> or <segment>, or it may be described separately.
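Purely as an illustrative sketch (not part of the described data format), gathering every viewpoint appearing in the context description data into such a viewpoint table might look as follows; the dictionary layout and function name are assumptions made for illustration only:

```python
def build_viewpoint_table(description):
    """Gather every viewpoint appearing in the description data into one
    table, preserving first-appearance order, so that a viewpoint list
    can be shown to the user before a preferred viewpoint is requested."""
    table = []
    for segment in description["segments"]:
        for viewpoint in segment["viewpoints"]:
            if viewpoint not in table:
                table.append(viewpoint)
    return table

# A toy description: two segments, each annotated with viewpoints.
description = {"segments": [
    {"viewpoints": ["TeamA", "Goal"]},
    {"viewpoints": ["Goal", "TeamB"]},
]}
build_viewpoint_table(description)  # ["TeamA", "Goal", "TeamB"]
```

A mixed-type table as in Figure 255 would simply omit the viewpoints that are not designated by links.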
Further, as shown in Figure 255, the viewpoint table may be arranged as a viewpoint table of mixed type, in which not all viewpoints are expressed as links to the viewpoint table, but only some of the viewpoints are linked to it. In this case, it is not necessary to describe in the viewpoint table all the viewpoints represented in the context description data; only the viewpoints designated by links are registered in the viewpoint table.
Further, when the viewpoint table representing the list of viewpoints present in the context description data is described separately, as shown in Figure 256, a viewpoint list can be presented to the user on the basis of the viewpoint table before the user requests a preferred viewpoint. In this case, the user can know the viewpoints present in the context description data before requesting a preferred viewpoint, and can request a viewpoint from the viewpoint list through the selection process.
Further, as shown in Figure 257, the context description data may be represented in a configuration divided into a data structure part describing the data structure and an attribute part describing the viewpoints as attributes, the attribute part being linked to the data structure part and to the scores. In Figure 257, the upper portion (a) represents the data structure part, and the lower portion (b) represents the attribute part. Although the data structure part is depicted in simplified form in the drawing, it may be arranged in the same configuration as in the foregoing embodiments. Moreover, the attribute part is arranged such that each viewpoint is linked to its object <section> or <segment> elements, and the scores pertaining to the links of one viewpoint are gathered into one set.
The data structure part and the attribute part need not be described in the same file; they may be described in separate files. Further, for context description data arranged so as to be divided into a data structure part and an attribute part, the selection step (selection device) selects a <segment> or <section> on the basis of the scores pertaining to each viewpoint in the attribute part. In addition, as shown in Figure 258, each viewpoint of the attribute part and the <section> or <segment> elements of the data structure part may be coupled by bidirectional links. In this case, selection processing with a viewpoint designated by the selection step (selection device) can be performed by the methods described in the foregoing embodiments.
Further, as shown in Figure 259, the context description data may be represented in a configuration divided into a data structure part describing the data structure and an attribute part in which, for each viewpoint, links to the data structure part are described in descending order of score. With such a representation, however, fine comparison may be difficult when scores are compared across a plurality of viewpoints; a coarse ordering of "high," "medium," and "low" is therefore used.
Further, as shown in Figure 260, the context description data may be represented in a configuration divided into a data structure part describing the data structure and an attribute part in which, for each viewpoint, links to the data structure part are described in descending order of score, with links sharing the same score arranged in the same row. In this case, selection processing with a viewpoint designated by the selection step (selection device) can be performed in the same manner as the processing performed on the context description data shown in Figure 258.
Next, a context-data conversion method will be described, which converts context description data of tree structure into context description data differing in data structure from the tree-structured data (hereinafter referred to as second context description data). Here, the tree-structured context description data is arranged in the manner shown in Figure 57: <contents> is taken as the root, <section> elements as nodes, and <segment> elements as leaves; at least one pair (keyword, priority) is appended to each <segment> and <section> as an attribute, where "keyword" represents the content, a person's name, or the like, and "priority" represents the degree of significance; and "start" representing a start time, together with either "end" representing an end time or "duration" representing a duration, is appended to each <segment> as time information of the scene.
In this specification, three types of context-data conversion methods will be described. For each, the data structure of the second context description data prepared by the conversion method is explained first, followed by an example of the corresponding conversion method.
(First Embodiment of the Context-Data Conversion Method)
First, the second context description data prepared by the first embodiment of the context-data conversion method is configured with the tree structure shown in Figures 261 and 262, in which <contents> is taken as the root, <keyword> elements as child elements of <contents>, <level> elements as child elements of each <keyword>, and <segment> elements as child elements of each <level>. Here, the <section> elements (nodes) present in the original context description data shown in Figure 57 are not described in the second context description data. In the second context description data, sibling elements of the tree structure are likewise arranged in chronological order from the left. Further, the originally-appended time information (start, end) is appended to each <segment>.
The <level> of the second context description data is determined from the "priority" employed in the context description data shown in Figure 57, and represents the degree of significance. When "priority" is represented by an integer, the integer assigned to "priority" is used as-is for <level>. On the contrary, when "priority" is represented by a decimal, <level> is re-assigned according to the rank of the values assigned to "priority," so that degrees of significance can be compared easily. For example, when three priorities 0.2, 0.5, and 1.0 are present in the original context description data, <level>1 of the lowest significance is assigned to priority 0.2, <level>2 of medium significance is assigned to priority 0.5, and <level>3 of the highest significance is assigned to priority 1.0.
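The priority-to-level conversion described above can be sketched as follows (an illustrative Python fragment; the function name is hypothetical and no such code appears in the described method):

```python
def priorities_to_levels(priorities):
    """Map each distinct 'priority' value to an integer <level>.

    Integer priorities are used as-is; decimal priorities are
    re-ranked so that the smallest value becomes level 1.
    """
    if all(float(p).is_integer() for p in priorities):
        return {p: int(p) for p in priorities}
    ranked = sorted(set(priorities))          # ascending significance
    return {p: rank + 1 for rank, p in enumerate(ranked)}

# The example from the text: decimal priorities 0.2, 0.5 and 1.0
levels = priorities_to_levels([0.2, 0.5, 1.0])
# -> {0.2: 1, 0.5: 2, 1.0: 3}
```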
With the levels of significance thus set, the data structure of the second context description data may be arranged in the nested form shown in Figure 261 or in the parallel form shown in Figure 262. In the nested form, the high/low relationships among <level> elements are expressed as-is by inclusion relationships; in the parallel form, the high/low relationships among <level> elements are expressed within the same layer. Further, the data structure of the second context description data may be arranged in the manner shown in Figure 263, in which <keyword> information other than <level> is arranged as child elements of <segment>, and the degree of significance can be determined according to the order of matching of <keyword>. In this respect, since fine comparison may be impossible when a plurality of keywords are compared, the comparison result may be roughly expressed by, for example, "high," "medium," and "low."
When a plurality of continuously-connected <segment> elements belong to the same <keyword> and the same <level>, these <segment> elements may be gathered into one set. For example, when continuously-connected <segment>1 and <segment>2 exist, they may be gathered into <segment>A; in this case, the time information (start, end) to be appended to the thus-gathered <segment>A must be prepared from the time information (start, end) appended to each original <segment>.
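A minimal sketch of gathering continuously-connected segments, assuming for illustration that each segment is represented simply by its (start, end) time information:

```python
def merge_contiguous(segments):
    """Gather continuously-connected segments into one.

    Each segment is a (start, end) pair; two segments are merged
    when the first ends exactly where the next begins.
    """
    merged = [segments[0]]
    for start, end in segments[1:]:
        last_start, last_end = merged[-1]
        if start == last_end:                 # no time gap
            merged[-1] = (last_start, end)    # combined time information
        else:
            merged.append((start, end))
    return merged

# <segment>1 (0, 10) and <segment>2 (10, 25) become <segment>A (0, 25)
merge_contiguous([(0, 10), (10, 25)])  # [(0, 25)]
```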
The first embodiment of the context-data conversion method will now be explained by way of an example of original context description data. As shown in Figure 264, the original context description data used as the example described below comprises, as leaves to which attributes are appended: <segment>1 with (keyword a, priority 2) and (keyword b, priority 1); <segment>2 with (keyword a, priority 2) and (keyword c, priority 2); and <segment>3 with (keyword b, priority 4) and (keyword d, priority 3).
First, the set of "keywords" appended to the <segment> elements of the original context description data is obtained. From the example of original context description data (hereinafter simply called the example), {keyword a, keyword b, keyword c, keyword d} is obtained. Next, for each keyword, the set of <segment> elements to which that keyword is appended is obtained. In this example, for keyword a, a set formed from <segment>1/(priority 2) and <segment>2/(priority 2) is obtained; for keyword b, a set formed from <segment>1/(priority 1) and <segment>3/(priority 4); for keyword c, a set formed from <segment>2/(priority 2); and for keyword d, a set formed from <segment>3/(priority 3).
Next, each set of segments is divided into groups by priority. For example, in the set of segments of keyword a, since priority 2 is appended to each segment, the two segments are gathered into a group of priority 2. For the set of segments of keyword b, since priority 1 is appended to <segment>1 and priority 4 is appended to <segment>3, the two segments are divided into a group of priority 1 (only <segment>1) and a group of priority 4 (only <segment>3). The segments of keyword c and keyword d are divided into groups in the same manner.
Next, each priority is converted into a "level" representing the degree of significance. As mentioned above, when "priority" is represented by an integer, the integer assigned to "priority" is used as-is as the "level." Accordingly, in the above example, a <segment> to which priority N (N = 1, 2, 3, 4, 5) is appended becomes a segment of level N.
Further, when the nested form is applied to the second context description data, the grouped segments are arranged into inclusion relationships according to their levels. On the contrary, when the parallel form is applied, the grouped segments are arranged into sibling relationships, ordered from higher level to lower level.
The data structure of the thus-prepared second context description data is shown in Figure 265. In this figure, since <segment>1 and <segment>2 both exist as segments of <keyword>a and <level>2, these segments are, for example, gathered into <segment>A when they are connected continuously without a time gap.
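The grouping steps of this first conversion method can be sketched as follows, using the example of Figure 264. The tuple-and-dictionary representation is an illustrative assumption, not the actual description-data format:

```python
def convert_first(segments):
    """First conversion method: original (segment, [(keyword, priority)])
    leaves are regrouped as keyword -> level -> [segments]."""
    result = {}
    for seg, pairs in segments:
        for keyword, priority in pairs:
            # Integer priorities serve directly as levels here.
            result.setdefault(keyword, {}).setdefault(priority, []).append(seg)
    return result

# The example of Figure 264 (segment names are illustrative)
example = [
    ("segment1", [("a", 2), ("b", 1)]),
    ("segment2", [("a", 2), ("c", 2)]),
    ("segment3", [("b", 4), ("d", 3)]),
]
second = convert_first(example)
# second["a"] -> {2: ["segment1", "segment2"]}
# second["b"] -> {1: ["segment1"], 4: ["segment3"]}
```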
(Second Embodiment of the Context-Data Conversion Method)
The second context description data prepared by the second embodiment of the context-data conversion method is configured with the tree structure shown in Figure 266, in which <contents> is taken as the root, <keyword> elements as child elements of <contents>, and <segment> elements as child elements of each <keyword>. Here, "priority" is appended to each <segment> as an attribute.
In this second context description data, as in the second context description data prepared by the first embodiment of the context-data conversion method, <section> elements are not described, and time information (start, end) as well as "priority" is appended to each <segment>. Since the second embodiment of the context-data conversion method does not convert priorities into grades, the <level> described in the second context description data of the first embodiment is not described in this embodiment.
The second embodiment of the context-data conversion method will now be explained by way of the example of original context description data. First, as in the first embodiment, the set of keywords appended to the <segment> elements of the original context description data is obtained. Next, for each keyword, the set of <segment> elements to which that keyword is appended is obtained. Then, the originally-appended priority is appended to each <segment>.
The data structure of the thus-prepared second context description data is shown in Figure 267. In this figure, since <segment>1 and <segment>2 both exist as segments of <keyword>a, these segments are, for example, gathered into <segment>A when the priorities appended to them are identical and they are connected continuously without a time gap.
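Under the same illustrative tuple representation as before (an assumption, not the described format), the second conversion method reduces to grouping by keyword while keeping the original priorities:

```python
def convert_second(segments):
    """Second conversion method: keyword -> [(segment, priority)],
    keeping each segment's original 'priority' (no levels)."""
    result = {}
    for seg, pairs in segments:
        for keyword, priority in pairs:
            result.setdefault(keyword, []).append((seg, priority))
    return result

# The example of Figure 264 again (segment names are illustrative)
example = [
    ("segment1", [("a", 2), ("b", 1)]),
    ("segment2", [("a", 2), ("c", 2)]),
    ("segment3", [("b", 4), ("d", 3)]),
]
convert_second(example)["b"]  # [("segment1", 1), ("segment3", 4)]
```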
(Third Embodiment of the Context-Data Conversion Method)
The second context description data prepared by the third embodiment of the context-data conversion method is configured with the tree structure shown in Figures 268 and 269, in which <contents> is taken as the root, <level> elements as child elements of <contents>, and <segment> elements as child elements of each <level>. Here, "keyword" is appended to each <segment> as an attribute.
In this second context description data, as in the second context description data prepared by the first embodiment of the context-data conversion method, <section> elements are not described, and time information (start, end) as well as "keyword" is appended to each <segment>.
The third embodiment of the context-data conversion method will now be explained by way of the example of original context description data used above. First, according to the "priority" appended to the <segment> elements of the original context description data, the set of <segment> elements to which the same priority is appended is obtained for each priority. In the example of original context description data shown in Figure 264, for priority 1, a set formed from <segment>1, (keyword b) is obtained; for priority 2, a set formed from <segment>1, (keyword a) and <segment>2, (keyword a, keyword c); for priority 3, a set formed from <segment>3, (keyword d); and for priority 4, a set formed from <segment>3, (keyword b).
Next, each priority is converted into a "level" representing the degree of significance. As mentioned above, when "priority" is represented by an integer, the integer assigned to "priority" is used as-is as the "level." Accordingly, in the present example, a <segment> to which priority N (N = 1, 2, 3, 4, 5) is appended becomes a segment of level N.
Then, " key word " that appended to respective priority originally appended to each section.For example, the set of the section of level 1 contains<section 1〉and<section 1〉added key word b originally, make this section add key word b.The set of the section of level 2 contains<section 1〉and<section 2 〉.Especially, since added level 2<section 2 comprise added originally key word a<section 2 and added originally key word c<section 2, therefore, newly prepared another<sections 2, two<section 2〉one of added key word a, another has then added key word c.For level 2 each<section 1, level 3 each<section 3 and level 4 each<sections 3 carry out similar processing.
The data structure of the thus-prepared second context description data is shown in Figure 270. In a case where several different keywords are appended to one <segment>, the data structure may instead be arranged in such a manner that the plurality of different keywords are appended to that <segment>, as shown in Figure 269. In the second context description data shown in Figure 270, therefore, keyword a and keyword c may both be appended to <segment>2 of level 2. Further, although the segments of level 2 comprise <segment>1 appended with keyword a and <segment>2 appended with keyword a, these segments may, for example, be gathered into <segment>A when they are connected continuously without a time gap.
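Under the same illustrative representation, the third conversion method groups by priority (level) and describes a segment once per keyword, as explained above; again, this is only a sketch of the regrouping, not the described data format:

```python
def convert_third(segments):
    """Third conversion method: level -> [(segment, keyword)].
    A segment carrying two keywords at the same priority is
    described twice, once per keyword."""
    result = {}
    for seg, pairs in segments:
        for keyword, priority in pairs:
            result.setdefault(priority, []).append((seg, keyword))
    return result

# The example of Figure 264 once more (segment names are illustrative)
example = [
    ("segment1", [("a", 2), ("b", 1)]),
    ("segment2", [("a", 2), ("c", 2)]),
    ("segment3", [("b", 4), ("d", 3)]),
]
convert_third(example)[2]
# [("segment1", "a"), ("segment2", "a"), ("segment2", "c")]
```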
The second context description data prepared by the first through third embodiments of the context-data conversion method described above is used in the processing from S5 to S9 of the flowchart of the selection step explained in connection with the seventh embodiment. Although a selection step using the original context description data shown in Figure 57 has the flexibility of handling arbitrary requests, the strength of the second context description data is that the corresponding <segment> elements can be obtained quickly in response to a request from the user.
Although in the foregoing embodiments a <section> or <segment> whose value is equal to or greater than a threshold of the degree of significance is selected by using that threshold, a <section> or <segment> having one specific significance value may also be selected.
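The two selection policies can be sketched as follows (illustrative Python; the list-of-pairs data layout is an assumption):

```python
def select_by_threshold(segments, threshold):
    """Select segments whose score is equal to or greater than the threshold."""
    return [seg for seg, score in segments if score >= threshold]

def select_by_value(segments, value):
    """Select only segments having one specific significance value."""
    return [seg for seg, score in segments if score == value]

scored = [("seg1", 2), ("seg2", 4), ("seg3", 3)]
select_by_threshold(scored, 3)  # ["seg2", "seg3"]
select_by_value(scored, 3)      # ["seg3"]
```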
Although the context description data and the structure description data are described separately in the foregoing embodiments, they may be merged into one data set, as shown in Figures 127 to 132.
As described above, in the data processing method, recording medium, and program of the present invention, at least one segment is selected from the media content by the selection device (corresponding to the selection step), using context description data of hierarchical layers, in accordance with the scores appended to the context description data. In particular, only the data pertaining to the segments selected by the selection device (corresponding to the selection step) are extracted by the extraction device (corresponding to the extraction step). Alternatively, only the data pertaining to the segments selected by the selection device (corresponding to the selection step) are played back by the playback device (corresponding to the playback step).
With the above structure, more significant scenes can be freely selected from the media content, and the thus-selected significant segments can be extracted or played back. Further, since the context description data takes hierarchical layers comprising the highest hierarchical layer, the lowest hierarchical layer, and other hierarchical layers, scenes can be selected on the basis of an arbitrary unit, for example a chapter or a section. Various selection forms can be adopted, for example selecting a certain chapter and deleting unnecessary segments from that chapter.
In the data processing method, recording medium, and program of the present invention, the score represents the contextual significance of the media content. So long as the score is set so as to select significant scenes, a collection of, for example, the significant scenes of a program can be prepared easily. Further, so long as the score is set so as to represent significance from the viewpoint of a determined keyword, segments of interest from that keyword's viewpoint can be selected with a great degree of freedom. For example, so long as a keyword is determined, only the scenes desired by the user can be selected from a particular aspect, such as a character or an event.
In the data processing method, recording medium, and program of the present invention, in a case where the media content comprises a plurality of different media data sets within a single time period, the determination device (corresponding to the determination step) determines, in accordance with determination conditions, which of the media data sets are to be taken as objects of selection. The selection device (corresponding to the selection step) selects only from the media data sets thus determined by the determination device (corresponding to the determination step). Since the determination device (corresponding to the determination step) can determine the media data pertaining to the optimum segments in accordance with the determination conditions, the selection device (corresponding to the selection step) can select an appropriate amount of media data.
In the data processing method, recording medium, and program of the present invention, the determination device (corresponding to the determination step) determines, in accordance with determination conditions, whether only video data, only audio data, or both video and audio data are to be taken as objects of selection. Accordingly, the time required for the selection device (corresponding to the selection step) to select segments can be shortened.
In the data processing method, recording medium, and program of the present invention, representative data is appended to the context description data as an attribute, and the determination device can determine the media data or the representative data of the optimum segments in accordance with the determination conditions.
In the data processing method, recording medium, and program of the present invention, the determination device (corresponding to the determination step) determines, in accordance with determination conditions, whether only the entire data pertaining to the respective media segments, only the representative data, or both the entire data and the representative data are to be taken as objects of selection. Accordingly, the determination device can shorten the time required for the selection device (corresponding to the selection step) to select segments.

Claims (12)

1. A summary generating apparatus comprising:
an input device for inputting context description data, the context description data having: a data structure part describing a plurality of segments each representing a scene of media content composed of a plurality of scenes; and an attribute part comprising attribute information of the media content, the attribute information describing time information representing the division of the scenes, a viewpoint represented by at least one keyword expressing the content of a scene, a score representing the contextual degree of significance of each segment with respect to the viewpoint, and link information representing linkage with at least one related segment;
a selection device for selecting a segment from the data structure part on the basis of the score and the time information described in the attribute part;
a content input section for inputting the corresponding media content; and
an extraction section for extracting, from the media content, the data corresponding to the time information of the selected segment.
2. The summary generating apparatus according to claim 1, wherein the time information is a start time and an end time of each scene.
3. The summary generating apparatus according to claim 1, wherein the time information is a start time and a duration of each scene.
4. The summary generating apparatus according to claim 2 or 3, wherein, when selecting on the basis of the score and the time information described in the attribute part, the selection device selects such that the sum of the durations of the selected periods is a set time or shorter.
5. The summary generating apparatus according to claim 2 or 3, wherein, when selecting on the basis of the viewpoint and the time described in the attribute part, the selection device selects such that the sum of the durations of the selected periods is a set time or shorter.
6. The summary generating apparatus according to claim 2 or 3, further comprising a playback section for playing back the extracted media content.
7. The summary generating apparatus according to claim 2 or 3, further comprising a composition section for composing a stream for playback from the extracted media content.
8. The summary generating apparatus according to any one of claims 1 to 7, wherein a link address of representative data representing the content of each segment is appended to each segment of the data structure part.
9. The summary generating apparatus according to claim 8, wherein the representative data is video information and/or audio information.
10. The summary generating apparatus according to any one of claims 1 to 7, wherein a plurality of pairs of the viewpoint and the score are associated with one another within a segment by the link information.
11. The summary generating apparatus according to any one of claims 1 to 7, wherein the pairs of the link information and the score are gathered for each viewpoint.
12. A summary generating method comprising the following steps:
an input step of inputting context description data having both a data structure part and an attribute part, the data structure part describing a plurality of segments each representing a scene of media content composed of a plurality of scenes, and the attribute part comprising attribute information of the media content, the attribute information describing time information representing the division of the scenes, a viewpoint represented by at least one keyword expressing the content of a scene, a score representing the contextual degree of significance of each segment with respect to the viewpoint, and link information representing linkage with at least one related segment;
a selection step of selecting a segment from the data structure part on the basis of the score and the time information described in the attribute part;
a content input step of inputting the corresponding media content; and
an extraction step of extracting, from the media content, the data corresponding to the time information of the selected segment.
CNB2006101416147A 2000-03-16 2001-03-16 Data processing method and storage medium, and program for causing computer to execute the data processing method Expired - Lifetime CN100474308C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP074875/00 2000-03-16
JP2000074875 2000-03-16
JP190008/00 2000-06-23

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CNB011112921A Division CN1287317C (en) 2000-03-16 2001-03-16 Data processing method and storage medium, and program for executing said method by computer

Publications (2)

Publication Number Publication Date
CN1936903A true CN1936903A (en) 2007-03-28
CN100474308C CN100474308C (en) 2009-04-01

Family

ID=37954404

Family Applications (2)

Application Number Title Priority Date Filing Date
CNB2006101416147A Expired - Lifetime CN100474308C (en) 2000-03-16 2001-03-16 Data processing method and storage medium, and program for causing computer to execute the data processing method
CNB2006101416132A Expired - Lifetime CN100474307C (en) 2000-03-16 2001-03-16 Data processing method and storage medium, and program for causing computer to execute the data processing method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNB2006101416132A Expired - Lifetime CN100474307C (en) 2000-03-16 2001-03-16 Data processing method and storage medium, and program for causing computer to execute the data processing method

Country Status (1)

Country Link
CN (2) CN100474308C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105810223A (en) * 2014-12-30 2016-07-27 联想(北京)有限公司 Content playing method and electronic device
CN109492306B (en) * 2018-11-12 2020-04-07 北京华大九天软件有限公司 Association layer denotation method for design rule verification result

Also Published As

Publication number Publication date
CN100474308C (en) 2009-04-01
CN1936902A (en) 2007-03-28
CN100474307C (en) 2009-04-01

Similar Documents

Publication Publication Date Title
CN100474295C (en) Data processing apparatus and method
US7134074B2 (en) Data processing method and storage medium, and program for causing computer to execute the data processing method
KR100771055B1 (en) Data processing apparatus and data processing method
JP2000172724A (en) System and method for retrieving video on content base
CN100474308C (en) Data processing method and storage medium, and program for causing computer to execute the data processing method
CN100452028C (en) Data processing device and method
KR20000038290A (en) Moving picture searching method and search data structure based on the case structure
EP1335302A1 (en) Dynamic image content search information managing apparatus
US7617237B2 (en) Encoding device, encoding method, decoding device, decoding method, program and machine readable recording medium containing the program
JP3824318B2 (en) Data processing apparatus, data processing method and recording medium
JP2003018540A (en) Image summarizing method and control program
JP4598134B2 (en) Data processing apparatus and data processing method
JP2007074749A (en) Data processing apparatus, data processing method, and program for computer to execute data processing method
JP2004140875A (en) Data processing apparatus, data processing method and recording medium, and program for making computer execute the data processing method
JP2004127324A (en) Data processing apparatus, data processing method, recording medium, and program for making computer to perform the data processing method
JP2005166063A (en) Data processing apparatus, data processing method, recording medium, and program for making computer to execute the data processing method
JP2007080290A (en) Summary creating apparatus, data processing method and program for causing computer to execute the data processing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: MATSUSHITA ELECTRIC (AMERICA) INTELLECTUAL PROPERT

Free format text: FORMER OWNER: MATSUSHITA ELECTRIC INDUSTRIAL CO, LTD.

Effective date: 20140716

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20140716

Address after: California, USA

Patentee after: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA

Address before: Osaka Japan

Patentee before: Matsushita Electric Industrial Co.,Ltd.

CX01 Expiry of patent term

Granted publication date: 20090401

CX01 Expiry of patent term