TWI497959B

TWI497959B - Scene extraction and playback system, method and its recording media

Info

Publication number: TWI497959B
Application number: TW101138304A
Authority: TW
Inventors: Chia Hsiang Chang; Emery Jou; Jing Fung Chen; pei wen Huang
Original assignee: Inst Information Industry
Priority date: 2012-10-17
Filing date: 2012-10-17
Publication date: 2015-08-21
Also published as: KR101397331B1; US8744245B2; CN103780975B; US20140105572A1; KR20140049447A; CN103780975A; TW201417571A

Description

Scene capture and playback system, method and recording medium thereof

The present invention relates to a scene summary capture and playback system, a method, and a recording medium thereof, and in particular to a scene summary capture and playback system, method, and recording medium that use scene description information to extract the desired media segments.

In the prior art, media data is mostly played back linearly. Video playback software provides a timeline corresponding to the media data being played, and the user clicks a position on the timeline, or drags the slider along it, to choose which part of the video to play.

However, a user who is unfamiliar with the content of the media data and wants to build a summary of it has to spend considerable time looking for the desired video scenes. Second, the precision of dragging the slider depends on the length of the timeline; if the timeline is too short, the user can hardly drag the slider to the intended point, which makes operation more troublesome. Third, a user who wants to obtain specific video or audio from the media data, or to go further and construct suitable media summary information for it, must operate the timeline manually and cannot locate the relevant video scenes directly. Fourth, improving the precision of video capture generally requires at least one of a dedicated capture program, software, or tool. These problems raise both the cost of media capture and the complexity of operation, and because the user cannot selectively view personally chosen segments, the freedom of media operation is relatively low.

Therefore, how to simplify the capture, control, and operation of media data while providing customized media control that responds to user needs is an issue that vendors should consider.

To solve the above problems, the present invention discloses a scene summary capture and playback system, a method, and a recording medium thereof, which use scene description information as the basis for selecting media so as to provide the media segments the user requires.

The scene summary capture and playback system disclosed by the present invention comprises a media providing device, a description server device, a scene server device, and a terminal device.

The media providing device provides media data. The terminal device is used to input a summary capture instruction. The description server device receives the media data and provides scene description information corresponding to the scene segments of the media data, each piece of scene description information recording the playback content of its corresponding scene segment. The scene server device obtains the summary capture instruction, analyzes the playback content recorded in each piece of scene description information according to the instruction, extracts a plurality of partial scene segments from the media data according to the analysis result to form media summary data, and outputs the media summary data to the terminal device.

The scene summary capture and playback method disclosed by the present invention comprises: providing media data by a media providing device; receiving the media data by a description server device, which provides scene description information corresponding to the scene segments of the media data, each piece of scene description information recording the playback content of its corresponding scene segment; obtaining, by a scene server device, a summary capture instruction provided by a terminal device; analyzing, by the scene server device, the playback content recorded in each piece of scene description information according to the summary capture instruction, and extracting a plurality of partial scene segments from the media data to form media summary data; and outputting, by the scene server device, the media summary data to the terminal device.

In addition, the present invention discloses a recording medium storing program code readable by an electronic device. When the electronic device reads the program code, it executes a scene summary capture and playback method, which is the method described above.

The present invention is characterized in that, by analyzing scene description information, it provides media summary data targeted to the user's needs, so the user does not have to spend much time looking for the desired video scenes. Second, the user obtains the required media summary data without having to operate the timeline of the media data, which preserves the precision with which scenes are provided, simplifies video control, and spares the user the difficulty of dragging the slider to the intended point. Third, through targeted extraction of scene segments, the user obtains all the required scene segments at once and can construct suitable media summary data for the media data; this yields customized media control that matches the user's needs and reduces the complexity of operation. Fourth, because the scene description information is analyzed and media summary data is provided, the user can watch or listen to an outline of the media data in advance and then decide whether to play the media data in full.

The preferred embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a schematic diagram of the architecture of a scene summary capture and playback system according to an embodiment of the present invention. The system can be applied to any device, equipment, or system capable of media playback, and its configuration is not limited. The system includes a client side and a server side, whose devices are connected through a network. The server side includes a media providing device 10 (Media Supply Equipment), a description server device 20 (Scene Description Server), and a scene server device 30 (Scene Server). The client side includes one or more terminal devices (End Device), which are the user's electronic devices, such as a personal computer (PC), notebook, tablet PC, smartphone, set-top box (STB), or any other electronic device that has a human-machine interface for the user to operate and is able to connect to a network. In this example, a single terminal device 40 is used for illustration.

The media providing device 10 provides media data 11. The media data 11 may be complete video, audio, or audiovisual data, or stream data transmitted in real time. The media providing device 10 may be co-located with the description server device 20 and the scene server device 30, or may be third-party equipment at a different location; no limitation is imposed. The modes in which the media providing device 10 outputs the media data 11 include wired and wireless data transmission such as broadcast, broadband delivery, cable transmission (such as Community Antenna Television or Cable Television, CATV), and Internet Protocol Television (IPTV); again, no limitation is imposed.

The media providing device 10 is made up of at least one of hardware capable of supplying media, or a unit, component, apparatus, device, or system combining software and hardware. The media data 11 includes a plurality of scene segments with different content. For example, when the media data 11 is image data, it contains image segments with one or more kinds of content such as objects, scenes, and characters. Likewise, when the media data 11 is audio data (Voice Data), it contains audio segments with one or more kinds of content such as high-pitched sound, low-pitched sound, speech, and music.

When the description server device 20 obtains the media data 11, it provides scene description information 21 corresponding to the media data 11. The scene description information 21 is annotation data that interprets the media data 11 or the scene segments it contains. In this embodiment, each piece of scene description information 21 records the playback content of its corresponding scene segment. The scene description information 21 is provided in one of the following ways: (1) the description server device 20 constructs it directly from the playback content of the media data 11 or of the scene segments; or (2) the scene description information 21 corresponding to the media data 11 is obtained from an external device.
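As an illustration only, one piece of scene description information could be modeled as a small record that ties a scene segment to a description of what it plays. The patent does not specify a data layout, so the field names (`segment_id`, `start_s`, `end_s`, `content`) and the helper `describe` below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneDescription:
    """One scene description record (hypothetical layout, not from the patent)."""
    segment_id: str          # the scene segment this record annotates
    start_s: float           # segment start within the media data, in seconds
    end_s: float             # segment end within the media data, in seconds
    content: frozenset       # tags describing what the segment plays


def describe(segment_id, start_s, end_s, *tags):
    """Build a record directly from a segment's playback content (way (1))."""
    if end_s <= start_s:
        raise ValueError("segment must have positive duration")
    return SceneDescription(segment_id, start_s, end_s, frozenset(tags))


# Two annotated segments of an assumed basketball recording.
descriptions = [
    describe("P1", 0.0, 12.0, "quarter-3", "three-pointer"),
    describe("P2", 12.0, 30.0, "quarter-3", "free-throw"),
]
```

Records obtained from an external device (way (2)) could be loaded into the same structure; only their origin differs.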

The terminal device 40 includes a data input interface through which the user enters data and is also capable of playing media. How the interface of the terminal device 40 is presented depends on the designer's requirements and is not limited. The user uses the terminal device 40 to input a summary capture instruction 41 (Summary Collect Command).

The summary capture instruction 41 is the request condition that the user inputs when seeking specific media segments in the media data 11; the request condition concerns the playback content of the media data 11 or of its scene segments. For example, when the media data 11 is a recording of a basketball game, the user may request the scoring plays of a favorite player, the three-point scoring plays of all players in the game, the three-point scoring plays of a favorite player, and so on. As another example, when the media data 11 is an opera recording, the user may request the leading actress's solos, the purely instrumental passages, and so on.

The scene server device 30 obtains the summary capture instruction 41 and the media data 11; the media data 11 may come from either the description server device 20 or the media providing device 10. The scene server device 30 analyzes the playback content recorded in each piece of scene description information 21 according to the summary capture instruction 41, extracts a plurality of partial scene segments 32 from the media data 11 according to the analysis result, and forms these segments into media summary data 31 for output to the terminal device 40.

The types of request condition that the summary capture instruction 41 may include are as follows:

(1) The summary capture instruction 41 includes content designation information. When the scene server device 30 analyzes the recorded information of each piece of scene description information 21 according to the summary capture instruction 41, it obtains the scene description information 21 whose recorded information matches the content designation information and extracts the partial scene segments 32 corresponding to that scene description information 21 to form the media summary data 31.

(2) The scene server device 30 first classifies the scene description information 21 according to its recorded information and then, according to the classification result, divides one or more pieces of scene segment data 33 from the media data 11. The summary capture instruction 41 includes content designation data. When the scene server device 30 analyzes the recorded information of each piece of scene description information 21 according to the content designation data, it takes one or more pieces of target scene segment data 33 out of all the scene segment data 33 to form the media summary data 31.
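The two request-condition types above can be contrasted with a minimal sketch. It assumes each piece of scene description information is reduced to a set of tags, that type (1) matches when a record carries every requested tag, and that type (2) groups segments by tag in advance and then picks whole groups. These matching rules and all function names are assumptions made for illustration; the patent does not define them.

```python
def select_by_content(descriptions, wanted):
    """Type (1): keep segments whose description carries every wanted tag."""
    wanted = set(wanted)
    return [d["segment"] for d in descriptions if wanted <= d["tags"]]


def classify(descriptions):
    """Type (2), first step: group segments under each tag they carry."""
    groups = {}
    for d in descriptions:
        for tag in d["tags"]:
            groups.setdefault(tag, []).append(d["segment"])
    return groups


def select_from_groups(groups, wanted):
    """Type (2), second step: take the pre-built groups named by the request."""
    picked = []
    for tag in wanted:
        for seg in groups.get(tag, []):
            if seg not in picked:   # avoid listing a segment twice
                picked.append(seg)
    return picked


descriptions = [
    {"segment": "P1", "tags": {"quarter-3", "three-pointer"}},
    {"segment": "P2", "tags": {"quarter-3", "free-throw"}},
    {"segment": "P3", "tags": {"quarter-4", "three-pointer"}},
]

direct = select_by_content(descriptions, ["quarter-3", "three-pointer"])
groups = classify(descriptions)
grouped = select_from_groups(groups, ["quarter-4", "free-throw"])
```

In this sketch the practical difference is when the work happens: type (1) scans the descriptions per request, whereas type (2) classifies once and answers later requests from the stored groups.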

Moreover, the summary capture instruction 41 is not limited to a single request condition and may include several different ones. The scene server device 30 forms one or more pieces of media summary data 31 from the media data 11 according to the request conditions.

After receiving one or more pieces of media summary data 31, the terminal device 40 plays them or presents them as a list. Through the control interface of the terminal device 40, the user selects one or more pieces of media summary data 31 to play.

Furthermore, the scene server device 30 uses the recorded information of each piece of scene description information 21 to obtain the data dependencies or data attribute categories of the scene description information 21, classifies the scene description information 21 accordingly, and divides a plurality of pieces of scene segment data 33 according to the classification result. The scene server device 30 then constructs the scene segment data 33 into a media playback tree structure according to the data dependency, data attributes, and data hierarchy of the scene description information 21. When the scene server device 30 obtains the summary capture instruction 41, it extracts the relevant scene segment data 33 from the media playback tree structure according to the instruction to form the media summary data 31. The summary capture instruction 41 may also include tree structure level data that records one tree level of the media playback tree structure. In that case, the scene server device 30 extracts at least one partial scene segment 32 and/or piece of scene segment data 33 from the media playback tree structure only at the tree level designated by the tree structure level data, to form the media summary data 31.
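A minimal sketch of such a tree, and of extraction restricted to one designated level, is given below. The `Node` class, the convention that the root is level 1, and the substring match on node labels are all assumptions chosen for illustration rather than details taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical media playback tree."""
    label: str
    segments: list = field(default_factory=list)   # scene segment data held here
    children: list = field(default_factory=list)


def collect(node, level, wanted=None, depth=1):
    """Gather segments from nodes at `level` (root = 1), optionally by label."""
    if depth == level:
        if wanted is None or wanted in node.label:
            return list(node.segments)
        return []
    found = []
    for child in node.children:
        found.extend(collect(child, level, wanted, depth + 1))
    return found


tree = Node("full game", ["P1", "P2", "P3", "P4"], [
    Node("quarter 1", ["P1", "P2"], [Node("close-up team A", ["P1"])]),
    Node("quarter 2", ["P3", "P4"], [Node("close-up team B", ["P4"])]),
])

per_quarter = collect(tree, 2)                  # everything held at level 2
team_b_closeups = collect(tree, 3, "team B")    # level 3, filtered by label
```

Here the `level` argument plays the role of the tree structure level data: nodes above or below the designated level contribute nothing to the result.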

The interface of the terminal device 40 may also present input fields that mirror the media playback tree structure. The user simply enters each request condition into the appropriate field according to the data dependencies or data attribute categories of the request conditions, and the scene server device 30 uses these entries as the basis for extracting and classifying scene segments.

The media providing device 10 may also provide a plurality of media data 11, with the description server device 20 providing the scene description information 21 corresponding to each. When inputting the summary capture instruction 41 through the terminal device 40, the user may set different request conditions for each of the media data 11 or set request conditions that apply to all of them, depending on the user's needs.

The scene server device 30 analyzes the relevant scene description information 21 according to the summary capture instruction 41 and forms one or more pieces of media summary data 31 from the media data 11, which it returns to the terminal device 40.

The extraction results, such as scene segments, scene description information 21, the media playback tree structure, and playable media, may be stored in the scene server device 30 for use the next time media summary data 31 is provided. Furthermore, the media summary data 31 constructed by the scene segment providing operation may be stored in the terminal device 40 so that the playback software or hardware of the terminal device 40 can access and play it directly.

FIG. 2 is a flowchart of the scene summary capture and playback method according to an embodiment of the present invention, and FIG. 3 and FIG. 4 are detailed flowcharts of the method. Refer also to FIG. 1 through FIG. 3 for ease of understanding. The method proceeds as follows:

A media providing device 10 provides media data 11 (step S110). As described above, the media data 11 provided by the media providing device 10 may be complete video, audio, or audiovisual data, or stream data transmitted in real time. The transmission modes of the media data 11 include wired and wireless data transmission such as broadcast, broadband delivery, cable transmission, and Internet Protocol transmission.

A description server device 20 receives the media data 11 and provides scene description information 21 corresponding to the media data 11, each piece of scene description information 21 recording the playback content of its corresponding scene segment (step S120). As described above, the scene description information 21 is provided in one of the following ways: (1) the description server device 20 constructs it directly from the playback content of the media data 11 or of the scene segments; or (2) the scene description information 21 corresponding to the media data 11 is obtained from an external device.

A scene server device 30 obtains a summary capture instruction 41 provided by a terminal device 40 (step S130). The summary capture instruction 41 is the request condition that the user inputs when seeking specific media segments in the media data 11. The request condition concerns the playback content of the media data 11 or of its scene segments.

The scene server device 30 analyzes the playback content recorded in each piece of scene description information 21 according to the summary capture instruction 41 and extracts partial scene segments 32 from the media data 11 according to the analysis result to form media summary data 31 (step S140). In this step, the way the scene server device 30 extracts the partial scene segments 32 differs according to the request condition included in the summary capture instruction 41, as follows:

(1) As shown in FIG. 1 and FIG. 3, each piece of scene description information 21 stores the playback content of its corresponding scene segment, and the summary capture instruction 41 includes content designation information. In this step, when the scene server device 30 analyzes the playback content recorded in each piece of scene description information 21 according to the summary capture instruction 41, it obtains a plurality of pieces of target scene description information whose recorded information matches the content designation information (step S142). The scene server device 30 then forms the partial scene segments 32 corresponding to the target scene description information into the media summary data 31 (step S148).

(2) As shown in FIG. 1 and FIG. 4, the scene server device 30 classifies the scene description information 21 according to its recorded information and divides a plurality of pieces of scene segment data 33 from the media data 11 according to the classification result. The summary capture instruction 41 includes content designation data. In this step, the scene server device 30 analyzes the recorded information of each piece of scene description information 21 according to the content designation data (step S143) and takes at least one piece of target scene segment data 33 out of all the scene segment data 33 to form the media summary data 31 (step S149).

The scene server device 30 then outputs the media summary data 31 to the terminal device 40 (step S150). After receiving one or more pieces of media summary data 31, the terminal device 40 plays them, presents them as a list, or stores them in the terminal device 40. Through the control interface of the terminal device 40, the user selects one or more pieces of media summary data 31 to play.
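Steps S110 through S150 can be read as a simple pipeline. The sketch below maps each step to one function, under the simplifying assumptions that media data is a dictionary of named segments, that each description is a single content label, and that matching is exact string equality. None of these function names or data shapes come from the patent.

```python
def provide_media():
    """S110: the media providing device supplies media data (toy stand-in)."""
    return {"P1": "tip-off", "P2": "three-pointer",
            "P3": "timeout", "P4": "three-pointer"}


def describe_media(media):
    """S120: the description server annotates each scene segment."""
    return [{"segment": seg, "content": content} for seg, content in media.items()]


def receive_instruction(raw):
    """S130: the scene server receives the terminal's request condition."""
    return {"content": raw.strip().lower()}


def build_summary(media, descriptions, instruction):
    """S140: match descriptions to the request and pull the matching segments."""
    hits = [d["segment"] for d in descriptions
            if d["content"] == instruction["content"]]
    return [(seg, media[seg]) for seg in hits]


def deliver(summary):
    """S150: hand the summary to the terminal, here as a playlist of segment ids."""
    return [seg for seg, _ in summary]


media = provide_media()
summary = build_summary(media, describe_media(media),
                        receive_instruction(" Three-Pointer "))
playlist = deliver(summary)
```

In a deployed system each function would run on a different device and exchange data over the network; the sketch keeps them in one process only to make the order of the steps visible.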

FIG. 5 is a flowchart of constructing the media playback tree structure according to an embodiment of the present invention. Refer also to FIG. 1 for ease of understanding. The process includes:

The scene server device 30 uses the recorded information of each piece of scene description information 21 to obtain the data dependencies or data attribute categories of the scene description information 21, classifies the scene description information 21 accordingly, and divides a plurality of pieces of scene segment data 33 according to the classification result (step S210).

The scene server device 30 constructs the scene segment data 33 into a media playback tree structure according to the data dependency, data attributes, and data hierarchy of the scene description information 21 (step S220).

Accordingly, in step S140, when the scene server device 30 obtains the summary capture instruction 41, it extracts partial scene segments from the media playback tree structure according to the summary capture instruction 41 to form the media summary data 31.

FIG. 6 through FIG. 9 illustrate a media control scenario according to an embodiment of the present invention. Here, the media data 11 is a recording of a basketball game.

FIG. 6 is a schematic diagram of media levels according to an embodiment of the present invention. The recording of a basketball game can be divided into different image levels. The highest level is the recording of the whole game; next is the recording of each quarter; below that are the close-up shots. The whole recording is composed of many scene segments, which correspond to the scene description information 21.

FIG. 7 is a schematic diagram of scene description information according to an embodiment of the present invention, showing the scene descriptions of the basketball game recording and the time corresponding to each scene.

When the user only wants to watch a summary of the "third-quarter three-point scoring plays", that phrase can be set in the summary capture instruction 41 as the content designation information. The scene server device 30 then takes the third-quarter countdown clock times that correspond to the third-quarter three-point scoring scenes, such as "11:39", "09:16", "08:58", "07:47", and so on. Extending backward from, forward from, or centered on each of these time points, the scene server device 30 extracts the corresponding scene segment data 33 and combines them to form the media summary data 31 for the relevant terminal device to play.
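The step from clock times to clip boundaries can be sketched as follows. The sketch assumes a 12-minute quarter (the patent does not state the quarter length), converts each countdown reading to seconds elapsed within the quarter, builds a window that ends at, starts at, or is centered on that point, and merges overlapping windows. The function names, the `mode` values, and the merging step are illustrative assumptions.

```python
def clock_to_seconds(clock):
    """Convert a "MM:SS" game-clock reading to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def clip_windows(clocks, length_s, mode="center", quarter_s=720):
    """Return merged (start, end) windows, in seconds elapsed within the quarter."""
    spans = []
    for clock in clocks:
        t = quarter_s - clock_to_seconds(clock)   # countdown -> elapsed time
        if mode == "before":                      # clip ends at the time point
            start, end = t - length_s, t
        elif mode == "after":                     # clip starts at the time point
            start, end = t, t + length_s
        elif mode == "center":                    # clip is centered on it
            start, end = t - length_s / 2, t + length_s / 2
        else:
            raise ValueError(f"unknown mode: {mode}")
        spans.append((max(0, start), min(quarter_s, end)))  # clamp to the quarter
    spans.sort()
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:     # overlaps the previous window
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


windows = clip_windows(["11:39", "09:16", "08:58", "07:47"], 20, mode="before")
```

With 20-second windows ending at each time point, the plays at "09:16" and "08:58" are close enough that their windows overlap and are merged into one clip. The offsets are relative to the start of the quarter, so a real system would still need to add the quarter's position within the full recording.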

FIG. 8 is a schematic diagram of the formation of media summary data according to an embodiment of the present invention. Suppose the scenes matching "third-quarter three-point scoring plays" are contained in scene segments P1, P3, and P5. The scene server device 30 extracts scene segments P1, P3, and P5 from the media data 11 and forms them into the media summary data 31 described above for the relevant terminal device to play.

FIG. 9 is a schematic diagram of a media playback tree structure according to an embodiment of the present invention. Here, by combining the image levels shown in FIG. 6 with the data dependencies or data attribute categories of the summary capture instruction 41 described above, the entire recording, the scene description information 21, or the scene segment data 33 can be constructed into a media playback tree structure.

The first level of this media playback tree structure is the recording of the whole game. The second level branches from the first and holds the recordings of the individual quarters. The third level branches from the second and holds the close-up shots of the two teams during the game. The fourth level branches from the third and holds the close-up shots of specific players on each team during the game.

After the user sets the summary capture instruction 41, the scene server device 30 uses the media playback tree structure to extract the required scene segment data 33 according to the content designation data included in the summary capture instruction 41, forms the media summary data 31, and transmits it to the terminal device 40.

However, the scene server device 30 need not take the above image levels into account; it may instead extract scene segment data 33 of any level according to the first target scene description information 42 to form the above media segment data. In addition, the media summary data 31 may also include tree structure level data that designates the level from which the scene server device 30 may extract scene segment data 33 from the media playback tree structure.

The foregoing describes only the implementations or embodiments of the technical means that the present invention adopts to solve the problem and is not intended to limit the scope of the patent. All equivalent changes and modifications that accord with the meaning of the claims of the present invention, or that are made within its patent scope, are covered by the patent scope of the present invention.

10‧‧‧Media providing device

11‧‧‧Media data

20‧‧‧Description server device

21‧‧‧Scene description information

30‧‧‧Scene server device

31‧‧‧Media summary data

32‧‧‧Partial scene segment

33‧‧‧Scene segment data

40‧‧‧Terminal device

41‧‧‧Summary capture instruction

P1~P6‧‧‧Scene segments

S110~S150‧‧‧Steps

S210~S220‧‧‧Steps

FIG. 1 is a schematic diagram of the architecture of a scene summary capture and playback system according to an embodiment of the present invention.

FIG. 2 is a schematic flowchart of a scene summary capture and playback method according to an embodiment of the present invention.

FIG. 3 and FIG. 4 are detailed schematic flowcharts of the scene summary capture and playback method according to an embodiment of the present invention.

FIG. 5 is a schematic flowchart of the construction of a media playback tree structure according to an embodiment of the present invention.

FIG. 6 is a schematic diagram of media levels according to an embodiment of the present invention.

FIG. 7 is a schematic diagram of scene description information according to an embodiment of the present invention.

FIG. 8 is a schematic diagram of the formation of media summary data according to an embodiment of the present invention.

FIG. 9 is a schematic diagram of a media playback tree structure according to an embodiment of the present invention.


Claims

A scene summary capture and playback system, comprising: a media providing device for providing media data; a description server device for receiving the media data and providing scene description information corresponding to each scene segment of the media data, each piece of scene description information recording the playback content of its corresponding scene segment; a scene server device for obtaining a summary capture instruction and the media data, analyzing the content recorded in each piece of scene description information according to the summary capture instruction, and extracting a plurality of partial scene segments from the media data according to the analysis result to form and output media summary data, wherein the scene server device classifies the scene description information according to the recorded information of each piece of scene description information, divides the media data into a plurality of pieces of scene segment data according to the classification result, and constructs the scene segment data into a media playback tree structure according to the data dependency, data attributes, and data hierarchy of the scene description information, and wherein, upon obtaining the summary capture instruction, the scene server device extracts at least one piece of the scene segment data from the media playback tree structure according to the summary capture instruction to form the media summary data; and a terminal device for inputting the summary capture instruction and obtaining the media summary data.

The scene summary capture and playback system of claim 1, wherein the summary capture instruction includes content designation data, and the scene server device analyzes each piece of scene description information according to the summary capture instruction, obtains a plurality of pieces of target scene description information whose recorded information matches the content designation data, and extracts the partial scene segments corresponding to the target scene description information to form the media summary data.

The scene summary capture and playback system of claim 1, wherein the scene server device classifies the scene description information according to the recorded information of each piece of scene description information and divides the media data into a plurality of pieces of scene segment data according to the classification result; the summary capture instruction includes content designation data; and the scene server device analyzes the recorded information of each piece of scene description information according to the content designation data and extracts at least one piece of target scene segment data from the scene segment data to form the media summary data.

The scene summary capture and playback system of claim 1, wherein the summary capture instruction includes tree structure level data, and the scene server device extracts at least one piece of the scene segment data from the media playback tree structure according to the tree level recorded in the tree structure level data to form the media summary data.

A scene summary capture and playback method, comprising: providing media data by a media providing device; receiving the media data by a description server device and providing scene description information corresponding to each scene segment of the media data, each piece of scene description information recording the playback content of its corresponding scene segment; obtaining, by a scene server device, a summary capture instruction provided by a terminal device; analyzing, by the scene server device, the playback content recorded in each piece of scene description information according to the summary capture instruction, and extracting a plurality of partial scene segments from the media data according to the analysis result to form media summary data, wherein the scene server device classifies the scene description information according to the category of the data dependency or data attribute of each piece of scene description information and divides out a plurality of pieces of scene segment data according to the classification result, the scene server device constructs the scene segment data into a media playback tree structure according to the data dependency, data attributes, and data hierarchy of the scene description information, and the scene server device extracts the partial scene segments from the media playback tree structure according to the summary capture instruction to form the media summary data; and outputting the media summary data to the terminal device by the scene server device.

The scene summary capture and playback method of claim 5, wherein each piece of scene description information stores the playback content of its corresponding scene segment, the summary capture instruction includes content designation data, and the step in which the scene server device analyzes each piece of scene description information according to the summary capture instruction and extracts the partial scene segments from the media data to form the media summary data comprises: obtaining, when the scene server device analyzes each piece of scene description information according to the summary capture instruction, a plurality of pieces of target scene description information whose recorded information matches the content designation data; and extracting, by the scene server device, the partial scene segments corresponding to the target scene description information to form the media summary data.

The scene summary capture and playback method of claim 5, wherein the scene server device classifies the scene description information according to the recorded information of each piece of scene description information and divides the media data into a plurality of pieces of scene segment data according to the classification result, the summary capture instruction includes content designation data, and the step in which the scene server device analyzes each piece of scene description information according to the summary capture instruction and extracts the partial scene segments from the media data to form the media summary data comprises: analyzing, by the scene server device, the recorded information of each piece of scene description information according to the content designation data, and extracting at least one piece of target scene segment data from the scene segment data to form the media summary data.

A recording medium storing program code readable by an electronic device, wherein the electronic device, upon reading the program code, executes a scene summary capture and playback method comprising the steps of: providing media data by a media providing device; receiving the media data by a description server device and providing scene description information corresponding to each scene segment of the media data, each piece of scene description information recording the playback content of its corresponding scene segment; obtaining, by a scene server device, a summary capture instruction provided by a terminal device; analyzing, by the scene server device, the playback content recorded in each piece of scene description information according to the summary capture instruction, and extracting a plurality of partial scene segments from the media data according to the analysis result to form media summary data, wherein the scene server device classifies the scene description information according to the category of the data dependency or data attribute of each piece of scene description information and divides out a plurality of pieces of scene segment data according to the classification result, the scene server device constructs the scene segment data into a media playback tree structure according to the data dependency, data attributes, and data hierarchy of the scene description information, and the scene server device extracts the partial scene segments from the media playback tree structure according to the summary capture instruction to form the media summary data; and outputting the media summary data to the terminal device by the scene server device.

The recording medium of claim 8, wherein each piece of scene description information stores the playback content of its corresponding scene segment, the summary capture instruction includes content designation data, and the step in which the scene server device analyzes each piece of scene description information according to the summary capture instruction and extracts the partial scene segments from the media data to form the media summary data comprises: obtaining, when the scene server device analyzes each piece of scene description information according to the summary capture instruction, a plurality of pieces of target scene description information whose recorded information matches the content designation data; and extracting, by the scene server device, the partial scene segments corresponding to the target scene description information to form the media summary data.

The recording medium of claim 8, wherein the scene server device classifies the scene description information according to the recorded information of each piece of scene description information and divides the media data into a plurality of pieces of scene segment data according to the classification result, the summary capture instruction includes content designation data, and the step in which the scene server device analyzes each piece of scene description information according to the summary capture instruction and extracts the partial scene segments from the media data to form the media summary data comprises: analyzing, by the scene server device, the recorded information of each piece of scene description information according to the content designation data, and extracting at least one piece of target scene segment data from the scene segment data to form the media summary data.