TWI535282B - Method and electronic device for generating multiple point of view video - Google Patents

Method and electronic device for generating multiple point of view video

Info

Publication number
TWI535282B
Authority
TW
Taiwan
Prior art keywords
media content
electronic device
media
same event
content
Prior art date
Application number
TW103123659A
Other languages
Chinese (zh)
Other versions
TW201503676A (en)
Inventor
應文平
王元綱
希拉 辛格 維克
武景龍
李文銓
陳家偉
肯尼斯 陶德 古雷斯
郭威志
林嘉彥
蔡明翰
闕鑫地
伊藤泰
Original Assignee
宏達國際電子股份有限公司 (HTC Corporation)
Priority date
Filing date
Publication date
Priority claimed from US14/308,720 (US10141022B2)
Application filed by 宏達國際電子股份有限公司 (HTC Corporation)
Publication of TW201503676A
Application granted granted Critical
Publication of TWI535282B


Description

Method and electronic device for generating multiple point of view (MPOV) video

The present disclosure relates to a method and an electronic device for generating multiple point of view (MPOV) video.

A wide variety of functions have made electronic devices such as smartphones and tablets increasingly mobile and versatile. Using the image capture function of an electronic device, a user can record events in daily life by capturing them and storing them as media contents in different media formats (e.g., photos, video, audio, and so forth). Users often possess multiple media contents related to the same event from different points of view, and these users may later want to share their media contents of different points of view via email, social networks, or other means of communication.

However, to do so, a user may have to browse through the media contents to manually identify those related to the event of interest, which can be extremely time-consuming. In addition, the related media contents may not necessarily be sorted or synchronized in time, so the user would have to manually select and re-align them in order to catalog the media contents as a video compilation or a photo album.

Thus, it may be desirable to automatically identify the related media contents of the same event, and to select and combine these related media contents for presentation from multiple points of view.

The present disclosure proposes a method and an electronic device for generating multiple point of view (MPOV) video.

According to one of the exemplary embodiments, the method of generating an MPOV video may include at least, but not limited to, the following steps: obtaining a plurality of media contents; identifying, from the plurality of media contents, a first media content and a second media content as related media contents of the same event based on the metadata corresponding to each of the media contents, wherein the metadata includes at least time information or location information; and generating the MPOV video according to the related media contents.
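As a rough sketch of these steps — the `MediaContent` structure, the two-hour time window, and the coordinate tolerance below are illustrative assumptions, not part of the claim:

```python
from dataclasses import dataclass, field

@dataclass
class MediaContent:
    """Hypothetical container for one captured clip/photo and its metadata."""
    name: str
    metadata: dict = field(default_factory=dict)  # may hold "time" and/or "location"

def is_related(a: MediaContent, b: MediaContent,
               time_window: float = 2 * 3600.0,
               distance: float = 0.01) -> bool:
    """Treat two contents as belonging to the same event when whatever
    time information and/or location information they carry agrees."""
    ta, tb = a.metadata.get("time"), b.metadata.get("time")
    if ta is not None and tb is not None and abs(ta - tb) > time_window:
        return False
    la, lb = a.metadata.get("location"), b.metadata.get("location")
    if la is not None and lb is not None:
        if any(abs(x - y) > distance for x, y in zip(la, lb)):
            return False
    return True

def generate_mpov(contents):
    """Group the first content with every related content — a stand-in
    for the composition step that would actually render the MPOV video."""
    first = contents[0]
    return [c.name for c in contents if is_related(first, c)]
```

For example, two clips taken ten minutes apart at nearly the same coordinates would be grouped, while a clip taken four hours later would be excluded.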

According to one of the exemplary embodiments, the present disclosure proposes an electronic device that includes at least, but not limited to, a processor configured to: obtain a plurality of media contents; identify, from the plurality of media contents, a first media content and a second media content as related media contents of the same event based on the metadata corresponding to each of the media contents, wherein the metadata includes at least time information or location information; and generate the MPOV video according to the related media contents.

However, it should be understood that this summary may not contain all aspects and embodiments of the present disclosure and is therefore not meant to be limiting in any way. Moreover, the present disclosure includes improvements and modifications that would be apparent to those skilled in the art.

To make the aforementioned features and advantages of the disclosure more comprehensible, exemplary embodiments are described in detail below with reference to the accompanying drawings.

10‧‧‧First electronic device

11‧‧‧First point of view

12‧‧‧First media content

20‧‧‧Second electronic device

21‧‧‧Second point of view

22‧‧‧Second media content

30‧‧‧Third electronic device

31‧‧‧Third point of view

32‧‧‧Third media content

40‧‧‧Batter

50‧‧‧Pitcher

100‧‧‧Electronic device

110‧‧‧Processor

130‧‧‧Display screen

150‧‧‧Storage medium

170‧‧‧Image capture component

190‧‧‧Transceiver

410‧‧‧Audio waveform

420‧‧‧Distinguishable feature

421‧‧‧Timestamp

422‧‧‧Value

430‧‧‧Distinguishable feature

440‧‧‧Distinguishable feature

510‧‧‧AP1

520‧‧‧AP2

530‧‧‧AP3

540‧‧‧AP4

550‧‧‧AP5

560‧‧‧First list

570‧‧‧Second list

610‧‧‧First media content

620‧‧‧Second media content

640‧‧‧Overlapping portion

650‧‧‧Highlight period

710‧‧‧First media content

720‧‧‧Second media content

721‧‧‧Frame

740‧‧‧Overlapping portion

750‧‧‧Highlight period

810‧‧‧First media content

820‧‧‧Second media content

850‧‧‧Highlight period

910‧‧‧First media content

920‧‧‧Second media content

930‧‧‧Third media content

940‧‧‧Overlapping portion

950‧‧‧Highlight period

S1010, S1011, S1012, S1013, S1014, S1020, S1030, S1110, S1120, S1130, S1140, S1150‧‧‧Steps

T1‧‧‧First time

T2‧‧‧Second time

T3‧‧‧Third time

T4‧‧‧Fourth time

T5‧‧‧Fifth time

FIG. 1 illustrates a conceptual diagram of cooperatively capturing an event from different points of view to generate an MPOV video according to an exemplary embodiment of the present disclosure.

FIG. 2A through FIG. 2D are conceptual diagrams illustrating generation of an MPOV video based on the first media content captured by the first electronic device 10, the second media content captured by the second electronic device 20, and the third media content captured by the third electronic device 30 according to one of the embodiments of the present disclosure.

FIG. 3 is a block diagram illustrating the hardware of an electronic device in functional blocks according to one of the exemplary embodiments of the present disclosure.

FIG. 4 is a diagram illustrating an audio waveform of a media content according to one of the exemplary embodiments of the present disclosure.

FIG. 5A and FIG. 5B are diagrams illustrating the concept of ranking the signal strengths of nearby devices according to one of the embodiments of the present disclosure.

FIG. 6 is a diagram illustrating synchronization of the first media content and the second media content on a timeline according to one of the exemplary embodiments of the present disclosure.

FIG. 7 is a diagram illustrating synchronization of the first media content and the second media content when the first media content is a still image according to one of the exemplary embodiments of the present disclosure.

FIG. 8 is a diagram illustrating synchronization of the first media content, the second media content, and the third media content according to one of the exemplary embodiments of the present disclosure.

FIG. 9 is a flowchart illustrating a method of generating an MPOV video according to one of the embodiments of the present disclosure.

FIG. 10A and FIG. 10B are flowcharts illustrating a method of generating an MPOV video based on related media contents according to one of the embodiments of the present disclosure.

FIG. 11 is a flowchart illustrating a method of generating an MPOV video based on the media contents within a highlight period according to one of the embodiments of the present disclosure.

Reference will now be made in detail to the present embodiments of the disclosure, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numerals are used in the drawings and the description to refer to the same or like parts.

Through the image capture functionality of electronic devices such as smartphones, tablets, and wearable devices, people can record events in daily life by capturing them and storing them as media contents (e.g., photos, video, audio, and so forth). For example, people may use smartphones, tablets, wearable devices, cameras, and so on to record their children at a baseball game. Media contents related to the same baseball game may therefore be captured by different users from different points of view. The present disclosure provides a method of generating a multiple point of view (MPOV) video that identifies the relevance of the media contents, that is, how the media contents relate to an event in time and location. The related media contents of the same event are then used to generate the MPOV video. In the MPOV video, related media contents captured from different points of view are synchronized in time, so that moments of the event captured from different points of view at nearly the same instant can be presented simultaneously in each frame of the MPOV video. The related media contents may further be analyzed to identify a highlight period of the event, so that the MPOV video is generated based on the related media contents within the highlight period.

FIG. 1 illustrates a conceptual diagram of cooperatively capturing an event from different points of view to generate an MPOV video according to an exemplary embodiment of the present disclosure. Referring to FIG. 1, a plurality of media contents of an event (e.g., the batting event shown in FIG. 1) may be captured by different electronic devices. The media contents include a first media content captured by the first electronic device 10 from the point of view 11, a second media content captured by the second electronic device 20 from the point of view 21, and a third media content captured by the third electronic device 30 from the point of view 31.

FIG. 2A through FIG. 2D are conceptual diagrams illustrating generation of an MPOV video based on the first media content captured by the first electronic device 10, the second media content captured by the second electronic device 20, and the third media content captured by the third electronic device 30 according to one of the embodiments of the present disclosure. Referring to FIG. 1 and FIG. 2A, the batting event is captured by the first electronic device 10 from the first point of view 11, where the first media content 12 captured from the first point of view 11 focuses on the batting event from the side of the batter 40. Referring to FIG. 1 and FIG. 2B, the batting event is captured by the second electronic device 20 from the second point of view 21, where the second media content 22 captured from the second point of view 21 focuses on the batting event from behind the batter 40. Referring to FIG. 1 and FIG. 2C, the batting event is captured by the third electronic device 30 from the third point of view 31, where the third media content 32 captured from the third point of view 31 focuses on the pitcher 50 throwing the ball for the batting event.

Based on the first media content 12, the second media content 22, and the third media content 32 illustrated in FIG. 2A through FIG. 2C, the present disclosure identifies whether the first, second, and third media contents captured by different electronic devices from different points of view are related to the same event in time and location. Assuming that the first media content 12, the second media content 22, and the third media content 32 are related to the same event, they are combined to generate an MPOV video that presents the batting event from different points of view simultaneously, as illustrated in FIG. 2D. Referring to FIG. 2D, the MPOV video presents the batting event from the first point of view 11, the second point of view 21, and the third point of view 31 at the same time, displaying the event in a collage style by combining/stitching the first media content 12, the second media content 22, and the third media content 32 within each frame of the MPOV video. The combined media contents of the same event displayed in the collage style are hereinafter also referred to as a collage view of the event. A frame of the MPOV video contains at least two portions, and each portion may be used to display one of the media contents.

In one of the embodiments of the present disclosure, a frame of the MPOV video may be, but is not limited to being, divided into three portions, namely a left portion, an upper-right portion, and a lower-right portion, and each portion may be used to display a media content captured by a different electronic device. For example, in the exemplary embodiment illustrated in FIG. 2D, the first media content 12 captured from the first point of view 11 is stitched into the left portion of the frame of the MPOV video, the second media content 22 captured from the second point of view 21 is stitched into the upper-right portion, and the third media content 32 captured from the third point of view 31 is stitched into the lower-right portion. It should be noted that a media content may be a video or a photo, and the first, second, and third media contents captured by different electronic devices are synchronized in time so that the event is played back at nearly the same moment.
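The three-portion division described above can be sketched as simple rectangle arithmetic; the `(x, y, w, h)` convention and the 16:9 frame size in the test are assumptions for illustration only:

```python
def collage_layout(width: int, height: int) -> dict:
    """Split a frame into the three regions described in the text:
    a left half plus upper-right and lower-right quarters.
    Each region is an (x, y, w, h) rectangle in pixel coordinates."""
    left = (0, 0, width // 2, height)
    upper_right = (width // 2, 0, width - width // 2, height // 2)
    lower_right = (width // 2, height // 2,
                   width - width // 2, height - height // 2)
    return {"left": left, "upper_right": upper_right, "lower_right": lower_right}
```

The three regions tile the frame exactly, so a compositor could copy each synchronized media content into its assigned rectangle per output frame.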

The exemplary embodiment above is for illustrative purposes only and is not intended to limit the position or style of the collage view of the MPOV video. In another exemplary embodiment, the media contents of different points of view may be displayed equally within a frame of the MPOV video, or in any other partition. The display position of each media content may be placed arbitrarily within the collage view of the event.

FIG. 3 is a block diagram illustrating the hardware of an electronic device in functional blocks according to one of the exemplary embodiments of the present disclosure. The exemplary electronic device 100 may be a smartphone, a mobile phone, a digital camera, a tablet, a wearable device, and so forth. The exemplary electronic device 100 includes at least, but not limited to, a processor 110, a display screen 130, a storage medium 150, an image capture component 170, and a transceiver 190. Each component of the exemplary electronic device 100 is explained in detail below.

The processor 110 may be, but is not limited to, a central processing unit (CPU) for general or special purposes, a programmable microprocessor, a digital signal processor (DSP), a programmable controller, an application specific integrated circuit (ASIC), a programmable logic device (PLD), another similar device, or a combination thereof. In the present embodiment, the processor 110 is electrically coupled to the display screen 130, the storage medium 150, the image capture component 170, and the transceiver 190, respectively, and controls all operations of the exemplary electronic device 100.

The display screen 130 may be a display device that provides a display function within the display area of the electronic device 100. The display device may be, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, a field emission display (FED), and so forth.

The storage medium 150 may be volatile or non-volatile memory for storing buffered or permanent data, such as the media contents captured by the image capture component 170 or instructions for executing the functions of the exemplary electronic device 100.

The image capture component 170 may be, but is not limited to, a camera or a video camera that captures scenes as media contents (e.g., photos, videos, and so forth) through its optical and imaging elements. In the present disclosure, media contents representing scenes of a related event may be captured by the image capture component 170 and stored in the storage medium 150.

The transceiver 190 may be a component, such as a protocol unit, that supports signal transmission for the global system for mobile communication (GSM), the personal handy-phone system (PHS), code division multiple access (CDMA) systems, wideband code division multiple access (WCDMA) systems, long term evolution (LTE) systems, worldwide interoperability for microwave access (WiMAX) systems, wireless fidelity (Wi-Fi) systems, or Bluetooth, as well as the components supporting them. The transceiver 190 provides wireless transmission for the electronic device 100 and includes a plurality of elements, but not limited to, a transmitter circuit, a receiver circuit, an analog-to-digital (A/D) converter, a digital-to-analog (D/A) converter, a low noise amplifier (LNA), mixers, filters, a matching network, transmission lines, a power amplifier (PA), and one or more antenna units. The transmitter and the receiver transmit downlink signals and receive uplink signals wirelessly. The receiver may include functional elements that perform operations such as low noise amplification, impedance matching, mixing, up-conversion, filtering, and power amplification. The A/D and D/A converters are configured to convert between the analog signal format and the digital signal format during uplink and downlink signal processing. In an exemplary embodiment of the present disclosure, the transceiver 190 may be used to transmit media contents to, or receive media contents from, other electronic devices.

Hereinafter, generation of an MPOV video from the related media contents illustrated in FIG. 1 and FIG. 2A through FIG. 2D is explained in detail with respect to the exemplary electronic device 100 illustrated in FIG. 3.

Referring to FIG. 3, scenes of an event may be captured through the image capture component 170 of the exemplary electronic device 100, and the processor 110 then stores the scenes of the event as a plurality of media contents (e.g., images, continuous images, audio recordings, and so forth) in the storage medium 150 of the exemplary electronic device 100. It should be noted that continuous images may refer to multiple image frames of a video or multiple images captured in a burst mode.

The exemplary electronic device 100 may use the processor 110 to identify the relevance of the media contents and then generate the MPOV video based on the media contents related to the event. In one of the exemplary embodiments of the present disclosure, a first media content and a second media content among the media contents are used as an example for illustration. The processor 110 of the electronic device 100 identifies the first media content and the second media content as being related to the same event based on time information and/or location information. In detail, the processor 110 extracts the time information and/or the location information from metadata respectively embedded in, or associated with, the first media content and the second media content, to determine whether the first media content is related to the second media content in time and/or location. It should be noted that the present embodiment does not limit the source of the media contents. That is, the media contents may include media contents captured by the exemplary electronic device 100, or captured and transmitted by other nearby electronic devices (e.g., any of the electronic devices 10, 20, and 30 illustrated in FIG. 1).

Furthermore, in one of the exemplary embodiments of the present disclosure, audio information of the first media content and the second media content may be used to identify whether the first media content and the second media content are related to the same event in time and/or location.

In one of the exemplary embodiments, time information (e.g., a timestamp), audio information, and location information (including a geotag and surrounding signal information) corresponding to each of the media contents may be obtained upon capture of the media contents. The time information, the audio information, and the location information are described in detail below.

The time information of a media content may include, but is not limited to, a timestamp recording the date and time at which each of the media contents was captured. The timestamp may be obtained from the system clock of the electronic device 100, where the system clock may be synchronized automatically by the Global Positioning System (GPS), a Wi-Fi access point, a radio access network, a server, and so forth. However, the exemplary embodiment is not intended to limit the present disclosure; the timestamp may also be configured by the user or through any other means.

The audio information may include, but is not limited to, information about the audio waveform of a media content, such as the shape of the waveform or a distinguishable feature having a value at a specific time. FIG. 4 is a diagram illustrating an audio waveform 410 of a media content according to one of the exemplary embodiments of the present disclosure. Referring to FIG. 4, distinguishable features 420, 430, and 440 may be obtained from the audio waveform 410. In one of the exemplary embodiments, a value 422 and a timestamp 421 corresponding to the value 422 may be obtained from the distinguishable feature 420 and then, upon capture of each of the media contents corresponding to the audio information, stored in the metadata of that media content. However, the type of audio information embedded in the metadata is not limited to the above exemplary embodiment; other information related to the audio waveform of the media content may also be utilized. In one of the exemplary embodiments, the shape of the background noise may also be utilized. Moreover, the present disclosure is not limited to the described exemplary embodiments. In one of the exemplary embodiments, the audio information may be extracted from the media contents while the relevance of the media contents is being identified. That is, instead of embedding the audio information into the metadata of the media contents, the processor 110 may analyze the media contents on the fly to extract their audio information in order to identify whether the media contents are related to the same event.
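As one hypothetical way to obtain such (timestamp, value) pairs from a sampled waveform — the peak criterion, the amplitude threshold, and the sample rate below are illustrative assumptions, not the disclosed method:

```python
def distinguishable_features(samples, sample_rate=8000, threshold=0.5):
    """Return (timestamp_seconds, value) pairs for local maxima whose
    amplitude exceeds `threshold` — a crude stand-in for the
    (timestamp 421, value 422) pair read from the waveform in FIG. 4."""
    features = []
    for i in range(1, len(samples) - 1):
        # A local maximum that stands out above the threshold.
        if samples[i] > threshold and samples[i - 1] < samples[i] >= samples[i + 1]:
            features.append((i / sample_rate, samples[i]))
    return features
```

Matching such feature lists between two recordings would then give both a relatedness signal and a time offset for synchronization.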

The location information may include, but is not limited to, a geotag and surrounding signal information. The geotag may include, but is not limited to, a GPS position, accuracy data, and so forth. The GPS position and the accuracy data may be obtained, upon capture of a media content, from a GPS chip (not shown) of the electronic device that captured the media content. The GPS position of the media content may record, but is not limited to, the longitude and latitude coordinates of the location at which the media content was captured. The accuracy data records how accurate the longitude and latitude coordinates were at the time the corresponding GPS position was obtained.

The surrounding signal information of the location information may include, but is not limited to, information about the signal strengths of nearby devices (e.g., other mobile electronic devices such as smartphones in hotspot mode, access points (APs) such as Wi-Fi routers, radio network access towers, and so forth). In other words, wireless signals (e.g., Wi-Fi, Bluetooth, or radio signals) between the electronic device 100 and nearby devices may be used to determine the relative distance between the locations at which each of the media contents was captured. In one of the exemplary embodiments, there are a plurality of wireless devices around the electronic device 100 capturing the media contents, and the signal strength of each nearby device with respect to the electronic device 100 may be analyzed and ranked to form a list ordering the nearby devices by signal strength. For example, when each of the media contents is captured, the list ranking the signal strengths of the nearby devices may be embedded into the metadata of that media content.

FIG. 5A and FIG. 5B are diagrams illustrating the concept of building a list of nearby devices ranked by signal strength according to one of the embodiments of the present disclosure. Referring to FIG. 5A and FIG. 5B, assume that access points AP1 510, AP2 520, AP3 530, AP4 540, and AP5 550 are in the vicinity of the first electronic device 10 and the second electronic device 20. In the present embodiment, the first electronic device 10 and the second electronic device 20 rank the nearby devices by signal strength upon capture of their media contents. For example, a first list 560 may be generated, ranking the signal strength of each nearby device with respect to the first electronic device 10 in the order AP1 510, AP3 530, AP2 520, AP5 550, and so forth, and the first list 560 may be embedded in the metadata of the media content captured by the first electronic device 10. A second list 570 may also be generated, ranking the signal strength of each nearby device with respect to the second electronic device 20 in the order AP1 510, AP5 550, AP3 530, AP4 540, and so forth, and the second list 570 may be embedded in the metadata of the media content captured by the second electronic device 20. In the present embodiment, the basic service set identifier (BSSID) of each AP is used to identify the AP in the list of ranked signal strengths of nearby devices. However, the present disclosure is not limited thereto, since an AP may be identified by other means.
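A minimal sketch of building and comparing such ranked lists, assuming RSSI readings in dBm and a top-N overlap score as one plausible closeness measure (the overlap measure and `top_n` cutoff are assumptions; the disclosure only requires the ranked lists themselves):

```python
def ranked_list(scan: dict) -> list:
    """Sort the observed BSSIDs by signal strength, strongest first,
    as in lists 560 and 570 of FIG. 5A and FIG. 5B."""
    return sorted(scan, key=scan.get, reverse=True)

def lists_overlap(list_a: list, list_b: list, top_n: int = 3) -> float:
    """Fraction of shared entries among the strongest `top_n` of each
    list; a high overlap suggests the two captures happened nearby."""
    a, b = set(list_a[:top_n]), set(list_b[:top_n])
    return len(a & b) / top_n
```

With the figure's example orderings (AP1, AP3, AP2, AP5 versus AP1, AP5, AP3, AP4), two of the three strongest APs are shared, suggesting the two devices captured their media contents near each other.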

Moreover, in an alternative embodiment of the present disclosure, the signal strength between the first electronic device 10 and the second electronic device 20 may also be used to determine whether the first media content 12 captured by the first electronic device 10 is positionally related to the second media content 22 captured by the second electronic device 20.

In the following, identifying the relevance of the first media content and the second media content according to the time information, the audio information, and the location information is described in detail.

In this embodiment, the processor 110 identifies whether the first media content and the second media content are temporally related to the same event according to the time codes embedded in the metadata. In detail, the processor 110 obtains from the metadata the time codes indicating the timestamps at which the first media content and the second media content were respectively captured, and determines whether the first media content and the second media content are related to the same event. In one of the exemplary embodiments of the present disclosure, the processor 110 may calculate the time difference between the time codes of the first media content and the second media content, and determine whether the time difference between the first media content and the second media content is within a predetermined range. For example, the predetermined range may be 2 hours. If the time difference between the first media content and the second media content is within 2 hours, the processor 110 considers the first media content and the second media content to be temporally related to the same event.
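The time-difference test above can be sketched in a few lines. Time codes are assumed here to be numeric seconds (e.g., Unix timestamps); the 2-hour default mirrors the example in the text:

```python
def temporally_related(time_code_a, time_code_b, max_gap=2 * 60 * 60):
    """Treat two captures as temporally related to the same event when the
    difference between their time codes (in seconds) falls within the
    predetermined range, 2 hours by default."""
    return abs(time_code_a - time_code_b) <= max_gap
```

Two captures an hour apart would pass; captures three hours apart would not.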

However, the present disclosure is not limited thereto. In one of the exemplary embodiments of the present disclosure, when the time codes of the first media content and the second media content fall within a predetermined period indicating the occurrence of an event, the first media content and the second media content are identified as temporally related to each other. For example, a baseball event may take place between 4:00 PM and 9:00 PM on May 1, and the predetermined period may accordingly be configured as 4:00 PM to 9:00 PM. If the time codes indicate that the first media content and the second media content were captured between 4:00 PM and 9:00 PM, the first media content and the second media content are identified as temporally related to the baseball event. It should be noted that the predetermined period may be determined automatically from the media content or configured by the user. For example, there may be stretches before and after the baseball event during which no media content was captured. The processor 110 may automatically detect the time interval between these stretches of no capture as the predetermined period, and group the media content captured during the predetermined period into a media collection. It should be noted that the predetermined period conveniently serves as one of the parameters of a set of relevance criteria used to identify whether any media content is temporally related to the event.
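One way to realize the automatic grouping described above is gap-based clustering of capture times: a long stretch with no captures ends one event period and starts the next. The 30-minute gap threshold below is an illustrative assumption, not a value from the disclosure:

```python
def group_into_collections(time_codes, min_gap=30 * 60):
    """Split capture time codes (seconds) into media collections: any gap of
    at least min_gap seconds with no captures starts a new collection,
    i.e., a new candidate event period."""
    collections = []
    for tc in sorted(time_codes):
        if collections and tc - collections[-1][-1] < min_gap:
            collections[-1].append(tc)  # still within the same event period
        else:
            collections.append([tc])    # gap detected: start a new collection
    return collections
```

Captures at 0 s and 600 s would form one collection, and captures at 10,000 s and 10,300 s a second one, since the intervening gap exceeds 30 minutes.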

In the present disclosure, the processor 110 also identifies whether the first media content and the second media content are positionally related to the same event according to the audio waveform 410, the geotag, or the surrounding-signal information. A detailed description of identifying the positional relevance of media content follows.

To determine whether media content is positionally related to an event, the processor 110 may utilize the distinguishable features 420, 430, 440 of the audio waveforms 410 of the first media content and the second media content, which are illustrated in FIG. 4. For example, in a batting event, the distinguishable feature 420 may be the noise generated when the baseball contacts the bat (i.e., the hitting noise), and the distinguishable features 430, 440 may be cheering noise generated by the crowd. The processor 110 considers first media content and second media content that contain the hitting noise and the cheering noise to be positionally related to the same event.

In one of the exemplary embodiments of the present disclosure, the first media content and the second media content may be identified as positionally related to the same event according to geotags. The processor 110 may obtain geotags, such as GPS locations and accuracy data, from the metadata of the first media content and the second media content. The processor 110 identifies the first media content and the second media content as positionally related to the same event by utilizing their GPS locations. For example, the processor 110 determines whether the difference between the GPS locations at which the first media content and the second media content were captured is within a predetermined distance. If the difference is within the predetermined distance, the first media content and the second media content are considered positionally related to the same event. The predetermined distance may be configured according to the actual application; the present disclosure does not intend to limit the range of the predetermined distance, which may, for example, be configured as any value within 500 meters for a baseball field or any value within 5 kilometers for a racetrack.
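The GPS-difference test above can be sketched with the standard haversine great-circle formula; the disclosure does not specify a distance formula, so this choice, and the 500 m default threshold, are assumptions for illustration:

```python
import math

def gps_distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in meters (haversine)."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def positionally_related(fix_a, fix_b, max_distance_m=500.0):
    """Two captures are positionally related when their (lat, lon) fixes are
    within the predetermined distance (e.g., 500 m for a baseball field)."""
    return gps_distance_m(*fix_a, *fix_b) <= max_distance_m
```

Two fixes roughly 80 m apart would be related under the 500 m setting, while fixes several kilometers apart would not.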

In one of the exemplary embodiments of the present disclosure, the processor 110 may identify the first media content and the second media content as related to the same event according to the GPS locations of the first media content and the second media content and a predetermined geographic coordinate of the related event. The predetermined geographic coordinate may be a GPS coordinate representing where the event took place, and it may be obtained automatically from the GPS locations of media content already identified as related to the event. The present disclosure does not intend to limit how the predetermined geographic coordinate of the related event is obtained; the GPS coordinate representing where the event took place may also be configured manually by the user. For example, the user may manually input the GPS coordinate of a specific location (e.g., a baseball field) as the predetermined geographic coordinate. In the exemplary embodiment, the processor 110 determines whether the GPS locations of the first media content and the second media content are within the predetermined distance of the predetermined geographic coordinate of the related event. In the described embodiment, the predetermined geographic coordinate may be one of the parameters of the set of relevance criteria.

In addition, the processor 110 analyzes the accuracy data of the geotag to determine whether the accuracy of the GPS location is within a predetermined range. In other words, the processor 110 determines whether the GPS locations obtained when capturing the first media content and the second media content are trustworthy. In the present exemplary embodiment, the predetermined range may be configured as, but is not limited to, any value within 100 meters. That is, when the accuracy data of the geotag is within the predetermined range, the processor 110 uses the GPS location to identify whether the first media content and the second media content are positionally related to the same event. On the other hand, if the accuracy data indicates that the GPS location is not within the predetermined range, the processor 110 disregards the GPS location of the geotag when identifying whether the first media content and the second media content are related to the event.
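The accuracy gate can be expressed as a small predicate; treating a missing accuracy value as untrustworthy is an assumption added for the sketch:

```python
def geotag_usable(accuracy_m, max_accuracy_m=100.0):
    """Trust a GPS fix only when its reported accuracy is within the
    predetermined range (100 m here); otherwise the geotag is ignored and
    positional relevance is decided by the other cues."""
    return accuracy_m is not None and accuracy_m <= max_accuracy_m
```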

In one of the exemplary embodiments, the processor 110 identifies whether the first media content and the second media content are related to the same event according to the surrounding-signal information. In detail, the processor 110 obtains, from the metadata of the first media content and the second media content, the lists 560, 570 that rank the nearby devices by signal strength. In the embodiment illustrated in FIG. 5A and FIG. 5B, the lists 560, 570 contain the nearby devices sorted from high to low according to their signal strength relative to the electronic device. The exemplary embodiment includes the first electronic device 10 and the second electronic device 20, which capture an event of interest from different points of view as a plurality of media contents.

For example, the first electronic device 10 captures the first media content, and the second electronic device 20 captures the second media content. Nearby, there are AP1 510, AP2 520, AP3 530, AP4 540, and AP5 550. The processor 110 may determine whether the first media content and the second media content are positionally related to the same event according to the ranking of the APs listed in the lists 560 and 570. The list 560 ranks the APs around the first electronic device 10 after the first media content is captured, based on the signal strength between each AP and the first electronic device 10, and the list 570 ranks the APs around the second electronic device 20 after the second media content is captured, based on the signal strength between each AP and the second electronic device 20. In the exemplary embodiment, the processor 110 compares the rankings of AP1 510, AP2 520, AP3 530, AP4 540, and AP5 550 between the lists 560 and 570.

For example, in one of the exemplary embodiments, at least three of the APs listed in the lists 560 and 570 must be the same, and at least two of those APs must rank in the top three of the lists, for the processor 110 to consider the first media content and the second media content positionally related to the same event according to the signal strength of the nearby devices. However, the present disclosure does not limit the requirements for identifying the relevance of the first media content and the second media content based on the surrounding-signal information; the requirements may be designed according to the actual application. For example, in other exemplary embodiments, the requirement for identifying the first media content and the second media content as related to the same event may be two matching APs in the lists, with either of the two matching APs ranked in the top three of the lists.
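One reading of the first criterion above can be sketched as follows. The text leaves open whether the top-three requirement applies to one list or both; this sketch assumes it applies to both, which is an interpretation rather than something the disclosure states:

```python
def signal_lists_match(ranked_a, ranked_b, min_shared=3, min_top=2, top_n=3):
    """The two ranked BSSID lists must share at least min_shared APs, and at
    least min_top of the shared APs must rank in the top_n of both lists."""
    shared = set(ranked_a) & set(ranked_b)
    if len(shared) < min_shared:
        return False
    top_shared = shared & set(ranked_a[:top_n]) & set(ranked_b[:top_n])
    return len(top_shared) >= min_top
```

Applied to the lists of FIG. 5A and FIG. 5B, the shared APs are AP1, AP3, and AP5, of which AP1 and AP3 appear in the top three of both lists, so the criterion is satisfied.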

In one of the exemplary embodiments of the present disclosure, the relevance of media content may also be identified according to Wi-Fi Direct signals when no AP is nearby. For example, the exemplary electronic device 100 may be capable of communicating with other nearby electronic devices through Wi-Fi Direct in hotspot mode. The Wi-Fi Direct signal strength between the exemplary electronic device 100 and the other nearby electronic devices may be detected and stored in the metadata of each of the media contents. Then, when identifying whether the first media content and the second media content are positionally related to the same event, the processor 110 of the exemplary electronic device 100 may utilize the Wi-Fi Direct signal strengths of the other nearby electronic devices to determine whether media content captured by different electronic devices is mutually related.

Based on the above, the present disclosure identifies the first media content and the second media content as temporally and positionally related to the same event according to the time codes, the audio waveforms, the geotags, and the signal strengths of nearby devices, with the details of each step described above. In one of the embodiments of the present disclosure, the processor 110 may first identify whether the first media content and the second media content are related to the same event according to the time codes. If the first media content and the second media content are not temporally related, the processor 110 considers them not positionally related to the same event either. If the first media content and the second media content are determined to be temporally related, the processor 110 then determines whether they are positionally related to the same event in the sequence of the audio waveforms, the geotags, and the signal strengths of nearby devices. However, the present disclosure is not limited thereto. The sequence for identifying whether the first media content and the second media content are temporally and positionally related to the same event may be modified to meet the design requirements of the actual application.
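The ordering described above — a temporal gate followed by a sequence of positional cues — can be sketched generically. Passing the positional checks as callables, and accepting any single passing cue as sufficient, are simplifying assumptions of this sketch:

```python
def related_to_same_event(meta_a, meta_b, positional_checks, max_gap=7200):
    """Temporal check gates everything; the positional checks (e.g., audio,
    geotag, signal strength) are then tried in sequence, and any one of
    them passing suffices."""
    if abs(meta_a["time_code"] - meta_b["time_code"]) > max_gap:
        return False  # not temporally related, so not positionally related either
    return any(check(meta_a, meta_b) for check in positional_checks)
```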

Once the first media content and the second media content are identified as related to the same event, the processor 110 then provides the first media content and the second media content as candidate media content for generating the MPOV video.

After identifying the relevance of the first media content and the second media content, the exemplary electronic device 100 further synchronizes the first media content and the second media content according to the time codes and the audio information. In detail, the processor 110 aligns the first media content and the second media content according to the timestamp of the first media content and the timestamp of the second media content. For example, the first media content and the second media content may be aligned according to the start time and end time of each of the first media content and the second media content.

Furthermore, the processor 110 may also align the first media content and the second media content according to the audio waveforms. As described above, information about the audio waveform of each media content, such as the values of the distinguishable features of the audio waveform and the timestamps corresponding to those values, may be extracted from the media content. In the batting-event example above, the first media content and the second media content may be aligned based on the hitting noise generated when the bat contacts the ball. For example, the processor 110 identifies the values representing the hitting noise in the first media content and the second media content (e.g., the value 422 illustrated in FIG. 4), and then aligns the first media content and the second media content on a unified timeline based on the timestamps corresponding to those values (e.g., the timestamp 421 illustrated in FIG. 4). However, the present disclosure is not limited thereto. The first media content and the second media content may be aligned according to other characteristics of the audio waveform (e.g., the shape of a distinguishable feature).
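Once the shared feature has been located in each recording, the alignment itself reduces to a single subtraction. In this sketch, the feature offsets are seconds from each content's own start, and the unified timeline is (by assumption) anchored at content A's start:

```python
def unified_start_times(feature_offset_a, feature_offset_b):
    """Anchor content A at time 0 on the unified timeline and shift content B
    so that the shared audio feature (e.g., the hitting noise) coincides in
    both recordings."""
    return 0.0, feature_offset_a - feature_offset_b
```

If the hitting noise occurs 10 s into content A and 4 s into content B, content B must start 6 s after content A on the unified timeline.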

Once the first media content and the second media content are temporally synchronized, the processor 110 identifies the overlapping portion in which the first media content and the second media content overlap in time. In one of the exemplary embodiments, the overlapping portion may be identified according to the start times and end times of the first media content and the second media content.
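Identifying the overlapping portion from start and end times is interval intersection; a minimal sketch on the unified timeline:

```python
def overlapping_portion(span_a, span_b):
    """Intersect two (start, end) spans on the unified timeline; returns
    None when the contents do not overlap in time."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    return (start, end) if start < end else None
```

For example, contents spanning (0, 10) and (6, 20) overlap on (6, 10), matching the T1-to-T2 overlap of FIG. 6.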

Thereafter, the processor 110 further identifies a highlight period associated with the overlapping portion according to the content characteristics of the first media content or the second media content. The content characteristics may refer to the media type and the shot type of the media content, where the media type of the media content may be a still image or a continuous image, and the shot type of the media content may be a close-up shot, a medium shot, a zoom-in shot, or a zoom-out shot. The media type and shot type of the media content may be used to automatically determine the highlights of the event.

In one of the exemplary embodiments, face detection may be implemented to determine the ratio between the portion of a frame occupied by an object and the remaining portion of the frame not occupied by the object. If the ratio of the frame occupied by the object exceeds a predetermined ratio, the processor 110 identifies the shot type of the frame of the second media content as a close-up shot. For example, the predetermined ratio may be any value, such as 60% or more. When the object occupies at least 60% of the frame, the processor 110 identifies the frame with the object occupying at least 60% of it as a close-up shot. In addition, the processor 110 identifies a medium shot (e.g., an empty shot) for frames in which the object occupies less than 60% of the area. In the case of continuous images, the above technique may be used to determine whether the shot type of a video is a zoom-in shot or a zoom-out shot by analyzing the area occupied by the object in each frame of the video.
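The per-frame classification reduces to a ratio threshold; the object area would come from a face detector in practice, which this sketch leaves out and takes as a given input:

```python
def classify_shot(object_area, frame_area, close_up_ratio=0.6):
    """Label a frame a close-up when the detected object (e.g., a face)
    occupies at least the predetermined ratio of the frame area, and a
    medium shot otherwise."""
    return "close-up" if object_area / frame_area >= close_up_ratio else "medium"
```

For a continuous image, applying this per frame and watching the ratio rise or fall over time distinguishes zoom-in from zoom-out footage.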

In one of the exemplary embodiments, the media type of the media content is used to identify the highlight period associated with the overlapping portion. For example, a scene of the event captured as a still image associated with the overlapping portion is regarded as a highlight of the event. In one of the embodiments of the present disclosure, the shot type of the media content may be used to identify the highlight period associated with the overlapping portion. The highlight period may be identified according to media content captured as a close-up shot or a zoom-in shot, because users tend to take close-up or zoom-in shots at moments regarded as highlights of an event.

In the following, FIG. 6 through FIG. 8 present specific exemplary embodiments that elaborate the details of identifying the highlight period within the overlapping portion of the first media content and the second media content.

FIG. 6 is a diagram illustrating the synchronization of the first media content and the second media content on a unified timeline according to one of the exemplary embodiments of the present disclosure. When the first media content 610 and the second media content 620 are temporally synchronized, the processor 110 further identifies the overlapping portion 640, in which the first media content 610 and the second media content 620 overlap in time between a first timestamp T1 and a second timestamp T2. In other words, the overlapping portion of the first media content 610 and the second media content 620 represents the same real-life event captured from different points of view at substantially the same time. In this embodiment, the overlapping portion of the first media content 610 and the second media content 620 may be identified according to the start time of the second media content 620 and the end time of the first media content 610.

Referring to FIG. 6, the first media content 610 and the second media content 620 are continuous images having an overlapping portion 640 between the first time T1 and the second time T2. A continuous image contains a plurality of frames. The processor 110 analyzes the first media content 610 and the second media content 620 within the overlapping portion 640 to determine their media types and shot types, so as to identify the highlight period associated with the overlapping portion 640. In detail, the processor 110 determines that the first media content 610 and the second media content 620 are continuous images. Furthermore, the processor 110 identifies the second media content 620 as a close-up shot of an object, because the area occupied by the object captured in the second media content 620 (e.g., a batter) is greater than the predetermined ratio (e.g., 60%) of the frames of the second media content 620 associated with the overlapping portion 640. The area occupied by the object is similar throughout every frame of the second media content 620 associated with the overlapping portion 640; therefore, the shot type of the second media content 620 is determined to be a close-up shot.

In the exemplary embodiment illustrated in FIG. 6, the processor 110 identifies the highlight period 650 within the overlapping portion 640 because the shot type of the second media content 620 is identified as a close-up shot. That is, the area occupied by the object within the frames of the second media content 620 associated with the overlapping portion is greater than the predetermined ratio. It should be noted that the overlapping portion 640 may contain continuous footage of the event of considerable length (e.g., 5 minutes). The processor 110 may define the highlight period 650 by a predetermined duration (e.g., any duration of less than 5 minutes). In the present exemplary embodiment, the highlight period 650 may be defined between a third time T3 and a fourth time T4. The highlight period 650 defined by the third time T3 and the fourth time T4 may be selected arbitrarily within the overlapping portion 640 and may have a predetermined duration. The present disclosure does not intend to limit the duration and selection of the highlight period 650 within the overlapping portion 640 of media content having a close-up shot type.

FIG. 7 is a diagram illustrating the synchronization of the first media content 710 and the second media content 720 on a unified timeline according to one of the exemplary embodiments of the present disclosure. In the exemplary embodiment, the second media content 720 is a continuous image captured with zoom-in and zoom-out shot types, in which the area occupied by the object in the frames of the second media content 720 increases and decreases. As described above, users tend to zoom in on a subject at moments regarded as highlights of an event. Therefore, the processor 110 analyzes the second media content 720 and identifies the frame 721 of the second media content 720 within the overlapping portion 740 as a highlight. Then, the processor 110 identifies, as the highlight period 750, a period of predetermined duration defined by the third time T3 and the fourth time T4 and centered on the timestamp of the frame 721 of the second media content, because the area occupied by the object in the frame 721 of the second media content 720 exceeds the predetermined ratio (e.g., 60%). Accordingly, the processor 110 selects the first media content 710 and the second media content 720 within the highlight period 750 as candidate sources for generating the MPOV video.

FIG. 8 is a diagram illustrating the synchronization of the first media content 810 and the second media content 820 when the first media content 810 is a still image, according to one of the exemplary embodiments of the present disclosure. In the exemplary embodiment, the first media content 810 is a still image, and the second media content 820 is a continuous image (e.g., a video or a burst of images). The processor 110 identifies the overlapping portion, in which the first media content 810 overlaps the second media content 820 at a fifth time T5 defined by the timestamp of the first media content 810. The processor 110 identifies the highlight period 850, centered on the timestamp of the first media content 810 between the third time T3 and the fourth time T4, because the media type of the first media content 810 is a still image. In other words, the processor 110 may select a time interval of predetermined duration before and after the fifth time T5 as the highlight period 850. However, the present disclosure is not limited thereto; the highlight period 850 may be selected by other means. Accordingly, the processor 110 selects the first media content 810 and the second media content 820 within the highlight period 850 as candidate sources for generating the MPOV video.
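Both embodiments above center a highlight period of predetermined duration on a key timestamp: the zoom-in frame of FIG. 7 or the still image's capture time of FIG. 8. A sketch follows; clamping the period to stay inside the overlapping portion, and the 30-second default, are assumptions this example adds:

```python
def highlight_period(center, overlap, duration=30.0):
    """Center a highlight period of the given duration on a key timestamp
    (seconds on the unified timeline), shifted as needed so the whole
    period stays within the (start, end) overlapping portion."""
    start = max(overlap[0], center - duration / 2.0)
    end = min(overlap[1], start + duration)
    start = max(overlap[0], end - duration)  # re-clamp if end was cut short
    return start, end
```

A key timestamp at 100 s inside a (0, 300) overlap yields the period (85, 115); one at 5 s is shifted to (0, 30) so that it stays inside the overlap.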

FIG. 9 is a diagram illustrating the synchronization of the first media content 910, the second media content 920, and the third media content 930 according to one of the exemplary embodiments of the present disclosure. In the exemplary embodiment, the second media content 920, a continuous image, and the third media content 930, a still image, are captured by the same electronic device during the same event period. The third media content 930 may be obtained while the second media content 920 is being captured. The processor 110 identifies the overlapping portion 940, in which the first media content 910 overlaps the second media content 920 and the third media content 930 between the first time T1 and the second time T2. Then, the processor 110 identifies the highlight period 950, defined between the third time T3 and the fourth time T4 and centered on the timestamp of the third media content 930. The determination of the time interval of the highlight period 950 is similar to the exemplary embodiment illustrated in FIG. 8 and is therefore omitted here. Accordingly, the processor 110 selects the first media content 910, the second media content 920, and the third media content 930 within the highlight period 950 as candidate sources for generating the MPOV video.

FIG. 10A and FIG. 10B are flowcharts illustrating a method of generating MPOV video according to one of the embodiments of the present disclosure. Referring to FIG. 10A, in step S1010, the exemplary electronic device 100 obtains a plurality of media contents. In step S1020, the exemplary electronic device 100 identifies, from the plurality of media contents and based on the metadata corresponding to each of the media contents, the first media content and the second media content as related media content related to the same event, where the metadata includes at least time information or location information. In step S1030, the exemplary electronic device 100 generates a multiple point of view (MPOV) video according to the related media content.

Details of the identification of the first media content and the second media content in step S1020 of FIG. 10A are described below with reference to FIG. 10B. Referring to FIG. 10B, in step S1011, the exemplary electronic device 100 would identify the first media content and the second media content as related media contents temporally related to the same event by comparing the time code of the first media content with the time code of the second media content. In step S1012, the exemplary electronic device 100 may identify the first media content and the second media content as related media contents positionally related to the same event by comparing the audio information of the first media content and the second media content. In step S1013, the exemplary electronic device 100 may determine whether the accuracy data is within a predetermined accuracy range and, when the accuracy data is within the predetermined accuracy range, identify the first media content and the second media content as related media contents positionally related to the same event by determining whether the difference between the GPS location of the first media content and the GPS location of the second media content is within a predetermined distance. In step S1014, the exemplary electronic device 100 may identify the first media content and the second media content as related media contents positionally related to the same event by comparing the order of the nearby devices between the list of the first media content and the list of the second media content.
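As an illustration only, the identification steps above (time-code comparison, GPS comparison gated by the accuracy data, and nearby-device list comparison) might be sketched as follows; all thresholds, list lengths, and function names are assumptions, since the disclosure leaves the predetermined ranges unspecified:

```python
import math

def temporally_related(tc1: float, tc2: float, max_gap: float = 60.0) -> bool:
    """S1011 sketch: time codes within a tolerance are treated as the same event."""
    return abs(tc1 - tc2) <= max_gap

def positionally_related_gps(lat1, lon1, acc1, lat2, lon2, acc2,
                             max_accuracy=50.0, max_distance=100.0) -> bool:
    """S1013 sketch: only trust geotags whose accuracy data is within the
    predetermined range, then require the two GPS locations to differ by
    less than a predetermined distance (in meters)."""
    if acc1 > max_accuracy or acc2 > max_accuracy:
        return False
    # equirectangular approximation, adequate at event scale
    r = 6371000.0
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return r * math.hypot(x, y) <= max_distance

def positionally_related_nearby(list1, list2, min_common: int = 2) -> bool:
    """S1014 sketch: compare lists of nearby devices ordered by signal
    strength; sharing several of the strongest entries suggests that the
    two contents were captured at the same location."""
    top1, top2 = set(list1[:5]), set(list2[:5])
    return len(top1 & top2) >= min_common
```

A real implementation would likely combine these checks, e.g. requiring the temporal check plus at least one positional check before marking two contents as related.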

FIG. 11 is a flowchart illustrating a method of generating an MPOV video based on media contents within a highlight period according to one of the embodiments of the present disclosure. In the exemplary embodiment illustrated in FIG. 11, steps S1110 and S1120 are similar to steps S1010 and S1020 illustrated in FIG. 10A, and thus their description is omitted.

Referring to FIG. 11, in step S1130, the exemplary electronic device 100 would identify an overlapping portion in which the first media content and the second media content overlap in time. In step S1140, the exemplary electronic device 100 would identify a highlight period by analyzing content features of the first media content and the second media content associated with the overlapping portion. In step S1150, the exemplary electronic device 100 would generate the MPOV video from the first media content and the second media content within the highlight period.
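The disclosure does not specify which content features are analyzed in step S1140. As a hypothetical sketch only, assuming the features have already been reduced to one score per second (e.g., audio energy summed over the overlapping contents), the highlight period could be chosen as the window with the highest total score:

```python
def identify_highlight(feature_scores, window: int = 5):
    """S1140 sketch: slide a fixed window across per-second content-feature
    scores and return the (start, end) offsets, relative to the overlapping
    portion, of the window with the highest total score."""
    best_start, best_sum = 0, float("-inf")
    for i in range(len(feature_scores) - window + 1):
        s = sum(feature_scores[i:i + window])
        if s > best_sum:
            best_start, best_sum = i, s
    return best_start, best_start + window
```

With scores [0, 1, 5, 5, 5, 1, 0] and a 3-second window, the peak window covers offsets 2 through 5.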

Hereinafter, the exemplary electronic device 100 is further elaborated with specific exemplary embodiments with reference to the methods illustrated in FIG. 10A, FIG. 10B, and FIG. 11. According to the steps illustrated in FIG. 10A and FIG. 10B, the present embodiment of the disclosure determines whether each of the media contents is temporally and positionally related to an event, and the media contents identified as temporally and positionally related to the same event are identified as related media contents. The exemplary electronic device 100 would then provide the media contents identified as related to the same event as candidate media contents for generating the MPOV video. Furthermore, according to the steps illustrated in FIG. 11, the exemplary electronic device 100 would synchronize the related media contents to identify an overlapping portion in which the related media contents overlap in time. A highlight period would then be identified within the overlapping portion so as to generate an MPOV video containing the highlight of the event.

In one of the exemplary embodiments of the present disclosure, the exemplary electronic device 100 illustrated in FIG. 3 may be any of the electronic devices 10, 20, and 30 illustrated in FIG. 1. In other words, media contents representing the scene of an event may be captured from different points of view of the event by the image capturing component 170 of each of the electronic devices 10, 20, and 30 and stored in the storage medium 150 of each of the electronic devices 10, 20, and 30.

Referring to FIG. 1, the first electronic device 10 may be a content requester that initiates the process of generating the MPOV video and requests media contents, while the second electronic device 20 and/or the third electronic device 30 may be content providers that provide media contents to the first electronic device 10 for generating the MPOV video of the event. In the exemplary embodiment, the first electronic device 10 may identify related media contents among a plurality of media contents according to the time code, the audio information, and the location information (including the geotag and the surrounding signal information) embedded in or associated with the metadata of the media contents. In one of the exemplary embodiments of the present disclosure, the first electronic device 10 may group the related media contents into a media collection. It should be noted that grouping media contents into a media collection (e.g., a media set) may be performed automatically according to the metadata after the media contents are captured, or manually by the user, for example, by inserting media contents into or removing media contents from the collection; the disclosure is not limited thereto.

Then, the first electronic device 10 may transmit the metadata of the related media contents to the second electronic device 20 and/or the third electronic device 30 as a set of relevance criteria to request related media contents (i.e., a metadata exchange). In the present embodiment, the set of relevance criteria may include the time information, the audio information, and the location information. The second electronic device 20 would identify related media contents among the plurality of media contents captured by the second electronic device 20 according to the metadata of the related media contents transmitted from the first electronic device 10. In other words, the second electronic device 20 would identify related media contents among the media contents captured by the second electronic device 20 in response to the time information, the audio information, and the location information corresponding to the related media contents captured by the first electronic device 10. In addition, the third electronic device 30 would perform a procedure similar to that of the second electronic device 20, and thus the description of the third electronic device is omitted.
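On the content-provider side, the metadata exchange amounts to filtering the provider's own library against the received criteria. The following is a minimal sketch of that filtering, assuming (this data layout and the `max_gap` slack are illustrative, not taken from the disclosure) that each media content carries a capture span in its metadata and that the criteria include a requested time window:

```python
def select_related(library, criteria, max_gap=0.0):
    """Provider-side sketch of the metadata exchange: keep media contents
    whose capture span intersects the requested time window, optionally
    widened by a slack of max_gap seconds."""
    t0, t1 = criteria["time_window"]
    return [m for m in library
            if m["start"] <= t1 + max_gap and m["end"] >= t0 - max_gap]
```

The same interval-intersection test generalizes to the positional criteria: the provider would apply the GPS or nearby-device comparison to each surviving candidate before returning it to the requester.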

It should be noted that the disclosure is not limited to the metadata exchange described above. In one of the exemplary embodiments, the first electronic device 10 may obtain the set of relevance criteria according to the media contents within a media collection. In another exemplary embodiment, the set of relevance criteria may be configured by the user to indicate the user's interest. For example, a media collection may be created to collect media contents captured at a baseball stadium for a baseball game between 5:00 p.m. and 9:00 p.m. on May 1. Accordingly, a predetermined time period (e.g., 5:00 p.m. to 9:00 p.m. on May 1) and the specific geographic coordinates of a particular location of interest (e.g., the baseball stadium) may be determined according to the media contents within the media collection. The audio information of the set of relevance criteria may be obtained by analyzing the related media contents after the generation of the MPOV video is initiated. For example, the audio information of the media contents within the media collection may be extracted as one of the parameters of the set of relevance criteria after the generation of the MPOV video is initiated. In addition, the surrounding signal information at the particular location may be obtained from the metadata of the media contents within the media collection or configured manually by the user. The first electronic device 10 would then receive the related media contents corresponding to the set of relevance criteria from the second electronic device 20 or the third electronic device 30 as candidate (related) media contents for the MPOV video.
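Deriving the relevance criteria from a media collection, as in the baseball-game example above, can be sketched as follows; the dictionary layout and the choice of taking the location from the first geotagged item are assumptions made for illustration:

```python
def criteria_from_collection(collection):
    """Requester-side sketch: derive a set of relevance criteria from a
    media collection. The time window spans the collection's capture
    timestamps and the location is taken from any available geotag."""
    times = [m["timestamp"] for m in collection]
    geotagged = [m for m in collection if "geotag" in m]
    return {
        "time_window": (min(times), max(times)),
        "location": geotagged[0]["geotag"] if geotagged else None,
    }
```

Audio information would be added to this dictionary only after the generation of the MPOV video is initiated, as the paragraph above describes, since it requires analyzing the collected contents themselves.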

Furthermore, the first electronic device 10 would synchronize the related media contents to identify an overlapping portion in which the related media contents captured by the first electronic device 10, the second electronic device 20, and the third electronic device 30 overlap in time, and then identify a highlight period based on the content features of the media contents within the overlapping portion. The processor 110 would then generate the MPOV video by combining/mixing the related media contents within the highlight period.

In summary, the exemplary electronic device of the present disclosure identifies a first media content and a second media content temporally and positionally related to the same event according to the time information, the audio information, and the location information (including the geotag and the surrounding signal information) embedded in or associated with the metadata of each of the media contents. The first media content and the second media content are then provided as related media contents for generating the MPOV video of the event. The disclosure further synchronizes the related media contents to identify an overlapping portion in which the related media contents overlap in time. From the overlapping portion, a highlight period of the event may be identified according to the content features of the related media contents. Accordingly, the MPOV video may be generated according to the related media contents within the highlight period.

No element, act, or instruction used in the detailed description of the disclosed embodiments of the present application should be construed as absolutely critical or essential to the present disclosure unless explicitly described as such. Also, as used herein, the word "a" may include more than one item. If only one item is intended, the term "single" or similar language would be used. Furthermore, as used herein, the term "any of" preceding a listing of a plurality of items and/or a plurality of categories of items is intended to include "any of", "any combination of", "any multiple of", and/or "any combination of multiples of" the items and/or the categories of items, individually or in conjunction with other items and/or other categories of items. Further, as used herein, the term "set" is intended to include any number of items, including zero. Further, as used herein, the term "number" is intended to include any number, including zero.

Although the present invention has been disclosed above by way of the embodiments, they are not intended to limit the invention. Anyone with ordinary knowledge in the relevant technical field may make some changes and refinements without departing from the spirit and scope of the invention. Accordingly, the scope of protection of the invention shall be defined by the appended claims.

S1010~S1030‧‧‧Steps

Claims (20)

1. A method of generating a multiple point of view (MPOV) video, adapted to an electronic device, the method comprising: obtaining a plurality of media contents; identifying, based on each metadata corresponding to each of the media contents, a first media content and a second media content from the plurality of media contents as related media contents related to a same event, wherein the metadata comprises at least time information or location information, wherein the first media content is captured according to a first point of view and the second media content is captured according to a second point of view, the first point of view being different from the second point of view; generating the MPOV video according to the related media contents; and synchronously displaying, on a display screen according to the MPOV video, the first media content and the second media content related to the same event.

2. The method of claim 1, further comprising: identifying the first media content and the second media content as the related media contents by comparing audio information between the first media content and the second media content.

3. The method of claim 2, wherein the audio information comprises values and timestamps of distinguishable features of an audio waveform.
4. The method of claim 1, wherein identifying the related media contents further comprises: identifying the first media content and the second media content as the related media contents temporally related to the same event by comparing a time code of the first media content with a time code of the second media content.

5. The method of claim 1, wherein the location information comprises a geotag, the geotag comprises a global positioning system (GPS) location and accuracy data, and identifying the first media content and the second media content from the media contents as the related media contents related to the same event further comprises: determining whether the accuracy data is within a predetermined accuracy range; and when the accuracy data is within the predetermined accuracy range, identifying the first media content and the second media content as the related media contents positionally related to the same event by determining whether a difference between the GPS location of the first media content and the GPS location of the second media content is within a predetermined distance.

6. The method of claim 1, wherein the location information comprises surrounding signal information, the surrounding signal information comprises a list ranking signal strengths of nearby devices, and identifying the related media contents from the media contents further comprises: identifying the first media content and the second media content as the related media contents positionally related to the same event by comparing the nearby devices listed in the list of the first media content with the nearby devices listed in the list of the second media content.
一種視訊處理裝置,包括:處理器,經配置以:獲得多個媒體內容;基於對應於所述媒體內容中的每一者的每一元資料而從所述多個媒體內容識別第一媒體內容和第二媒體內容,作為與同一事件相關的相關媒體內容,其中所述元資料至少包括時間資訊或位置資訊,其中所述第一媒體內容為根據一第一視點所擷取,所述第二媒體內容為根據一第二視點所擷取,其中所述第一視點不同於所述第二視點;根據所述相關媒體內容而產生所述多視點視訊;以及根據所述多視點視訊,藉由一顯示螢幕同步顯示與所述同一 事件相關的所述第一媒體內容以及所述第二媒體內容。 A video processing device, comprising: a processor configured to: obtain a plurality of media content; identify a first media content from the plurality of media content based on each meta-data corresponding to each of the media content a second media content, as related media content related to the same event, wherein the metadata includes at least time information or location information, wherein the first media content is captured according to a first viewpoint, the second media The content is captured according to a second viewpoint, wherein the first viewpoint is different from the second viewpoint; the multi-view video is generated according to the related media content; and according to the multi-view video, by one Display screen synchronization display is the same as the above The first media content related to the event and the second media content. 如申請專利範圍第11項所述的視訊處理裝置,其中所述處理器經進一步配置以透過比較所述第一媒體內容和所述第二媒體內容之間的音頻資訊而將所述第一媒體內容和所述第二媒體內容識別為所述相關媒體內容。 The video processing device of claim 11, wherein the processor is further configured to compare the first media by comparing audio information between the first media content and the second media content The content and the second media content are identified as the related media content. 如申請專利範圍第12項所述的視訊處理裝置,其中所述音頻資訊包括音頻波形的可區別特徵的值和時戳。 The video processing device of claim 12, wherein the audio information comprises a value and a time stamp of distinguishable features of the audio waveform. 
14. The video processing apparatus of claim 11, wherein the processor is further configured to identify the first media content and the second media content as the related media contents temporally related to the same event by comparing a time code of the first media content with a time code of the second media content.

15. The video processing apparatus of claim 11, wherein the location information comprises a geotag, the geotag comprises a GPS location and accuracy data, and the processor is further configured to determine whether the accuracy data is within a predetermined accuracy range and, when the accuracy data is within the predetermined accuracy range, identify the first media content and the second media content as the related media contents positionally related to the same event by determining whether a difference between the GPS location of the first media content and the GPS location of the second media content is within a predetermined distance.

16. The video processing apparatus of claim 11, wherein the location information comprises surrounding signal information, the surrounding signal information comprises a list ranking signal strengths of nearby devices, and the processor is further configured to identify the first media content and the second media content as the related media contents positionally related to the same event by comparing the nearby devices listed in the list of the first media content with the nearby devices listed in the list of the second media content.

17. The video processing apparatus of claim 16, wherein the nearby devices comprise access points and other electronic devices.

18. The video processing apparatus of claim 11, further comprising: a transceiver configured to: transmit a set of relevance criteria to another electronic device, the set of relevance criteria being used for identifying a third media content captured by the another electronic device as one of the related media contents related to the same event as the first media content and the second media content; and receive the third media content from the another electronic device to generate the MPOV video.

19. The video processing apparatus of claim 18, wherein the set of relevance criteria is determined based on the time information, audio information, and the location information of the first media content and the second media content.

20. The video processing apparatus of claim 11, wherein the first media content and the second media content comprise still images, continuous images, and audio recordings.
TW103123659A 2013-07-10 2014-07-09 Method and electronic device for generating multiple point of view video TWI535282B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361844439P 2013-07-10 2013-07-10
US14/308,720 US10141022B2 (en) 2013-07-10 2014-06-19 Method and electronic device for generating multiple point of view video

Publications (2)

Publication Number Publication Date
TW201503676A TW201503676A (en) 2015-01-16
TWI535282B true TWI535282B (en) 2016-05-21

Family

ID=52258588

Family Applications (1)

Application Number Title Priority Date Filing Date
TW103123659A TWI535282B (en) 2013-07-10 2014-07-09 Method and electronic device for generating multiple point of view video

Country Status (2)

Country Link
CN (1) CN104284173B (en)
TW (1) TWI535282B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE541208C2 (en) * 2016-07-04 2019-04-30 Znipe Esports AB Methods and nodes for synchronized streaming of a first and a second data stream
US11120273B2 (en) * 2019-06-21 2021-09-14 Gfycat, Inc. Adaptive content classification of a video content item

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080019661A1 (en) * 2006-07-18 2008-01-24 Pere Obrador Producing output video from multiple media sources including multiple video sources
US8271506B2 (en) * 2008-03-31 2012-09-18 Yahoo! Inc. System and method for modeling relationships between entities
TWI473016B (en) * 2008-07-16 2015-02-11 Sisvel Internat S A Method and apparatus for processing a multi-view video bitstream and computer-readable medium
US9240214B2 (en) * 2008-12-04 2016-01-19 Nokia Technologies Oy Multiplexed data sharing
EP2234024B1 (en) * 2009-03-24 2012-10-03 Sony Corporation Context based video finder
US8862987B2 (en) * 2009-03-31 2014-10-14 Intel Corporation Capture and display of digital images based on related metadata
JP5387905B2 (en) * 2009-12-25 2014-01-15 ソニー株式会社 Image processing apparatus and method, and program
TWI459796B (en) * 2009-12-29 2014-11-01 Ind Tech Res Inst A method and a device for generating a multi-views three-dimensional (3d) stereoscopic image
WO2012013486A1 (en) * 2010-07-27 2012-02-02 Siemens Aktiengesellschaft A method and a system for calibrating a multi-view three dimensional camera

Also Published As

Publication number Publication date
CN104284173A (en) 2015-01-14
CN104284173B (en) 2018-07-03
TW201503676A (en) 2015-01-16

Similar Documents

Publication Publication Date Title
US10720183B2 (en) Method and electronic device for generating multiple point of view video
US10468066B2 (en) Video content selection
KR101680714B1 (en) Method for providing real-time video and device thereof as well as server, terminal device, program, and recording medium
US9239849B2 (en) Mobile device access of location specific images from a remote database
US10887673B2 (en) Method and system for associating recorded videos with highlight and event tags to facilitate replay services
EP3384495B1 (en) Processing of multiple media streams
JP2018503148A (en) Method and apparatus for video playback
CN105453571A (en) Broadcasting providing apparatus, broadcasting providing system, and method of providing broadcasting thereof
CN106303198A (en) Photographing information acquisition methods and device
US10681335B2 (en) Video recording method and apparatus
WO2015154383A1 (en) Photographing method and photographing terminal
TWI535282B (en) Method and electronic device for generating multiple point of view video
CN105992192A (en) Communications apparatus and control method
US20170091205A1 (en) Methods and apparatus for information capture and presentation
KR101831663B1 (en) Display mehtod of ecotourism contents in smart device
KR20160069144A (en) Terminal device, information display system and controlling method thereof
JP2005191892A (en) Information acquisition device and multi-media information preparation system using it
CN108141705B (en) Method and apparatus for creating a personalized record of an event
CN115225944B (en) Video processing method, video processing device, electronic equipment and computer-readable storage medium
KR20200017466A (en) Apparatus and associated method for providing video items
JP2005354248A (en) Video photographing apparatus and image management method
KR20170064200A (en) Display mehtod of ecotourism contents in smart device