TW201411552A - An image enhancement apparatus - Google Patents

Publication number
TW201411552A
Authority
TW
Taiwan
Prior art keywords
audio signal
audio
images
image
determining
Application number
TW102132551A
Other languages
Chinese (zh)
Inventor
Rajeswari Kannan
Ravi Shenoy
Pushkar Prasad Patwardhan
Original Assignee
Nokia Corp
Application filed by Nokia Corp

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

A method comprising: analysing at least two images to determine at least one object mutual to the at least two images, the object having a periodicity of motion; generating an animated image based on the at least two images, wherein the at least one object is animated; determining at least one audio signal associated with the at least one object; and combining the at least one audio signal with the animated image to generate an audio enabled animated image.

Description

Image enhancement apparatus

Field of the invention

The present invention relates to providing additional functionality for images. The invention further relates to, but is not limited to, display apparatus providing additional functionality for images displayed in mobile devices.

Background of the invention

Many portable devices, for example mobile telephones, are equipped with a display, such as a glass or plastic display window, for providing information to the user. Furthermore, such display windows are now commonly used as touch-sensitive inputs. In some more advanced devices, the device is equipped with transducers suitable for generating audible feedback.

Images and animated images are well known. Animated images, or cinemagraphs (moving photos), can give the viewer the illusion of watching a video. A cinemagraph is typically a still photograph in which a minor, repeated movement occurs. Cinemagraphs are particularly useful because they can be transferred or transmitted between devices using significantly less bandwidth than conventional video.

Summary of the invention

According to a first aspect there is provided a method comprising: analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion; generating an animated image based on the at least two images, wherein the at least one object is animated; determining at least one audio signal associated with the at least one object; and combining the at least one audio signal with the animated image to generate an audio enabled animated image.
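The four steps of the first aspect can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the claimed implementation: images are modelled as nested lists of grayscale values, the common object is approximated by the pixels whose intensity varies between frames, and the names `moving_mask`, `animate`, `build` and `AudioEnabledAnimation` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

Frame = List[List[int]]  # grayscale image as rows of pixel intensities (assumption)


@dataclass
class AudioEnabledAnimation:
    frames: List[Frame]  # animated image: only the object region changes
    audio: List[float]   # audio samples associated with the object


def moving_mask(frames: Sequence[Frame], threshold: int = 10) -> List[List[bool]]:
    """Mark pixels whose intensity range across the frames exceeds the threshold."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [max(f[r][c] for f in frames) - min(f[r][c] for f in frames) > threshold
         for c in range(cols)]
        for r in range(rows)
    ]


def animate(frames: Sequence[Frame], mask: List[List[bool]]) -> List[Frame]:
    """Use the first frame as the still background; animate only masked pixels."""
    base = frames[0]
    return [
        [[f[r][c] if mask[r][c] else base[r][c] for c in range(len(base[0]))]
         for r in range(len(base))]
        for f in frames
    ]


def build(frames: Sequence[Frame], audio: Sequence[float]) -> AudioEnabledAnimation:
    """Analyse, animate, and combine with an already determined audio signal."""
    mask = moving_mask(frames)
    return AudioEnabledAnimation(frames=animate(frames, mask), audio=list(audio))
```

In this sketch the audio signal is simply passed in; how it is determined (captured and filtered, fetched from a database, or synthesised) is covered by the optional features that follow.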

Determining at least one audio signal associated with the at least one object may comprise: receiving the at least one audio signal, wherein at least a portion of the at least one audio signal was captured at substantially the same time as the at least two images; and filtering the at least one audio signal.

Receiving the at least one audio signal may comprise receiving the at least one audio signal from at least one microphone.

Filtering the at least one audio signal may comprise: determining at least one foreground sound source of the at least one audio signal; and filtering the at least one audio signal to remove the at least one foreground sound source from the at least one audio signal, generating an ambience audio signal as the at least one audio signal associated with the at least one object.

Filtering the at least one audio signal to select at least one portion of the at least one audio signal may comprise: determining at least one foreground sound source of the at least one audio signal; and filtering the at least one audio signal to extract the at least one foreground sound source from the at least one audio signal, generating at least one foreground sound source audio signal as the at least one audio signal associated with the at least one object.
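The text does not say how a foreground sound source is determined. As one possible illustration only, the sketch below uses short-time energy thresholding, a simple technique chosen here as an assumption: frames whose energy is well above the median frame energy are treated as foreground, and the rest as ambience. The function name and the `frame_len` and `ratio` parameters are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def split_foreground_ambient(samples: List[float], frame_len: int = 4,
                             ratio: float = 4.0) -> Tuple[List[float], List[float]]:
    """Return (foreground, ambience) signals, each as long as `samples`."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    floor = median(energies)  # rough estimate of the ambience energy level
    foreground: List[float] = []
    ambient: List[float] = []
    for frame, energy in zip(frames, energies):
        is_foreground = energy > ratio * floor and energy > 0.0
        silence = [0.0] * len(frame)
        # Foreground frames are removed from the ambience signal and vice versa.
        foreground.extend(frame if is_foreground else silence)
        ambient.extend(silence if is_foreground else frame)
    return foreground, ambient
```

The first returned signal corresponds to the extraction variant and the second to the removal variant described above. A practical system would more likely separate sources spatially or spectrally, for example using a microphone array.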

Determining at least one audio signal associated with the at least one object may comprise: receiving an indication of the at least one object common to the at least two images; identifying the at least one object common to the at least two images; and generating the at least one audio signal based on the identified object.

Identifying the at least one object may comprise: determining a location associated with the at least two images; identifying at least one object based on the location of the at least two images; and selecting at least one of the at least one object.

Identifying the at least one object may comprise performing pattern recognition analysis on the at least one object common to the at least two images to identify the object.

Determining at least one audio signal associated with the at least one object may comprise at least one of: receiving the at least one audio signal from an external audio database; receiving the at least one audio signal from an internal audio database; and synthesising the at least one audio signal.

The method may further comprise: generating an audio stream model to control the presentation of the audio signal; and processing the at least one audio signal using the audio stream model.

Generating the audio stream model may comprise: determining a pitch of the audio signal; determining a volume of the audio signal; determining a playback speed of the audio signal; determining a repetition period of the audio signal; determining a start of the audio signal; and determining an end of the audio signal.
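As a rough illustration of an audio stream model, the parameters listed above can be held in a small record and applied to a clip. This is a sketch under stated assumptions: the field names are hypothetical, repetition is simplified to a repeat count (`repeats`) rather than a period, pitch is left out, and the speed change uses crude nearest-neighbour resampling, which also alters pitch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioStreamModel:
    volume: float = 1.0  # linear gain
    speed: float = 1.0   # playback speed factor (2.0 plays twice as fast)
    start: int = 0       # index of the first sample to play
    end: int = -1        # one past the last sample; -1 means "to the end"
    repeats: int = 1     # number of times the processed clip is played
    # Pitch is part of the described model but is omitted from this sketch.


def apply_model(samples: List[float], model: AudioStreamModel) -> List[float]:
    """Trim, resample, scale, and repeat `samples` according to `model`."""
    end = len(samples) if model.end < 0 else model.end
    clip = samples[model.start:end]
    n_out = int(len(clip) / model.speed)
    # Nearest-neighbour resampling: pick the source sample closest below i * speed.
    resampled = [clip[min(int(i * model.speed), len(clip) - 1)] for i in range(n_out)]
    return [s * model.volume for s in resampled] * model.repeats
```

For example, `apply_model(clip, AudioStreamModel(volume=0.5, speed=2.0))` would return a clip half as long and half as loud.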

The method may further comprise time synchronising the at least one audio signal with the animated image to generate a synchronised audio enabled animated image.
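One simple way to picture time synchronisation is to fit the audio clip to the length of the animation loop so that both restart together. The sketch below is an assumption about what synchronisation could look like, not the described implementation; the function name and the `fps`, `sample_rate` and `loops` parameters are hypothetical.

```python
from typing import List


def loop_audio_to_animation(audio: List[float], n_frames: int, fps: float,
                            sample_rate: int, loops: int = 1) -> List[float]:
    """Fit `audio` to exactly `loops` passes of an animation loop."""
    # Number of audio samples that span one pass of the animation.
    loop_samples = round(n_frames / fps * sample_rate)
    if not audio or loop_samples <= 0:
        return []
    # Wrap around if the clip is shorter than one loop; cut at the loop boundary.
    one_loop = [audio[i % len(audio)] for i in range(loop_samples)]
    return one_loop * loops
```

Cutting the clip at the loop boundary can cause an audible click; a cross-fade at the seam would be a natural refinement.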

Analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further comprise: receiving the at least two images from an image source; outputting the at least two images on a display; and receiving at least one user input determining the at least one object common to the at least two images.

Analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further comprise: correlating the at least two images to determine at least one candidate object, the at least one candidate object being common to the at least two images and having a periodicity of motion; and selecting at least one of the at least one candidate object as the at least one object.
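The correlation step is not detailed at this point in the text. As a hedged illustration, periodic motion in a candidate region can be detected by autocorrelating a per-frame measurement of that region, for instance its mean intensity. The function name `dominant_period` and the `min_lag` and `min_strength` parameters are assumptions made for this sketch.

```python
from typing import List, Optional


def dominant_period(signal: List[float], min_lag: int = 2,
                    min_strength: float = 0.5) -> Optional[int]:
    """Return the lag (in frames) with the strongest normalised
    autocorrelation above `min_strength`, or None if there is none."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    energy = sum(x * x for x in centred)
    if energy == 0.0:
        return None  # the region does not change: no motion at all
    best_lag, best_score = None, min_strength
    for lag in range(min_lag, n // 2 + 1):
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Regions for which a period is found would be the candidate objects; one of them is then selected, automatically or by the user, as the object to animate.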

An apparatus may be configured to perform the method as described herein.

A computer program product comprising program instructions may cause an apparatus to perform the method as described herein.

A method may be substantially as herein described and illustrated in the accompanying drawings.

An apparatus may be substantially as herein described and illustrated in the accompanying drawings.

According to a second aspect there is provided an apparatus comprising at least one processor and at least one memory including computer program code for one or more programs, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to at least perform: analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion; generating an animated image based on the at least two images, wherein the at least one object is animated; determining at least one audio signal associated with the at least one object; and combining the at least one audio signal with the animated image to generate an audio enabled animated image.

Determining at least one audio signal associated with the at least one object may cause the apparatus to perform: receiving the at least one audio signal, wherein at least a portion of the at least one audio signal was captured at substantially the same time as the at least two images; and filtering the at least one audio signal.

Receiving the at least one audio signal may cause the apparatus to perform receiving the at least one audio signal from at least one microphone.

Filtering the at least one audio signal may cause the apparatus to perform: determining at least one foreground sound source of the at least one audio signal; and filtering the at least one audio signal to remove the at least one foreground sound source from the at least one audio signal, generating an ambience audio signal as the at least one audio signal associated with the at least one object.

Filtering the at least one audio signal to select at least one portion of the at least one audio signal may cause the apparatus to perform: determining at least one foreground sound source of the at least one audio signal; and filtering the at least one audio signal to extract the at least one foreground sound source from the at least one audio signal, generating at least one foreground sound source audio signal as the at least one audio signal associated with the at least one object.

Determining at least one audio signal associated with the at least one object may cause the apparatus to perform: receiving an indication of the at least one object common to the at least two images; identifying the at least one object common to the at least two images; and generating the at least one audio signal based on the identified object.

Identifying the at least one object may cause the apparatus to perform: determining a location associated with the at least two images; identifying at least one object based on the location of the at least two images; and selecting at least one of the at least one object.

Identifying the at least one object may cause the apparatus to perform pattern recognition analysis on the at least one object common to the at least two images to identify the object.

Determining at least one audio signal associated with the at least one object may cause the apparatus to perform at least one of: receiving the at least one audio signal from an external audio database; receiving the at least one audio signal from an internal audio database; and synthesising the at least one audio signal.

The apparatus may be further caused to perform: generating an audio stream model to control the presentation of the audio signal; and processing the at least one audio signal using the audio stream model.

Generating the audio stream model may cause the apparatus to perform: determining a pitch of the audio signal; determining a volume of the audio signal; determining a playback speed of the audio signal; determining a repetition period of the audio signal; determining a start of the audio signal; and determining an end of the audio signal.

The apparatus may be further caused to perform time synchronising the at least one audio signal with the animated image to generate a synchronised audio enabled animated image.

Analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further cause the apparatus to perform: receiving the at least two images from an image source; outputting the at least two images on a display; and receiving at least one user input determining the at least one object common to the at least two images.

Analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further cause the apparatus to perform: correlating the at least two images to determine at least one candidate object, the at least one candidate object being common to the at least two images and having a periodicity of motion; and selecting at least one of the at least one candidate object as the at least one object.

According to a third aspect there is provided an apparatus comprising: means for analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion; means for generating an animated image based on the at least two images, wherein the at least one object is animated; means for determining at least one audio signal associated with the at least one object; and means for combining the at least one audio signal with the animated image to generate an audio enabled animated image.

The means for determining at least one audio signal associated with the at least one object may comprise: means for receiving the at least one audio signal, wherein at least a portion of the at least one audio signal was captured at substantially the same time as the at least two images; and means for filtering the at least one audio signal.

The means for receiving the at least one audio signal may comprise means for receiving the at least one audio signal from at least one microphone.

The means for filtering the at least one audio signal may comprise: means for determining at least one foreground sound source of the at least one audio signal; and means for filtering the at least one audio signal to remove the at least one foreground sound source from the at least one audio signal, generating an ambience audio signal as the at least one audio signal associated with the at least one object.

The means for filtering the at least one audio signal to select at least one portion of the at least one audio signal may comprise: means for determining at least one foreground sound source of the at least one audio signal; and means for filtering the at least one audio signal to extract the at least one foreground sound source from the at least one audio signal, generating at least one foreground sound source audio signal as the at least one audio signal associated with the at least one object.

The means for determining at least one audio signal associated with the at least one object may comprise: means for receiving an indication of the at least one object common to the at least two images; means for identifying the at least one object common to the at least two images; and means for generating the at least one audio signal based on the identified object.

The means for identifying the at least one object may comprise: means for determining a location associated with the at least two images; means for identifying at least one object based on the location of the at least two images; and means for selecting at least one of the at least one object.

The means for identifying the at least one object may comprise means for performing pattern recognition analysis on the at least one object common to the at least two images to identify the object.

The means for determining at least one audio signal associated with the at least one object may comprise at least one of: means for receiving the at least one audio signal from an external audio database; means for receiving the at least one audio signal from an internal audio database; and means for synthesising the at least one audio signal.

The apparatus may further comprise: means for generating an audio stream model to control the presentation of the audio signal; and means for processing the at least one audio signal using the audio stream model.

The means for generating the audio stream model may comprise: means for determining a pitch of the audio signal; means for determining a volume of the audio signal; means for determining a playback speed of the audio signal; means for determining a repetition period of the audio signal; means for determining a start of the audio signal; and means for determining an end of the audio signal.

The apparatus may further comprise means for time synchronising the at least one audio signal with the animated image to generate a synchronised audio enabled animated image.

The means for analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further comprise: means for receiving the at least two images from an image source; means for outputting the at least two images on a display; and means for receiving at least one user input determining the at least one object common to the at least two images.

The means for analysing at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion, may further comprise: means for correlating the at least two images to determine at least one candidate object, the at least one candidate object being common to the at least two images and having a periodicity of motion; and means for selecting at least one of the at least one candidate object as the at least one object.

According to a fourth aspect there is provided an apparatus comprising: an analyser configured to analyse at least two images to determine at least one object common to the at least two images, the object having a periodicity of motion; a generator configured to generate an animated image based on the at least two images, wherein the at least one object is animated; an audio clip generator configured to determine at least one audio signal associated with the at least one object; and a combiner configured to combine the at least one audio signal with the animated image to generate an audio enabled animated image.

The audio clip generator may comprise: an input configured to receive the at least one audio signal, wherein at least a portion of the at least one audio signal was captured at substantially the same time as the at least two images; and a filter configured to filter the at least one audio signal.

The input may comprise a microphone input configured to receive the at least one audio signal from at least one microphone.

The filter may comprise an intrinsic audio signal analyser configured to determine at least one foreground sound source of the at least one audio signal, and further configured to filter the at least one audio signal to remove the at least one foreground sound source from the at least one audio signal, generating an ambience audio signal as the at least one audio signal associated with the at least one object.

The filter may comprise an intrinsic audio signal analyser configured to determine at least one foreground sound source of the at least one audio signal, and further configured to filter the at least one audio signal to extract the at least one foreground sound source from the at least one audio signal, generating at least one foreground sound source audio signal as the at least one audio signal associated with the at least one object.

The audio clip generator may comprise a synthetic audio generator configured to receive an indication of the at least one object common to the at least two images; configured to identify the at least one object common to the at least two images; and further configured to generate the at least one audio signal based on the identified object.

The synthetic audio generator configured to identify the at least one object may comprise: a location estimator configured to determine a location associated with the at least two images; an object identifier configured to identify at least one object based on the location of the at least two images; and an object selector configured to select at least one of the at least one object.

The synthetic audio generator configured to identify the at least one object may comprise a pattern recognition analyser configured to perform pattern recognition on the at least one object common to the at least two images to identify the object.

The synthetic audio generator may comprise at least one of: an input configured to receive the at least one audio signal from an external audio database; an input configured to receive the at least one audio signal from an internal audio database; an input configured to receive the at least one audio signal from a memory; and a synthesiser configured to synthesise the at least one audio signal.

The apparatus may further comprise an audio clip embedder configured to generate an audio stream model to control the presentation of the audio signal, and configured to process the at least one audio signal using the audio stream model.

The audio clip embedder configured to generate the audio stream model may be configured to: determine a pitch of the audio signal; determine a volume of the audio signal; determine a playback speed of the audio signal; determine a repetition period of the audio signal; determine a start of the audio signal; and determine an end of the audio signal.

The apparatus may further comprise a synchroniser configured to time synchronise the at least one audio signal with the animated image to generate a synchronised audio enabled animated image.

The analyser may comprise: an input configured to receive the at least two images from an image source; an input configured to output the at least two images on a display; and a user interface input configured to receive at least one user input determining the at least one object common to the at least two images.

The analyser may comprise: a correlator configured to correlate the at least two images to determine at least one candidate object, the at least one candidate object being common to the at least two images and having a periodicity of motion; and a selector configured to select at least one of the at least one candidate object as the at least one object.

A computer program product stored on a medium may cause an apparatus to perform the method as described herein.

An electronic device may comprise apparatus as described herein.

A chipset may comprise apparatus as described herein.

10‧‧‧Electronic device
11‧‧‧Touch input module
12‧‧‧Display
13‧‧‧Transceiver (TX/RX)
15‧‧‧Processor
16‧‧‧Memory
17‧‧‧Program code section
18‧‧‧Stored data section
101‧‧‧Camera
103‧‧‧Video/image analyser
105‧‧‧Cinemagraph generator
107‧‧‧Audio clip generator
109‧‧‧Mixer and synchroniser
171‧‧‧User interface input controller
173‧‧‧Region of interest selector
181~187‧‧‧Blocks
201‧‧‧Synthetic audio generator
203‧‧‧Intrinsic audio signal analyser
205‧‧‧Audio clip embedder
151‧‧‧Audio clip database
171‧‧‧Microphone/microphone array
301‧‧‧Feature/sub-region correlator
303‧‧‧Clip requester
305‧‧‧Clip returner/clip selector
401‧‧‧Ambience detector
501‧‧‧Audio stream model generator
503‧‧‧Audio stream processor
601~609‧‧‧Blocks
701~709‧‧‧Blocks
801~805‧‧‧Blocks
901~907‧‧‧Blocks
1001~1007‧‧‧Blocks
1301‧‧‧A video image
1303‧‧‧Region 1
1305‧‧‧Region 2
1313‧‧‧Selected region of interest R1
1315‧‧‧Selected region of interest R2
1321‧‧‧Selected audio region
1323‧‧‧Selected video sub-region R1
1325‧‧‧Selected video sub-region R2
1401‧‧‧Animated video sub-region
1403‧‧‧Video sequence
1405‧‧‧Ambience audio sequence

For a better understanding of the present invention, reference will now be made by way of example to the accompanying drawings, in which:

Figure 1 shows schematically an apparatus suitable for employing some embodiments;
Figure 2 shows schematically an example audio enhanced cinemagraph generator;
Figure 3 shows schematically a video/image analyser as shown in Figure 2 according to some embodiments;
Figure 4 shows schematically a flow diagram of the operation of the video/image analyser as shown in Figure 3 according to some embodiments;
Figure 5 shows schematically an audio clip generator as shown in Figure 2 according to some embodiments;
Figure 6 shows schematically a synthetic audio generator as shown in Figure 5 according to some embodiments;
Figure 7 shows schematically an intrinsic audio signal analyser as shown in Figure 5 according to some embodiments;
Figure 8 shows schematically an audio clip embedder as shown in Figure 5 according to some embodiments;
Figure 9 shows schematically a flow diagram of the operation of the audio enhanced cinemagraph generator as shown in Figure 2 according to some embodiments;
Figure 10 shows schematically a flow diagram of the operation of the synthetic audio generator as shown in Figure 6 according to some embodiments;
Figure 11 shows schematically a flow diagram of the operation of the intrinsic audio signal analyser as shown in Figure 7 according to some embodiments;
Figure 12 shows schematically a flow diagram of the operation of the audio clip embedder as shown in Figure 8 according to some embodiments;
Figure 13 shows schematically a flow diagram of the operation of the mixer and synchroniser as shown in Figure 2 according to some embodiments;
Figure 14 shows an example of sub-region selection within regions of interest of an image according to some embodiments; and
Figure 15 shows an example time plan of the video and audio forming an audio enhanced cinemagraph according to some embodiments.

Detailed description of the preferred embodiment

The concept of embodiments of the invention is to combine audio signals with cinemagraphs (animated images) during the generation of the cinemagraph or animated image. In the examples shown herein this can be achieved by using at least one of an intrinsic audio signal and a synthetic audio signal to generate and embed metadata. The metadata contains an audio effect signal, or a link to the audio effect signal, such that the generated cinemagraph is enhanced by the audio effect.

High-quality photographs and video are well known as an excellent way of reliving an experience. Cinemagraphs or animated images are regarded as an extension of the photograph and are produced using post-production techniques. A cinemagraph provides a means of animating an object that is common to several images, or an object in one region of an otherwise still or static picture. For example, the design or aesthetic element allows subtle motion of some elements while the rest of the image is still. In some cinemagraphs the motion or animation feature is repeated. In the following description and claims, the terms object, common object, and subject can be considered to refer to any element, object or component which is shared (or mutual) between the images used to create the cinemagraph or animated object. For example, the images used as an input could be a video of a moving toy train against a substantially static background. In such an example the object, subject, common object, region, or element can be the toy train, which in the animated image provides the dynamic or subtle motion element while the rest of the image is still. It is to be understood that an object or subject being common does not necessarily mean that the object, subject, or element is substantially identical from frame to frame. However, when the object moves or appears to move, there is typically a large degree of correlation between the object in subsequent images. For example, the toy train object or subject can appear to move towards or away from the viewer from frame to frame, either by the train appearing to grow larger or smaller, or by a change in the profile of the toy train.

In other words, the size, shape and position of the region of the image identified as the object, subject, or element can change from image to image. Within the image, however, it is a selected entity which has a degree of correlation from frame to frame (as opposed to the static image components, which have substantially perfect correlation from frame to frame).
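The distinction between a moving object and the static background can be illustrated with a frame-to-frame correlation measure. The following is a minimal sketch and not the method of the application: it assumes each region has been flattened to a list of pixel intensities, and the function name and the sample patches are hypothetical.

```python
from math import sqrt

def normalised_correlation(a, b):
    """Pearson correlation between two equally sized lists of pixel values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A flat patch has no structure; treat identical flat patches as fully correlated.
        return 1.0 if a == b else 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical data: the same region sampled in two consecutive frames.
static_t0, static_t1 = [10, 20, 30, 40], [10, 20, 30, 40]   # background
moving_t0, moving_t1 = [10, 20, 30, 40], [12, 25, 28, 45]   # moving object

static_score = normalised_correlation(static_t0, static_t1)
moving_score = normalised_correlation(moving_t0, moving_t1)
```

Under these assumptions the static patch scores exactly 1, while the moving patch scores high but below 1, which matches the "degree of correlation" described above.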

A cinemagraph can in some ways be seen as a potential natural progression of image viewing: from greyscale (black and white photography) to colour, from colour to high-resolution colour images, and from fully static images to localised motion within the photograph. However, reliving an experience can still be seen as incomplete without accompanying audio, and cinemagraphs at present cannot render audio.

The problem is therefore how to generate a cinemagraph or animated image such that audio can be attached to it.

Typically a cinemagraph (or motion photograph or animated image) is constructed from a video sequence, for which audio is likely to be available or to accompany the video. However, when attempting to tie the audio to the cinemagraph, the audio recorded in the scene cannot be attached to the motion image as a whole. Instead, the attached audio should be selected and selectively processed. For example, an audio clip cannot simply be repeated (in the way that motion is added to a sub-region of the image), because it would become irritating.

A further problem associated with adding audio is that, where two people are in conversation, repeating a particular audio sequence produces an unnatural result. For example, repeating the audio clip "Hi Ravi, how are you?" again and again would be highly unnatural. It is therefore to be understood that there is a class or portion of audio clips or audio signals that can be repeated or looped, and a class or selection of audio that should not be repeated and should be played only once even when the image motion is looped.
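The two classes of audio described above can be sketched as a simple playback schedule. This is an illustration only: the class labels "loopable" and "one_shot" and the function name are assumptions, not terms used in the application.

```python
def audio_start_points(clip_kind, video_loops):
    """Return the indices of the video loop iterations at which the clip starts.

    clip_kind   -- "loopable" (e.g. ambient sound) or "one_shot" (e.g. speech)
    video_loops -- how many times the image motion is repeated
    """
    if clip_kind == "loopable":
        # The clip restarts with every repetition of the image motion.
        return list(range(video_loops))
    if clip_kind == "one_shot":
        # The clip plays once, however often the image motion repeats.
        return [0] if video_loops > 0 else []
    raise ValueError(f"unknown clip kind: {clip_kind!r}")

ambient_schedule = audio_start_points("loopable", 3)
speech_schedule = audio_start_points("one_shot", 3)
```

Here a greeting such as the one in the example would be classed as "one_shot", so it is scheduled only for the first pass of the animation.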

Another problem with tying looped audio to the video image in a cinemagraph or animated image is looping the audio signal or audio clip accurately. For example, a looped or repeated audio signal may jump or pause each time it repeats, which leads to an unnatural or objectionable experience.
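One common way to avoid an audible jump at the loop point is to crossfade the end of the clip into its beginning. The sketch below shows that general technique; it is not taken from the application, and the function name and the linear fade are assumptions.

```python
def make_seamless_loop(samples, fade_len):
    """Crossfade the tail of a clip into its head so that it repeats without a jump.

    The returned loop is fade_len samples shorter than the input.
    """
    if not 0 < fade_len <= len(samples) // 2:
        raise ValueError("fade_len must be positive and at most half the clip length")
    head = samples[:fade_len]
    tail = samples[-fade_len:]
    body = samples[fade_len:len(samples) - fade_len]
    blended = []
    for i in range(fade_len):
        w = (i + 1) / (fade_len + 1)      # weight of the head rises towards 1
        blended.append((1 - w) * tail[i] + w * head[i])
    # The loop starts with the blend, so the end of `body` runs into
    # tail-like samples and the end of the blend runs into `body`.
    return blended + body

clip = [float(v) for v in range(10)]       # hypothetical ramp-shaped clip
loop = make_seamless_loop(clip, fade_len=2)
```

When the returned list is played repeatedly, the last body sample is followed by a sample close to the original tail, and the blend ends close to the original head, so neither join is abrupt.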

It is to be understood that a cinemagraph is normally understood to have a repeatable, subtle motion element (or subject or object). In some situations, however, the audio can be attached to a non-repeatable object or motion element within an animated image or photograph, for example by adding a lightning/thunder sound to a cinemagraph. Similarly, in some embodiments the audio clip or signal can be a single-instance play element within a visual motion element animated scene. In the following examples, motion photography or cinemagraph can be considered to cover all combinations of single or repeatable image motion and audio clip playback.

Figure 1 is a schematic block diagram of an example electronic device 10 on which embodiments of the invention can be implemented. The apparatus 10 in such embodiments is configured to provide an improved image experience.

In some embodiments the apparatus 10 is a mobile terminal, mobile phone or user equipment for operation in a wireless communication system. In other embodiments the apparatus is any suitable electronic device configured to process video and audio data. In some embodiments the apparatus is configured to provide an image display, such as a digital camera, a portable audio player (MP3 player), or a portable video player (MP4 player). In other embodiments the apparatus can be any suitable electronic device with a touch interface (which may or may not display information), such as a touch screen or touch pad configured to provide feedback when touched. For example, in some embodiments the touch pad can be a touch-sensitive keypad which in some embodiments has no markings on it and in other embodiments has physical markings or designations on its front window. In such embodiments the user can be notified of where to touch by a physical identifier, such as a raised profile or a printed layer which can be illuminated by a light guide.

The apparatus 10 comprises a touch input module or user interface 11, which is linked to a processor 15. The processor 15 is further linked to a display 12, to a transceiver (TX/RX) 13, and to a memory 16.

In some embodiments the touch input module 11 and/or the display 12 are separate or separable from the electronic device, and the processor receives signals from the touch input module 11 and/or transmits signals to the display 12 via the transceiver 13 or another suitable interface. Furthermore, in some embodiments the touch input module 11 and display 12 are parts of the same component. In such embodiments the touch input module 11 and display 12 can be referred to as the display part or touch display part.

In some embodiments the processor 15 can be configured to execute various program codes. The implemented program codes can in some embodiments comprise routines such as: audio signal parsing and decoding of image data; touch processing, input simulation, or haptic effect simulation code, where the touch input module inputs are detected and processed; effect feedback signal generation, where electrical signals are generated which, when passed to a transducer, can produce tactile or haptic feedback for the user of the apparatus; or actuator processing, configured to generate an actuator signal for driving an actuator. The implemented program codes can in some embodiments be stored, for example, in the memory 16, and specifically within a program code section 17 of the memory 16, for retrieval by the processor 15 whenever needed. The memory 16 in some embodiments can further provide a section 18 for storing data, for example data that has been processed in accordance with the application, such as virtual audio signal data.

The touch input module 11 can in some embodiments implement any suitable touch screen interface technology. For example, in some embodiments the touch screen interface can comprise a capacitive sensor configured to be sensitive to the presence of a finger above or on the touch screen interface. The capacitive sensor can comprise an insulator (for example glass or plastic) coated with a transparent conductor (for example indium tin oxide, ITO). As the human body is also a conductor, touching the surface of the screen distorts the local electrostatic field, which is measurable as a change in capacitance. Any suitable technology may be used to determine the location of the touch. The location can be passed to the processor, which can calculate how the user's touch relates to the device. The insulator protects the conductive layer from dirt, dust or residue from the finger.

In some other embodiments the touch input module can be a resistive sensor comprising several thin, metallic, electrically conductive layers separated by a narrow gap. When an object, such as a finger, presses down on a point on the panel's outer surface, the two metallic layers become connected at that point, and the panel then behaves as a pair of voltage dividers with connected outputs. This physical change therefore causes a change in the electrical current, which is registered as a touch event and sent to the processor for processing.

In some other embodiments the touch input module can further determine a touch event using technologies such as visual detection, projected capacitance detection, infrared detection, surface acoustic wave detection, dispersive signal technology, and acoustic pulse recognition. Visual detection here means, for example, that a camera placed below or above the surface detects the position of the finger or touching object. In some embodiments it is to be understood that "touch" can be defined as both physical contact and "hover touch", where there is no physical contact with the sensor but an object located in close proximity to the sensor has an effect on the sensor.

The touch input module as described here is an example of a user interface input. It is to be understood that in some other embodiments any other suitable user interface input can be employed to provide a user interface input, for example to select an item, object, or region from a displayed screen. In some embodiments the user interface input can thus be a keyboard, mouse, keypad, joystick or any suitable pointer device.

The apparatus 10 can in some embodiments implement the processing techniques at least partially in hardware. In other words, the processing carried out by the processor 15 can be implemented at least partially in hardware, without the need for software or firmware to operate the hardware.

The transceiver 13 in some embodiments enables communication with other electronic devices, for example in some embodiments via a wireless communication network.

The display 12 may comprise any suitable display technology. For example, the display element can be located below the touch input module and project an image through the touch input module to be viewed by the user. The display 12 can employ any suitable display technology such as liquid crystal display (LCD), light emitting diode (LED), organic light emitting diode (OLED), plasma display cells, field emission display (FED), surface-conduction electron-emitter display (SED), and electrophoretic displays (also known as electronic paper, e-paper or electronic ink displays). In some embodiments the display 12 employs one of these display technologies projected to the display window using a light guide.

Figure 2 shows an example audio enhanced cinemagraph generator. The operation of the audio enhanced cinemagraph generator shown in Figure 2 is described further with respect to Figure 9.

In some embodiments the audio enhanced cinemagraph generator comprises a camera 101. The camera 101 can be any suitable video or image capturing apparatus. The camera 101 can be configured to capture images and pass the image or video data to a video/image analyser 103.

The operation of capturing video images is shown in Figure 9 by step 601.

In some embodiments the example audio enhanced cinemagraph generator comprises a video/image analyser 103. The video/image analyser 103 can be configured to receive the image or video data from the camera 101 and to analyse the video images for candidate objects or sub-regions for motion/animation and audio mixing.

The output of the video/image analyser 103, in the form of video or image data together with the objects or sub-regions selected as candidate video motion selections, can be passed to the cinemagraph generator 105. The video/image analyser 103 can further be configured to output, to the audio clip generator 107, data indicating candidate objects or sub-regions suitable for audio insertion.

In some embodiments the example audio enhanced cinemagraph generator comprises a cinemagraph generator 105. The cinemagraph generator 105 is configured to receive the video and any video motion selection data and to generate suitable cinemagraph data. In the following examples the cinemagraph generator is configured to generate animated image data; however, as described herein, in some embodiments the animation can be subtle or missing from the image (in other words, the image is substantially a static image). The cinemagraph generator 105 can be any suitable cinemagraph or animated image generating means, configured to generate data in a suitable format which enables a cinemagraph viewer to generate the image with any motion elements. The cinemagraph generator 105 can in some embodiments be configured to output the cinemagraph data to a mixer and synchroniser 109.

The operation of generating the cinemagraph data is shown in Figure 9 by step 605.

Furthermore, in some embodiments the example audio enhanced cinemagraph generator comprises an audio clip generator 107. The audio clip generator 107 can be configured to generate an audio clip or audio signal component to be inserted into, or associated with, the cinemagraph video or image data. In some embodiments the audio clip generator 107 can be configured to receive information from the video/image analyser 103 and to select an audio clip or audio signal component based on that information.

The operation of generating the audio clip is shown in Figure 9 by step 607.

In some embodiments the apparatus comprises a mixer and synchroniser 109 configured to receive the video images from the cinemagraph generator 105 and the audio signals from the audio clip generator 107, and to mix and synchronise the two signals in a suitable manner.

The mixer and synchroniser 109 can in some embodiments be configured to receive the video data from the cinemagraph generator 105 and the audio data or audio signal from the audio clip generator 107, to mix the audio and video data, and to synchronise the audio and video data such that the mixer and synchroniser 109 outputs an enhanced cinemagraph. In some embodiments the enhanced cinemagraph is a cinemagraph with metadata, the metadata comprising the audio data (or a link to an audio file) which is to be output alongside the image output.
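The enhanced cinemagraph described above, animated image data plus metadata carrying either the audio itself or a link to it, could be modelled as follows. This is a minimal sketch under stated assumptions: the class names, field names, the "exactly one of data or link" rule, and the file path are all hypothetical and are not defined by the application.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AudioMetadata:
    """Audio attached to a cinemagraph: embedded samples or a link to a file."""
    audio_data: Optional[bytes] = None   # embedded audio effect signal
    audio_link: Optional[str] = None     # link to an audio file instead
    loop: bool = True                    # assumed flag: repeat with the animation

    def __post_init__(self):
        # Assumption of this sketch: exactly one of the two sources is given.
        if (self.audio_data is None) == (self.audio_link is None):
            raise ValueError("provide exactly one of audio_data or audio_link")

@dataclass
class EnhancedCinemagraph:
    """Animated image frames together with their audio metadata."""
    frames: List[str]
    audio: AudioMetadata

# Hypothetical usage: two frame identifiers and a linked one-shot clip.
enhanced = EnhancedCinemagraph(
    frames=["frame0", "frame1"],
    audio=AudioMetadata(audio_link="clips/elephant.wav", loop=False),
)
```

Keeping the audio in metadata, rather than in the image payload, means a viewer that ignores the metadata can still show the animation without sound.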

The operation of mixing and synchronising the audio clip within the animated image data is shown in Figure 9 by step 609.

Figure 3 shows an example video/image analyser 103 in further detail. The operation of the video/image analyser 103 as shown in Figure 3 is described in further detail with respect to Figure 4.

The video/image analyser 103 as described herein is configured to receive the video or image data from the camera and to determine the objects, regions or sub-regions which are selected as candidates for animation, in other words objects or sub-regions with a relevant periodicity. In some embodiments the video/image analyser 103 can be configured to identify image or video object or sub-region candidates for audio insertion. For example, Figure 14 shows an example video image 1301 with various candidate sub-regions identified.

The top part of the figure shows the video image 1301 comprising a first object or region with a relevant periodicity, region 1, centred at location X1, Y1 1303, and a second object or region with a relevant periodicity, region 2, centred at location X2, Y2 1305.

In some embodiments the video/image analyser 103 can be configured to select any of these objects or regions as a candidate region (or "region of interest"). Thus, as shown in the middle part of the figure, the video/image analyser can select region 1 and region 2, shown in the figure as R1 1313 and R2 1315 respectively.

In some embodiments the video/image analyser 103 can be configured to select the whole of the image as a region of interest with respect to audio clip insertion. Thus, as shown in the bottom part of Figure 14, the video/image analyser can be configured to select the whole image 1301, which comprises the selected video objects or sub-regions, shown as R1 1323 and R2 1325, and the audio selection region 1321.

In some embodiments these objects or regions can be selected automatically using a rule-based sub-region selection method, semi-automatically using some user input, or manually (in other words, entirely by use of the user interface).

In some embodiments the video/image analyser 103 comprises a user interface input controller 171. The user interface input controller 171 can be configured to receive a user interface input, for example from a touch screen interface or any other suitable user interface input such as a keyboard, keypad, mouse or any suitable pointer. The user interface input controller can in some embodiments be configured to determine a selection input from the user interface input and to pass the selection input, together with the location of the selection input, to the region of interest selector 173. In some embodiments the user interface input controller is also configured to operate as a user interface output, for example to output the video or image data to a suitable touch screen display from which the selection is made.

The operation of receiving/displaying the camera data is shown in Figure 4 by step 181.

Furthermore, the operation of receiving the user interface input is shown in Figure 4 by step 183.

In some embodiments the video/image analyser 103 comprises a region of interest selector 173. The region of interest selector 173 can be configured to receive the camera data and the data from the user interface input controller, and to determine an object, sub-region or "region of interest" from the user interface input selection.

In some embodiments the region of interest selector 173 operates automatically or semi-automatically to select any suitable object or sub-region with a relevant periodicity within the video or image data. In other words, the video or image data is monitored over several frames, and an object or region of the video image that exhibits a relevant periodicity is identified and selected as a region of interest. These objects or "regions of interest" can be used to generate the motion parts of the cinemagraph or, in some embodiments, to select the region which supplies an audio clip to be mixed with the image data.
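Monitoring a region over several frames for periodicity could be sketched with an autocorrelation of a per-frame measurement, such as the mean intensity of the region. The application does not specify how periodicity is detected, so the approach, the function name, and the sample signals below are assumptions for illustration only.

```python
def dominant_period(signal, min_lag=1):
    """Return (lag, score) for the lag with the highest normalised autocorrelation.

    signal -- one measurement per frame for a candidate region
    Returns (None, 0.0) for a region that does not change at all.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    energy = sum(c * c for c in centred)
    if energy == 0:
        return None, 0.0                      # static region, no periodicity
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, n // 2 + 1):
        overlap = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        score = overlap / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score

# Hypothetical measurements: one region oscillates every 4 frames, one is static.
swinging_region = [0, 1, 0, -1] * 4
background_region = [5] * 16

period, strength = dominant_period(swinging_region)
static_period, static_strength = dominant_period(background_region)
```

A region whose score exceeds some chosen threshold could then be proposed as a region of interest. For slowly varying signals a larger `min_lag` would be needed so that lag 1 does not dominate.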

The region of interest selector 173 can be configured to output the selected object or "region of interest" data. Thus in some embodiments the region of interest selector 173 can be configured to output the selected object, sub-region or "region of interest" values to the audio clip generator 107 and, together with the camera video or image data, to the cinemagraph generator 105.

The operation of outputting the region of interest data is shown in Figure 4 by step 187.

Figure 5 shows an example audio clip generator 107.

In some embodiments the audio clip generator 107 comprises a synthetic audio generator 201. The synthetic audio generator 201 can in some embodiments be configured to receive the selected object, sub-region or "region of interest" information from the video/image analyser 103. Furthermore, in some embodiments the synthetic audio generator 201 can be configured to be coupled to an audio clip database 151. The audio clip database 151 can be any suitable database of audio signals, or a link to one. For example, in some embodiments the audio clip database 151 can be a database of audio clips stored on the Internet or in the "cloud". Furthermore, in some embodiments the audio clip database 151 can be a set or collection of audio signals, or of links to audio signals stored in the memory of the apparatus.

Figure 6 shows an example of the synthetic audio generator 201 in further detail. Furthermore, Figure 10 shows a flow diagram of the operation of the synthetic audio generator 201 according to some embodiments.

In some embodiments the synthetic audio generator 201 comprises a feature/sub-region associator 301. The feature/sub-region associator 301 is configured to receive an input from the video/image analyser 103.

The operation of receiving the feature/sub-region selection information is shown in Figure 10 by step 701.

The feature/sub-region associator 301 can in some embodiments be configured to associate a feature with the received object or sub-region (or "region of interest") selected by the video/image analyser. A feature can be any label or topic for which an associated audio clip can be found (and then used to generate a synthetic audio effect).

For example, the user of the apparatus may use the camera to capture video of a lush green forest scene with an elephant, where most of the captured scene is greenery and the elephant's trunk is the only moving component in the scene. In this example the video/image analyser 103 can be configured to select the trunk as a suitable image or video object or sub-region because of the periodicity of that sub-region.

The feature/sub-region associator 301 can then, in such an example, be configured to associate or identify the selected object or sub-region of the trunk with the feature "elephant".

In some embodiments the feature/sub-region associator 301 can be configured to identify the object, or determine an association, automatically based on shape and/or motion. For example, a "point and find" system can be used, in which the image object, sub-region or image is transmitted to a server which identifies the image object or sub-region. In other words, in such a case the feature/sub-region associator is implemented outside the apparatus.

In some embodiments the feature/sub-region associator 301 can perform the sub-region association or identification with the assistance of location or other sensor input. For example, where the image is taken in a zoo or an elephant park, the location of the zoo area or elephant park can be identified, the associated feature determined, and the feature associated with the object or sub-region.

In some embodiments the feature/sub-region associator 301 can additionally receive a user input to enable a manual or semi-automatic feature association. For example, following the earlier example, when the user is prompted to identify the object or sub-region, the user can enter "elephant", and the object or sub-region identified by the trunk is associated with that feature. Furthermore, in some embodiments the feature/sub-region associator can provide several suggested features based on the image, and the user can select one from the list of suggested features (semi-automatic mode). For example, using the earlier image example, the feature/sub-region associator can return (from a "point and find" method internal or external to the apparatus) a list of possible features, such as "elephant", "grey snake", and "giant rat", from which the user selects one.

The identification of the associated feature can then be passed to a clip requester 303.

The operation of associating the feature with the object or sub-region is shown in step 703 of FIG. 10.

In some embodiments, the synthetic audio generator 201 comprises a clip requester 303. In some embodiments, the clip requester 303 can be configured to receive the associated feature from the feature/sub-region associator 301. The clip requester 303 can then be configured to request from the audio database 151 the audio signal, audio clip or audio sample that most closely resembles, or contains, an audio representation of the feature. Thus, for example, for an object or feature identified as an elephant, the clip requester 303 can request an elephant audio clip or audio signal/sample from the audio database 151.

The operation of requesting from the audio database a clip associated with the feature or identified object is shown in step 705 of FIG. 10.

In some embodiments, the synthetic audio generator 201 comprises a clip returner/clip selector 305. The clip returner/clip selector 305 is configured to receive from the audio database 151 suitable audio clips or audio signals/samples associated with the feature or identified object. For example, a series of audio clips or audio signals/samples for an elephant may be returned from the audio database 151. In some embodiments, the clip returner/clip selector 305 can be configured to present these example audio clips or audio signal/sample candidates and let the user select one of them; in other words, the selection of the audio clip is semi-automatic. In some embodiments, the clip returner/clip selector 305 can be configured to select the clip automatically, for example by choosing the audio clip or audio sample with the highest relevance.
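The request and selection steps can be sketched as a lookup followed by a choice. This is only an illustration under stated assumptions: the dictionary layout, the file names and the numeric relevance values are all hypothetical, since the text does not say how the audio database 151 stores clips or scores relevance.

```python
def request_clips(feature, database):
    """Return every (clip_name, relevance) pair stored for the feature."""
    return list(database.get(feature, []))


def select_clip(clips, user_index=None):
    """Choose one clip name from the returned candidates.

    user_index models the semi-automatic mode (the user picks a candidate);
    without it the clip with the highest relevance is chosen automatically.
    """
    if not clips:
        return None
    if user_index is not None:
        return clips[user_index][0]
    return max(clips, key=lambda clip: clip[1])[0]


# Hypothetical contents standing in for the audio database 151.
EXAMPLE_DATABASE = {
    "elephant": [("elephant_rumble.wav", 0.6), ("elephant_trumpet.wav", 0.9)],
}
```

Keeping the request and the selection as separate functions mirrors the split between the clip requester 303 and the clip returner/clip selector 305.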

The operation of returning an associated clip from the audio database is shown in step 707 of FIG. 10.

In some embodiments, the clip returner/clip selector 305 can output the selected audio clip to the audio clip embedder 205.

The operation of outputting an audio clip to the audio clip embedder 205 is shown in step 709 of FIG. 10.

For example, in embodiments where the environment is noisy, the camera is used to capture a photo or video image, and the audio (an elephant sound in the example discussed here) is added as a post-processing step. Moreover, in some embodiments it is possible that no sound at all is available when the photo or video image is captured, so the way to give a moving photo audio is post-processing, in which synthesized audio is added. In the embodiments described here, the user should not need to open any separate application or program, or spend hours producing the audio. For example, in some embodiments the "point and find" matching (in other words, the association and the audio clip request and return operations) takes effect immediately. As described herein, the audio clip is then stored, and the user can undo the audio effect at some later point in time.

In some embodiments, the audio clip generator 107 comprises an intrinsic audio signal analyzer 203. The intrinsic audio signal analyzer 203 can be configured to receive an audio signal input from a microphone or microphone array 171.

FIG. 7 shows an example of the intrinsic audio signal analyzer 203 according to some embodiments. FIG. 11 describes the operation of the example intrinsic audio signal analyzer 203 shown in FIG. 7.

In some embodiments, the microphone/microphone array 171 can be integrated with the apparatus or, in some embodiments, separate from it. In embodiments where the microphone/microphone array 171 is physically separate from the apparatus, it can be configured to transmit audio signals from the microphone or microphone array to the apparatus, and in particular to the audio clip generator 107.

The operation of receiving the audio signal from the microphone/microphone array is shown in step 801 of FIG. 11.

In some embodiments, the intrinsic audio signal analyzer 203 comprises an ambience detector 401.

The ambience detector 401 can be configured to receive the audio signal from the microphone/microphone array 171 and to determine the ambient or surround audio component of the audio signal, which represents the surround sound accompanying the video captured at the same time. For example, in some embodiments the microphone array comprises two microphones configured to capture a stereo audio signal. It will be understood, however, that in some embodiments the audio can be a multichannel or a mono audio signal. In some embodiments, the ambience detector 401 is configured to extract the dominant or directional component from the audio signal captured from the whole audio scene, so that the ambient audio component is returned.

It will be understood that in some embodiments the ambience detector 401 can be configured to select a component from a particular direction rather than the ambience. For example, in some embodiments where the region of interest indicates a specific directional component, the ambience detector 401 can be configured to receive an input from the video/image analyzer 103 and to select the audio signal representing the direction of the identified object or sub-region.

In some embodiments, the ambience detector 401 can detect, for example, audio signals from the environment surrounding the apparatus, such as the sound of wind, rain or waves, or natural sounds in general. In some embodiments, the ambience detector 401 is configured to apply a non-negative matrix factorization to the input audio channel data.

For example, the non-negative matrix factorization can be summarized as follows. Let $X_l$ and $X_r$ be the left and right channel audio data respectively. In some embodiments, the first step of the non-negative matrix factorization is to compute a covariance matrix $C$, given by

$$C = \begin{pmatrix} \sigma_L^2 & \rho\,\sigma_L\sigma_R \\ \rho\,\sigma_L\sigma_R & \sigma_R^2 \end{pmatrix}$$

where:

$\sigma_L^2$: the energy in the left channel

$\sigma_R^2$: the energy in the right channel

$\rho$: the cross-correlation coefficient between the two channels.
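The covariance matrix C can be estimated directly from the two channels. The sketch below is an illustration rather than the patent's own procedure: it assumes zero-mean sample lists of equal length, takes the channel energy to be the mean squared sample value, and takes the off-diagonal entry of C to be ρσ_Lσ_R (the usual form of a two-channel covariance matrix). The function name is hypothetical.

```python
import math


def stereo_covariance(left, right):
    """Return (C, rho) for two zero-mean channels of equal length.

    C is the 2x2 matrix [[sigma_L^2, rho*sigma_L*sigma_R],
                         [rho*sigma_L*sigma_R, sigma_R^2]].
    """
    n = len(left)
    if n == 0 or n != len(right):
        raise ValueError("channels must be non-empty and of equal length")
    energy_l = sum(x * x for x in left) / n   # sigma_L^2
    energy_r = sum(x * x for x in right) / n  # sigma_R^2
    cross = sum(l * r for l, r in zip(left, right)) / n  # rho*sigma_L*sigma_R
    denom = math.sqrt(energy_l * energy_r)
    rho = cross / denom if denom > 0.0 else 0.0
    return [[energy_l, cross], [cross, energy_r]], rho
```

The sign of `rho` is what decides whether the factorization step is attempted at all.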

In some embodiments, when the value of ρ is less than 0, the non-negative matrix factorization is not performed, because a negative ρ means that the left and right channels are negatively correlated, and the signal can be regarded as ambience.

In some embodiments, when the value of ρ is greater than 0, the dominant component of the signal can be estimated with a rank-1 non-negative matrix factorization of the matrix C. This yields estimates of the standard deviations σ_L and σ_R.

Furthermore, in some embodiments, the weights are computed from the standard deviations of the left and right channels as follows (the equation is not reproduced in this text):

In some embodiments, the resulting weights can be used to form a weighted-sum component and a weighted-difference component, which give the dominant component and the ambient component respectively.
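The decomposition can be sketched end to end, with several assumptions that go beyond the text. The weight formula is not reproduced in this document, so the normalisation w_L = σ_L / sqrt(σ_L² + σ_R²), w_R = σ_R / sqrt(σ_L² + σ_R²) and the ordering of the weights in the difference term are guesses made for illustration. The rank-1 factor is obtained by power iteration; for a symmetric matrix with non-negative entries the dominant eigenvector is non-negative, so this serves here as a stand-in for a rank-1 non-negative factorization. Treating ρ = 0 like ρ < 0 is also an assumption.

```python
import math


def rank1_factor(c, iterations=50):
    """Non-negative vector s with s*s^T approximating a symmetric 2x2 matrix c."""
    v = [1.0, 1.0]
    for _ in range(iterations):
        w = [c[0][0] * v[0] + c[0][1] * v[1],
             c[1][0] * v[0] + c[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        if norm == 0.0:
            return [0.0, 0.0]
        v = [w[0] / norm, w[1] / norm]
    cv = [c[0][0] * v[0] + c[0][1] * v[1],
          c[1][0] * v[0] + c[1][1] * v[1]]
    eigenvalue = v[0] * cv[0] + v[1] * cv[1]
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * v[0], scale * v[1]]


def split_primary_ambient(left, right):
    """Return (primary, ambient), or None when the channels are not
    positively correlated and the whole input is treated as ambience."""
    n = len(left)
    if n == 0 or n != len(right):
        raise ValueError("channels must be non-empty and of equal length")
    c_ll = sum(x * x for x in left) / n
    c_rr = sum(x * x for x in right) / n
    c_lr = sum(l * r for l, r in zip(left, right)) / n
    if c_lr <= 0.0:  # same sign as rho; rho == 0 handled here by assumption
        return None
    sigma_l, sigma_r = rank1_factor([[c_ll, c_lr], [c_lr, c_rr]])
    norm = math.hypot(sigma_l, sigma_r)
    w_l, w_r = sigma_l / norm, sigma_r / norm  # assumed weight formula
    primary = [w_l * l + w_r * r for l, r in zip(left, right)]  # weighted sum
    ambient = [w_r * l - w_l * r for l, r in zip(left, right)]  # weighted difference
    return primary, ambient
```

With these assumed weights, a source that appears in both channels with the same sign cancels in the difference term, which is why that term is read as the ambient residue.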

The operation of determining the ambience from the audio signal by non-negative matrix factorization is shown in step 803 of FIG. 11.

In some embodiments, the ambience detector 401 is configured to output the filtered ambient audio signal to the audio clip embedder 205.

The operation of outputting the ambient signal to the embedder is shown in step 805 of FIG. 11.

In some embodiments, the audio clip generator 107 comprises an audio clip embedder 205. FIG. 8 shows an example audio clip embedder 205. The operation of the audio clip embedder 205 shown in FIG. 8 is described in FIG. 12.

The audio clip embedder 205 is configured to receive the audio clip signal produced by at least one of the synthetic audio generator 201 and the intrinsic audio signal analyzer 203, and to arrange the audio signal so that it has a format and form suitable for mixing and synchronization.

In some embodiments, the audio clip embedder 205 comprises an audio stream model generator 501. The audio stream model generator 501 is configured to generate an audio stream model suitable for applying to the audio signal. The audio stream model can be configured to produce particular audio effects. For example, the model can range from a simple switching or mixing operation (switching the audio clip on and off) to more complex operations, such as changing the pitch or tempo of the audio signal.

In some embodiments, the audio stream model generator 501 can be configured to determine that the audio clip should be output for a particular duration, or within a defined period bounded by a particular start time and end time. For example, as shown in FIG. 15, for an ambience taken from the whole video sequence, the audio stream model is defined to output the audio signal throughout the animation. In other words, the audio stream model outputs the T-second ambient audio clip from time 0 to time T seconds and keeps looping it. In some embodiments, however, the generated model can be defined to output an audio signal during only part of the video sequence. In some embodiments, the audio stream model generator can be configured to further define suitable pitch or volume processing for the start and end of the audio clip, so that there is no noticeable discontinuity when the end of one loop of the audio signal joins the start of the next.

In other words, in some embodiments the model can be configured to define a loop and a morphing of the audio signal so that the audio signal is experienced as continuous rather than discontinuous (that is, without pauses or sudden changes in volume or pitch when the audio signal loops).
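One common way to obtain such a seamless loop is to crossfade the tail of the clip into its head. The text does not name a specific technique, so the linear crossfade below is an assumed example, and `make_seamless_loop` and `fade_len` are hypothetical names.

```python
def make_seamless_loop(samples, fade_len):
    """Blend the last fade_len samples into the first fade_len samples.

    The returned loop is fade_len samples shorter than the input. When it
    repeats, its end runs into a start that begins close to the original
    tail and then moves gradually to the original head.
    """
    if not 0 < fade_len <= len(samples) // 2:
        raise ValueError("fade_len must be between 1 and half the clip length")
    body_len = len(samples) - fade_len
    loop = list(samples[:body_len])
    for i in range(fade_len):
        t = (i + 1) / (fade_len + 1)  # rises from just above 0 to just below 1
        # The head fades in while the tail fades out.
        loop[i] = t * samples[i] + (1.0 - t) * samples[body_len + i]
    return loop
```

On a rising ramp, the jump at the loop point shrinks from the full ramp height to a fraction of one step, which is the kind of discontinuity the model is meant to hide.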

In some embodiments, the audio stream model generator can generate an audio stream model for each audio effect to be output. For example, an ambient audio clip can be played over the whole duration of the image/video, while a synthetic audio clip is played over only part of that duration.
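A per-effect model can be pictured as a small record of when and how a clip plays. The sketch below is a hypothetical data structure, not one defined by the text: the field names, the clip names and the half-open time interval are assumptions chosen to illustrate one model per audio effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioStreamModel:
    clip_name: str      # which audio clip this model controls
    start_s: float      # time at which output begins, in seconds
    end_s: float        # time at which output stops, in seconds
    loop: bool = True   # repeat the clip until end_s is reached
    gain: float = 1.0   # simple volume control


def active_clips(models, t):
    """Return the names of the clips that should sound at time t."""
    return [m.clip_name for m in models if m.start_s <= t < m.end_s]


# One model per effect: ambience for the whole 10 s, synthetic clip for part.
EXAMPLE_MODELS = [
    AudioStreamModel("ambience", 0.0, 10.0),
    AudioStreamModel("elephant", 3.0, 5.0, loop=False),
]
```

Querying `active_clips(EXAMPLE_MODELS, 4.0)` returns both clips, while a query at 1.0 s returns only the ambience.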

The generation of the audio stream model is shown in step 901 of FIG. 12.

The audio stream model generator 501 can be configured to output the audio stream model to an audio stream processor 503.

In some embodiments, the audio clip embedder 205 can comprise an audio stream processor 503. The audio stream processor 503 can be configured to receive the audio signal to be stream-processed from the synthetic audio generator 201 or the intrinsic audio signal analyzer 203 (or from both). The model generated by the audio stream model generator 501 is then applied to produce an output audio signal based on the model. In other words, the audio signal can be selected, filtered, switched or mixed in any suitable manner so that the model is followed.
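Applying a model to a clip can be as simple as placing the clip on an output timeline, looping and scaling it as the model dictates. The sketch below assumes a very small model (start and end sample indices, a gain and a loop flag); these parameters and the function name are hypothetical and cover only the switching and looping cases mentioned in the text.

```python
def apply_stream_model(clip, total_len, start, end, gain=1.0, loop=True):
    """Render clip into a silent buffer of total_len samples.

    The clip sounds from sample index start up to (not including) end.
    With loop=True it repeats to fill that window; otherwise it plays once.
    """
    out = [0.0] * total_len
    if not clip:
        return out
    for n in range(max(start, 0), min(end, total_len)):
        offset = n - start
        if offset >= len(clip):
            if not loop:
                break
            offset %= len(clip)
        out[n] = gain * clip[offset]
    return out
```

Several rendered buffers of the same length can then be summed sample by sample to mix an ambience with a synthetic clip.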

The operation of receiving the audio signal to be embedded is shown in step 903 of FIG. 12.

The operation of applying the stream model to the audio signal is shown in step 905 of FIG. 12.

In some embodiments, the audio stream processor 503 can be configured to output the audio signal to the mixer and synchronizer 109.

The operation of outputting the stream-processed audio signal to the mixer and synchronizer is shown in step 907 of FIG. 12.

The operation of an example mixer and synchronizer is shown in FIG. 13.

In some embodiments, the mixer and synchronizer 109 can be configured to receive the video data.

The operation of receiving the video data is shown in step 1001 of FIG. 13.

Furthermore, the mixer and synchronizer 109 can be configured to receive the audio data from the audio clip generator 107.

The operation of receiving the audio data is shown in step 1003 of FIG. 13.

In some embodiments, the mixer and synchronizer 109 can synchronize or combine the audio data with the video data. For example, in some embodiments the video shown in FIG. 15 has two regions of interest. The first meaningful periodicity is centered at pixel (X_1, Y_1) and lasts two seconds centered on t_1, producing an animated video sub-region 1401. The second meaningful periodicity is centered at pixel (X_2, Y_2) and lasts two seconds centered on t_2, shown in video sequence 1403. The captured video has length T, and an ambience taken from the whole video sequence is used for looping. This ambience has been embedded (processed) according to the audio stream model to produce an ambience sequence 1405. The mixer and synchronizer can be configured to synchronize the audio signal with the image and the image animation. For example, as shown in FIG. 15, the audio signal or clip can be synchronized to the start of the video or sound loop 1405.
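The timing relationships in this example can be written out as a small timeline. The sketch below is illustrative only: the function names are hypothetical, the dictionary keys simply reuse the reference numerals 1401, 1403 and 1405, and clipping each window to the range 0 to T is an assumption the text does not state.

```python
def region_window(center_s, length_s, total_s):
    """Return (start, end) of a window centered on center_s, kept inside 0..total_s."""
    start = max(0.0, center_s - length_s / 2.0)
    end = min(total_s, center_s + length_s / 2.0)
    return start, end


def build_timeline(total_s, t1, t2, region_length_s=2.0):
    """Time spans for the two animated sub-regions and the looped ambience."""
    return {
        "1401": region_window(t1, region_length_s, total_s),
        "1403": region_window(t2, region_length_s, total_s),
        "1405": (0.0, total_s),  # the ambience spans the whole captured video
    }
```

Because the ambience span starts at 0, aligning it with the first video frame synchronizes the sound loop with the start of the animation loop.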

In some embodiments, the mixer and synchronizer 109 can then be configured to mix or multiplex the data to form a moving-photo or animated-image metadata file comprising image or video data and audio signal data. In some embodiments, this mixing or multiplexing can produce a file containing at least some of the following, in any suitable format: video data, audio data, sub-region identification data and time synchronization data. In some embodiments, the mixer and synchronizer 109 can output the metadata or the file output data.
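Since the text allows any suitable format, the following is one hypothetical container layout rather than a format the patent defines: a 4-byte big-endian header length, a JSON header holding the sub-region identification and time synchronization data, then the raw video and audio payloads. All function and field names are invented for the illustration.

```python
import json
import struct


def mux(video, audio, regions, sync):
    """Pack video bytes, audio bytes and metadata into one byte string."""
    header = json.dumps({
        "video_len": len(video),
        "audio_len": len(audio),
        "regions": regions,  # sub-region identification data
        "sync": sync,        # time synchronization data
    }).encode("utf-8")
    return struct.pack(">I", len(header)) + header + video + audio


def demux(blob):
    """Inverse of mux: return (video, audio, regions, sync)."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    video_start = 4 + header_len
    audio_start = video_start + header["video_len"]
    video = blob[video_start:audio_start]
    audio = blob[audio_start:audio_start + header["audio_len"]]
    return video, audio, header["regions"], header["sync"]
```

Keeping the metadata in a self-describing header lets a player that ignores audio still read the video payload and the sub-region data.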

It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers. Furthermore, it will be understood that the term acoustic sound channels is intended to cover sound outlets, channels and cavities, and that such sound channels may be formed integrally with the transducer, or as part of the mechanical integration of the transducer with the device.

In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that the blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controllers or other computing devices, or some combination thereof.

The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow in the figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on physical media such as memory chips or memory blocks implemented within the processor, magnetic media such as hard disks or floppy disks, and optical media such as DVDs and the data variants thereof, CDs.

The memory used in the embodiments of the invention may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.

Embodiments of the invention may be practiced in various components, such as integrated circuit modules.

As used in this application, the term "circuitry" refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (i) a combination of processor(s), or (ii) portions of processor(s)/software (including digital signal processor(s)), software and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.

This definition of "circuitry" applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term "circuitry" would also cover an implementation of merely a processor (or multiple processors), or a portion of a processor and its (or their) accompanying software and/or firmware. The term "circuitry" would also cover, for example and if applicable to the particular claim element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone, or a similar integrated circuit in a server, a cellular network device, or other network device.

The foregoing description has provided, by way of exemplary and non-limiting examples, a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. Nevertheless, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.

10‧‧‧Electronic device

11‧‧‧Touch input module

12‧‧‧Display

13‧‧‧Transceiver (TX/RX)

15‧‧‧Processor

16‧‧‧Memory

17‧‧‧Program code section

18‧‧‧Stored data section

Claims (15)

1. An apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to, with the at least one processor, cause the apparatus at least to: analyze at least two images to determine at least one object mutual to the at least two images, the object having a periodicity of motion; generate an animated image based on the at least two images, wherein the at least one object is animated; determine at least one audio signal associated with the at least one object; and combine the at least one audio signal with the animated image to generate an audio enabled animated image.

2. The apparatus as claimed in claim 1, wherein determining at least one audio signal associated with the at least one object causes the apparatus to: receive the at least one audio signal; and filter the at least one audio signal.

3. The apparatus as claimed in claim 2, wherein receiving the at least one audio signal causes the apparatus to: receive at least a portion of the at least one audio signal from at least one microphone at substantially the same time as the at least two images are captured.

4. The apparatus as claimed in any of claims 2 or 3, wherein filtering the at least one audio signal causes the apparatus to perform at least one of: determine at least one foreground sound source of the at least one audio signal; filter the at least one audio signal to remove the at least one foreground sound source from the at least one audio signal to generate an ambience audio signal as the at least one audio signal associated with the at least one object; and filter the at least one audio signal to extract the at least one foreground sound source from the at least one audio signal to generate a dominant sound component as the at least one audio signal.

5. The apparatus as claimed in any preceding claim, wherein determining at least one audio signal associated with the at least one object causes the apparatus to: receive an indication of the at least one object mutual to the at least two images; identify the at least one object mutual to the at least two images; and generate the at least one audio signal based on the identified object.

6. The apparatus as claimed in claim 5, wherein identifying the at least one object causes the apparatus to: determine a location associated with the at least two images; identify at least one object based on the location of the at least two images; and select the at least one object.

7. The apparatus as claimed in any of claims 5 or 6, wherein identifying the at least one object causes the apparatus to: perform pattern recognition analysis of the at least one object mutual to the at least two images.

8. The apparatus as claimed in any preceding claim, wherein determining at least one audio signal associated with the at least one object causes the apparatus to perform at least one of: receive the at least one audio signal from an external audio database; receive the at least one audio signal from an internal audio database; receive the at least one audio signal from a memory; and synthesize the at least one audio signal.

9. The apparatus as claimed in any of claims 2 to 8, wherein receiving the at least one audio signal further causes the apparatus to: generate an audio stream model to control a presentation of the at least one audio signal; and process the at least one audio signal using the audio stream model.

10. The apparatus as claimed in claim 9, wherein generating the audio stream model causes the apparatus to perform at least one of: determine an audio signal process; determine an audio signal volume; determine an audio signal playback speed; determine an audio signal repetition period; determine an audio signal start time; and determine an audio signal end time.

11. The apparatus as claimed in any preceding claim, wherein the apparatus is further caused to: time-synchronize the at least one audio signal with the animated image to generate a synchronized audio enabled animated image.

12. The apparatus as claimed in any preceding claim, wherein analyzing at least two images to determine the at least one object mutual to the at least two images further causes the apparatus to: receive the at least two images from an image source; output the at least two images on a display; and receive at least one user input determining the at least one object mutual to the at least two images.

13. The apparatus as claimed in any preceding claim, wherein analyzing at least two images to determine the at least one object mutual to the at least two images further causes the apparatus to: correlate the at least two images to determine at least one candidate object, the at least one candidate object being mutual to the at least two images and having a periodicity of motion; and select the at least one candidate object as the at least one object.

14. An electronic device comprising the apparatus as claimed in any of claims 1 to 13.

15. A method comprising: analyzing at least two images to determine at least one object mutual to the at least two images, the object having a periodicity of motion; generating an animated image based on the at least two images, wherein the at least one object is animated; determining at least one audio signal associated with the at least one object; and combining the at least one audio signal with the animated image to generate an audio enabled animated image.
TW102132551A 2012-09-11 2013-09-10 An image enhancement apparatus TW201411552A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CHCH37742012 2012-09-11

Publications (1)

Publication Number Publication Date
TW201411552A true TW201411552A (en) 2014-03-16

Family

ID=50820902

Family Applications (1)

Application Number Title Priority Date Filing Date
TW102132551A TW201411552A (en) 2012-09-11 2013-09-10 An image enhancement apparatus

Country Status (1)

Country Link
TW (1) TW201411552A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9838615B2 (en) 2014-05-22 2017-12-05 Htc Corporation Image editing method and electronic device using the same

Similar Documents

Publication Publication Date Title
JP7457082B2 (en) Reactive video generation method and generation program
JP6559316B2 (en) Method and apparatus for generating haptic feedback based on analysis of video content
US20140078398A1 (en) Image enhancement apparatus and method
CN106664376B (en) Augmented reality device and method
US11782507B2 (en) Image changes based on facial appearance
US8644467B2 (en) Video conferencing system, method, and computer program storage device
US20160198097A1 (en) System and method for inserting objects into an image or sequence of images
CN110121093A (en) The searching method and device of target object in video
WO2023279705A1 (en) Live streaming method, apparatus, and system, computer device, storage medium, and program
CN112199016B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
TW202314496A (en) Task processing method, electronic equipment and computer-readable storage medium
CN104333688B (en) Device and method for generating emoticons based on captured images
US20230368461A1 (en) Method and apparatus for processing action of virtual object, and storage medium
WO2020210407A1 (en) System and layering method for fast input-driven composition and live-generation of mixed digital content
US11889222B2 (en) Multilayer three-dimensional presentation
CN112235635A (en) Animation display method, animation display device, electronic equipment and storage medium
CN103873759B (en) Image capturing method and electronic equipment
CN109743566A (en) Method and apparatus for identifying the video format of VR
US10575043B2 (en) Navigating a plurality of video content items
US9269158B2 (en) Method, apparatus and computer program product for periodic motion detection in multimedia content
CN113794831B (en) Video shooting method, device, electronic equipment and medium
KR102367640B1 (en) Systems and methods for the creation and display of interactive 3D representations of real objects
EP2706531A1 (en) An image enhancement apparatus
WO2022170449A1 (en) Method and device for displaying picture window, terminal and storage medium
CN114450730A (en) Information processing system and method