TW201319952A

TW201319952A - Identifying visual media content captured by camera-enabled mobile device

Info

Publication number: TW201319952A
Application number: TW100140157A
Authority: TW
Inventors: Brian Momeyer; Selena Melissa Salazar; Babak Forutanpour
Original assignee: Qualcomm Inc
Priority date: 2011-11-03
Filing date: 2011-11-03
Publication date: 2013-05-16

Abstract

Automatic identification of media content is at least partially based upon visually capturing a still or video image of media content being presented to a user via another device. The identification can be further refined by determining the location of the user, capturing an audio portion of the media content, noting the date and time of the capture, or using profile/behavioral characteristics of the user. Identifying the media content can require (1) distinguishing a rectangular illumination that corresponds to a video display; (2) decoding a watermark presented within the displayed image/video; (3) characterizing the presentation sufficiently to determine a particular time stamp or portion of a program; and (4) determining user setting preferences for viewing the program (e.g., closed captioning, aspect ratio, language). Thus identified, the media content, appropriately formatted, can be received for continued presentation on a user interface of the mobile device.

Description

Identifying visual media content captured by a camera-enabled mobile device

The present invention relates to mobile operating environments and, more particularly, to visually identifying visual media content captured by a camera-enabled mobile device.

For decades, developments in digital image processing have attempted to automate certain vision capabilities, such as image recognition. Computer vision has attempted to recognize obstacles in order to enable autonomous navigation. Optical character recognition relies on techniques such as detecting skew in an image and performing character shape correlation. Surveillance systems attempt to recognize biometric data, such as faces, in order to maintain security.

One example of image processing is generating a digital key signature for each known segment of a video broadcast program, which can later be matched against a digital key signature generated for an unknown segment. Broadcast advertisers use this technique for automated tracking to see which commercials have aired in a particular market. Such processing benefits from being able to sample a high-fidelity version of the broadcast video signal. In addition, such sampling and analysis can be performed by equipment with high processing capacity. Consequently, the equipment that performs the image processing is typically not mobile or intended for consumer use.

Advances in technology have resulted in smaller and more powerful processing devices. For example, there currently exist a variety of portable personal computing devices that are small, lightweight, and easily carried by users, including wireless computing devices such as portable wireless telephones, personal digital assistants (PDAs), and paging devices. More specifically, portable wireless telephones, such as cellular telephones and Internet Protocol (IP) telephones, can communicate voice and data packets over wireless networks. Further, many such wireless telephones include other types of devices that are incorporated into them.

Digital signal processors (DSPs), image processors, and other processing devices are frequently used in portable personal computing devices that include digital cameras or that display image or video data captured by a digital camera. Such processing devices can be used to provide video and audio functions, to process received data such as image data, or to perform other functions. Digital imaging technology allows compact devices to capture image data as well as to enhance and transmit it.

Situations can arise in which a user is consuming media content at some venue but would like to watch or read that material on a mobile device while on the go. Wishing to continue watching or reading the content after leaving the venue, the user wants to locate it easily without a laborious search. This is especially true when direct access to the source of the media content is lacking. In addition, unlike voice recognition, numerous complications can arise in automated visual recognition of a segment of media content. The orientation of the image and extraneous imagery within the field of view can complicate efforts to capture an image or video segment through the viewfinder of a portable personal computing device.

The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

In one aspect, a method is provided for identifying visual media content. An image is received from a camera of a mobile device. A quadrilateral contained within the image is detected. Visual media content contained within the quadrilateral is captured in order to identify the visual media content.

In another aspect, at least one processor is provided for identifying visual media content. A first module receives an image from a camera of a mobile device. A second module detects a quadrilateral contained within the image. A third module captures visual media content contained within the quadrilateral in order to identify the visual media content.

In another aspect, a computer program product is provided for identifying visual media content. A non-transitory computer-readable storage medium comprises sets of code. A first set of code causes a computer to receive an image from a camera of a mobile device. A second set of code causes the computer to detect a quadrilateral contained within the image. A third set of code causes the computer to capture visual media content contained within the quadrilateral in order to identify the visual media content.

In yet another aspect, an apparatus is provided for identifying visual media content. Means are provided for receiving an image from a camera of a mobile device. Means are provided for detecting a quadrilateral contained within the image. Means are provided for capturing visual media content contained within the quadrilateral in order to identify the visual media content.

In a further aspect, an apparatus is provided for identifying visual media content. A camera of a mobile device generates an image. A computing platform detects a quadrilateral contained within the image received from the camera and captures visual media content contained within the quadrilateral in order to identify the visual media content.

To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative, however, of only a few of the various ways in which the principles of the various aspects may be employed, and this description is intended to include all such aspects and their equivalents.

Occasions arise when media content being viewed (e.g., text, image, video) needs to be identified and delivered via another means, so that a user can consume the media content in a convenient manner. For example, a user can be reading media content such as a text-based news or entertainment article contained in a printed periodical or displayed on a computer monitor. Similarly, the media content can be graphical, such as a schematic drawing or a photograph. As another example, a user can be at a venue where visual media content is being displayed. To quickly capture what is being viewed for later retrieval, the user can conveniently use a camera function. To quickly capture and later look up the full content of a text-based or graphical article or of visual media content, the user can use a camera-enabled mobile device (e.g., a smartphone, portable game console, personal digital assistant, etc.).

Various aspects are now described with reference to the drawings. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It will be evident, however, that the various aspects may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing these aspects.

Referring initially to FIG. 1, an apparatus 100, illustrated as a mobile device 102, identifies visual media content 104 that is externally presented by a display 106 for viewing by a user 108. A camera 110 of the mobile device 102 generates a digital image 112. A computing platform 114 detects a quadrilateral 116 that corresponds to the external display 106 and is contained within the digital image 112 received from the camera 110. The computing platform 114 can direct higher-resolution imaging by the camera 110 to encompass the quadrilateral 116 in order to capture the visual media content 104 contained within it. The computing platform 114, a remote server 118, or the two in cooperation can analyze the visual media content 104 for identification against a database 120 of media content files 122. Identifying information 124, or a more complete version 126 of the visual media content 104, can then be transmitted over an over-the-air channel 128 to the mobile device 102 for presentation to the user 108 on a user interface 130.

In FIG. 2, a method or sequence of operations 200 is depicted for identifying visual media content. An image is received from a camera of a mobile device (block 202). A quadrilateral contained within the image is detected (block 204). Visual media content contained within the quadrilateral is captured in order to identify the visual media content (block 206).

In one exemplary use, consider a user who is watching media content, such as a movie, on a television but chooses to watch the remainder on a mobile device. The user points the camera of the mobile phone at the TV. The mobile device is triggered to identify the program playing on the TV by processing incoming frames from its viewfinder. In particular, the image on the TV is cropped from the background by exploiting the expectation that the television is a quadrilateral near the center of the viewfinder and is generally brighter than its surroundings. The captured portion of content (perhaps preprocessed for machine vision recognition) can be sent to a server to look up which movie the sequence comes from. For example, such a system can use a hash table to quickly look up which scenes of which movies should be examined further. The key to the hash table is a localized histogram of the colors found in the frames. For example, if the upper left quadrant of image 1 has 50% blue pixels, 30% white pixels, and 20% black pixels, and then changes after a given time to 30% blue pixels, 50% white pixels, and 20% black pixels, this signature, combined with those of the other three quadrants, is used to narrow down the scenes among the frames in the movie database. On this reduced set, the procedure is then repeated based not on color but on frequency. If the upper left quadrant has 700 edge pixels at a given time and 400 edge pixels 300 frames later, this pattern further reduces the set. From this reduced set of clips, the system can use SIFT or some other feature-based extraction method to narrow down to the exact frame. Once the name of the movie and the time stamp have been found, the device can connect to a dedicated server, purchase and then download the entire movie, but begin streaming the title from the point the user is currently watching on the TV.
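The quadrant color histogram used as a hash key can be sketched as follows. This is a minimal illustration, not the patented implementation: the three-color palette, the 10% rounding step, the function names, and the in-memory index are all assumptions chosen for clarity. It assumes a frame of at least 2×2 pixels.

```python
from collections import defaultdict

# Hypothetical coarse palette; the text only mentions blue, white and black.
PALETTE = {"blue": (0, 0, 255), "white": (255, 255, 255), "black": (0, 0, 0)}


def nearest_color(pixel):
    """Return the palette name closest to an (r, g, b) pixel."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )


def quadrant_signature(frame, step=10):
    """Build a hashable key from per-quadrant color percentages.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    Percentages are rounded to the nearest `step` so that small
    capture noise still maps to the same key (an assumed design choice).
    """
    h, w = len(frame), len(frame[0])
    key = []
    # Quadrant order: upper left, upper right, lower left, lower right.
    for top, left in ((0, 0), (0, w // 2), (h // 2, 0), (h // 2, w // 2)):
        counts = dict.fromkeys(PALETTE, 0)
        for row in frame[top:top + h // 2]:
            for pixel in row[left:left + w // 2]:
                counts[nearest_color(pixel)] += 1
        total = sum(counts.values())
        key.append(
            tuple(round(100 * counts[n] / total / step) * step for n in PALETTE)
        )
    return tuple(key)


# Hypothetical index: signature -> list of (movie title, frame number).
index = defaultdict(list)
```

A server could fill `index` from known movies and look up a captured frame with `index[quadrant_signature(frame)]`; comparing the signatures of two frames a fixed interval apart would give the change over time that the text describes.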

In addition to the image detection approach, a microphone can be used to capture audio from the TV, and that audio can be used in the hash lookup function to assist in determining the media content. Alternatively, the mobile device can employ image processing algorithms, locally or on a remote server, to identify a forensic video watermark. A video watermark can contain a time stamp, a client identifier, and a content identifier, allowing this data to be extracted even after compression and multiple digital-analog-digital conversions.

In another aspect, a user is reading an article on a PC or in a magazine, newspaper, book, etc., but chooses to access the content on a mobile device. The user takes a picture of the content. For example, the camera uses a macro mode to focus on objects less than 2 inches from the lens and has sufficient resolution for optical character recognition. The recognized alphanumeric string can then be searched for with a search engine, and the best article matches are presented on the user interface for the user to select. The identified matching article can be bookmarked or downloaded for future viewing. If the content is copyrighted and/or cannot be found online, watermarking technology can be used to determine whether the user is a rights holder for the content. If watermarking technology is not used but the content is still copyrighted, the user can enter a subscription identifier from a physical copy of the content (e.g., book, periodical) to access it.

Alternatively, if the article cannot be found, the system can suggest similar articles on the same topic or articles by the same author.

In one exemplary aspect, in FIGS. 3A-3C, a method or sequence of operations 300 is provided for capturing and identifying visual media content within a detected external display imaged by a camera. A user points the camera of a wireless mobile device at a display or monitor (block 302). The user selects a user control for capturing the image content (block 304).

In one aspect, the mobile device is enabled to capture one type of visual media content (e.g., text, a graphical image, video images). In another aspect, the mobile device can receive an indication of what type of visual media content is to be, or has been, captured. As a further aspect, the mobile device can automatically determine the type of visual media content from among several alternatives. To these ends, the mobile device can determine an intent or suitability for text capture (block 306). If so, the capture can be directed toward high-contrast, typically black-and-white, text with no inherent motion (block 308). The mobile device can also determine an intent or suitability for image capture (block 310). If so, the target can be in color with varying contrast, but still with no inherent motion (block 312). The mobile device can also determine an intent or suitability for video capture (block 314). If so, the target can have inherent motion (block 316).

In one exemplary aspect, at block 317, a color conversion process supports the foregoing determinations. Transforming from the input color space (typically RGB) to a luminance-chrominance space can be helpful because the system can then determine the amount of color in the image. The amount of color in the image under study can be determined from Cb and Cr, where Cb and Cr at the midpoint of their range (e.g., 128) indicate no color. Alternatively or additionally, particular colors indicative of printed matter, such as white and black or blue, can be detected. Alternatively or additionally, a uniform background color can be detected and discarded in order to isolate the image or alphanumeric content. Alternatively or additionally, a Gabor filter can be used to determine whether the content has a regular pattern at some frequency, which may indicate text of a particular font size.
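As a rough illustration of measuring the "amount of color" from Cb and Cr, the sketch below converts 8-bit RGB pixels with the widely used full-range BT.601 (JFIF) coefficients and averages each pixel's chroma distance from the neutral point (128, 128). The choice of these coefficients, the function names, and the threshold of 10 are assumptions for illustration; the text does not specify them.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range BT.601 (JFIF) Y, Cb, Cr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def color_amount(pixels):
    """Mean distance of (Cb, Cr) from the neutral point (128, 128).

    A value near 0 means the pixels are essentially gray, black or white.
    """
    total = 0.0
    for r, g, b in pixels:
        _, cb, cr = rgb_to_ycbcr(r, g, b)
        total += ((cb - 128) ** 2 + (cr - 128) ** 2) ** 0.5
    return total / len(pixels)


def looks_monochrome(pixels, threshold=10.0):
    """Hypothetical test: low chroma suggests black-and-white text."""
    return color_amount(pixels) < threshold
```

A low result from `color_amount` would favor the text-capture path, while a high result would favor the image or video paths; any real threshold would have to be tuned on captured data.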

A quadrilateral image or video source can be found during a low-resolution camera preview (block 318), as discussed with reference to FIG. 3B. With further reference to FIG. 3A, alternatively, sufficient processing capacity may exist to use a higher-resolution initial capture. In another aspect, the user can assist by touching the viewfinder to focus attention on a portion of the camera preview (block 320). In another aspect, the mobile device displays the recognized source, or displays candidate sources for user selection (block 322). If the latter, the user interface receives the user selection and proceeds to determine the image/video source (block 324). If needed or enabled, the user can be prompted to assist by aiming or focusing the camera or by touching the viewfinder (block 326). In some implementations, the mobile device can send a region of interest (ROI) to the camera for the best resolution setting (block 328). The mobile device captures one or more frames from that region (block 330).

The media content of the captured frame or frames is identified (block 332). In various aspects, this identification can be performed by the mobile device, by distributed processing between the mobile device and a remote server, or largely by the remote server, as described further with reference to FIG. 3C. With continued reference to FIG. 3A, the mobile device can download the media content (block 334).

In FIG. 3B, an exemplary method 318 is provided for finding a quadrilateral image/video source within an image (block 318). If a lower camera preview resolution is used, one or more initial images can be at VGA resolution over "n" frames (n = 1 indicating no motion) (block 336).

With further reference to FIG. 3B, a region-of-interest (ROI) map with 255 values can be created in which, where a user touch input occurred, the map is blurred with decaying values (block 338). In some instances, detection can exploit the tendency of a display or monitor to have a brighter illumination level than other surfaces in the room. To that end, a "glowing" ROI map is created by thresholding the maximum illumination, such as defined by red-green-blue (RGB) values, for values > x (e.g., chosen so that 20% of the pixels remain) (block 340). In some instances in which video capture is desired, detection can exploit the fact that changes between frames indicate displayed motion. To that end, a "motion" ROI map is created from the difference between each frame and the previous "m" frames (e.g., m = 3). The maximum difference (Δ) can be recorded to help remove signal noise (block 342). The image from the viewfinder can be cropped based on the weights of the ROI maps (block 344). The cropped image is fed to a fast corner detector (block 346). Corner points (CPs) that are closer together than a threshold number of pixels (6) can be clustered (block 348). A CP can be pruned if it lies within an n×n region that is entirely inside the glowing map (block 350). A CP can be pruned if the motion within an m×m region is entirely inside the glowing map (block 352).
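The "glowing" and "motion" ROI maps can be sketched with plain nested lists. This is a simplified illustration under stated assumptions: frames are 2D lists of luminance values rather than RGB, the threshold x is picked as the value that keeps roughly the brightest 20% of pixels, and the motion map compares only the most recent frame with up to m earlier frames. The function names are hypothetical.

```python
def glowing_roi(luma, keep_fraction=0.20):
    """Binary map that keeps roughly the brightest `keep_fraction` of pixels.

    `luma` is a 2D list of brightness values. Ties at the threshold
    may keep slightly more than the requested fraction.
    """
    flat = sorted((v for row in luma for v in row), reverse=True)
    k = max(1, int(len(flat) * keep_fraction))
    x = flat[k - 1]  # brightness threshold
    return [[1 if v >= x else 0 for v in row] for row in luma]


def motion_roi(frames, m=3):
    """Per-pixel maximum absolute difference between the newest frame
    and each of up to `m` frames before it.

    `frames` is a list of 2D lists, oldest first. With a single frame
    the map is all zeros (no motion can be measured).
    """
    current = frames[-1]
    previous = frames[-1 - m:-1]
    return [
        [
            max((abs(current[y][x] - p[y][x]) for p in previous), default=0)
            for x in range(len(current[0]))
        ]
        for y in range(len(current))
    ]
```

The two maps could then be combined as weights when cropping the viewfinder image, as the text describes for block 344.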

Quadrilateral candidates are identified from the pruned CPs (block 354):

(i) not convex (the sum of the angles is 360°) (block 356);

(ii) any interior angle > 110° (block 358);

(iv) video aspect ratio (4:3, 16:9) (block 360);

(v) area ≥ 1/25 of the image (block 362);

(vi) two (2) equal adjacent angles (block 364); and

(vii) quadrilateral candidates are associated based on depth finding (block 365).

Groups of corners associated with the shape of a monitor or display are thereby identified.
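Several of the geometric criteria above can be sketched as a single filter. The list does not state which criteria accept and which reject a candidate, so the reading used here is an assumption: a candidate is kept only if it is convex (the angles measured between adjacent edges sum to 360°), has no interior angle above 110°, covers at least 1/25 of the image, and has an aspect ratio near 4:3 or 16:9. The tolerance values and function names are hypothetical, and criteria (vi) and (vii) are not covered.

```python
import math


def interior_angles(quad):
    """Angle in degrees between the two edges meeting at each corner.

    `quad` is four (x, y) corners in order around the shape. For a
    convex quadrilateral these are the interior angles and sum to 360.
    """
    angles = []
    for i in range(4):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (
            math.hypot(*v1) * math.hypot(*v2)
        )
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles


def quad_area(quad):
    """Area by the shoelace formula."""
    s = sum(
        quad[i][0] * quad[(i + 1) % 4][1] - quad[(i + 1) % 4][0] * quad[i][1]
        for i in range(4)
    )
    return abs(s) / 2


def plausible_screen(quad, image_w, image_h, max_angle=110.0, ratio_tol=0.25):
    """Assumed combination of criteria (i), (ii), (iv) and (v).

    Corners are expected in the order top-left, top-right,
    bottom-right, bottom-left.
    """
    angles = interior_angles(quad)
    if abs(sum(angles) - 360.0) > 1.0:  # not convex
        return False
    if max(angles) > max_angle:  # too skewed
        return False
    if quad_area(quad) < image_w * image_h / 25:  # too small
        return False
    width = (math.dist(quad[0], quad[1]) + math.dist(quad[3], quad[2])) / 2
    height = (math.dist(quad[1], quad[2]) + math.dist(quad[0], quad[3])) / 2
    ratio = width / height
    return any(abs(ratio - t) / t <= ratio_tol for t in (4 / 3, 16 / 9))
```

The angle-sum test works because a reflex or self-intersecting corner makes the measured angles add up to less than 360°.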

With regard to the last criterion, which uses depth finding, a depth-focusing capability can be used to determine that particular groups of corners lie at a particular depth. Corners can thus be pruned when they lie at a foreground or background depth deemed unrelated to the set of candidate corners. The depth information can be used to determine sets of corners that are at the same depth level in the image.

Additional disclosure regarding the use of depth finding is described in co-pending U.S. patent application "System and Method to Generate Depth Data Using Edge Detection" by Babak Forutanpour, Serial No. 12/185,887, filed on August 5, 2008, Publication No. 20100033617 A1, assigned to the assignee hereof and expressly incorporated by reference herein.

The four (4) corners of a candidate are added to a master list (block 366). Quadrilateral shapes formed from the sets of corners in the master list are selected such that a large false quadrilateral encompassing essentially the whole image is not allowed to enclose smaller quadrilaterals. For example, no quadrilateral larger than one fifth of the image area is allowed to enclose other candidate quadrilaterals. In one exemplary aspect, any quadrilateral that occupies 80% of another quadrilateral is pruned (block 368).
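The rule that a large quadrilateral may not enclose other candidates can be sketched as below. To keep the example short, each candidate is approximated by an axis-aligned bounding box `(x0, y0, x1, y1)`, which is a simplifying assumption; the one-fifth limit comes from the text, while the function names are hypothetical. The separate 80% rule is not modeled here.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x0, y0, x1, y1)."""
    return (box[2] - box[0]) * (box[3] - box[1])


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and outer[2] >= inner[2]
        and outer[3] >= inner[3]
    )


def drop_enclosing(boxes, image_area, max_fraction=0.2):
    """Remove candidates that are larger than `max_fraction` of the
    image and also enclose another candidate."""
    kept = []
    for i, box in enumerate(boxes):
        encloses = any(
            j != i and contains(box, other) for j, other in enumerate(boxes)
        )
        if encloses and box_area(box) > max_fraction * image_area:
            continue  # likely a false quadrilateral around the whole scene
        kept.append(box)
    return kept
```

For a 640×480 image, a candidate spanning the whole frame would be dropped in favor of a smaller screen-sized candidate inside it.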

Corners can be pruned based on detecting the border shape and uniform color that are indicative of a monitor or display (block 370). As another example, the histograms of the left and right halves of the quadrilateral should match. Alternatively or additionally, the histograms of the upper and lower halves of the quadrilateral should match. In another aspect, one border side is allowed to differ from the opposite border side to account for asymmetric placement of user controls, audio speakers, mounting or support structures, and the like. In one exemplary aspect, a match is computed by requiring that the binned histogram for one side differ from that of the other side by no more than some limit (e.g., 20%) of the total pixels, where the limit can vary. Alternatively, if converted to hue-saturation-value (HSV), the average hue can be constrained to a low value (e.g., within 10%). In one exemplary aspect, any quadrilateral whose border (1/14 of the width) has a histogram that deviates from the standard deviation by a value of 1.5 is pruned.
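The binned-histogram comparison of two border sides can be sketched as follows. The 20% limit is taken from the text; the use of 8 bins over 8-bit grayscale values, the sum-of-absolute-differences measure, and the function names are assumptions made for illustration.

```python
def binned_histogram(values, bins=8):
    """Count 8-bit values (0..255) into `bins` equal-width bins."""
    hist = [0] * bins
    for v in values:
        hist[min(v * bins // 256, bins - 1)] += 1
    return hist


def histograms_match(side_a, side_b, limit=0.20):
    """True if the two sides' binned histograms differ by no more
    than `limit` of the total number of pixels compared."""
    ha = binned_histogram(side_a)
    hb = binned_histogram(side_b)
    diff = sum(abs(a - b) for a, b in zip(ha, hb))
    total = len(side_a) + len(side_b)
    return diff <= limit * total
```

A uniformly dark bezel on both sides would pass this test, while a bezel on one side and wallpaper on the other would most likely fail it.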

As a result of this exemplary identification, the quadrilateral corresponding to the display or monitor can be identified from the image.

In FIG. 3C, an exemplary method or sequence of operations 332 is provided for identifying the media content of the captured frame or frames. A capability to collaborate with a remote server is determined (block 371). For example, this capability can be limited by available resources, interference, channel fading, transmit power limits, subscriber usage limits, and so on. To adjust for some or all of these considerations, the bandwidth of the connection to the server can be tested (block 372). Device performance constraints (e.g., central processing unit (CPU) speed and availability, the configuration of digital signal processing hardware/software, etc.) can be determined (block 374). User preferences or the cost of using bandwidth can be accessed (block 376). A capability constraint can also be a power limit, based on the power consumed by performing the image processing locally or the power required to transmit a variable amount of image data. A capability constraint can also relate to the end-to-end time needed to process and transmit the image data.

In some instances, one characteristic of the capability is determinative. For example, a low-performance mobile device may be unable to perform additional digital image processing and must therefore upload the raw image data regardless of channel limitations.

In another instance, a set of possible modes is determined based on which portions of the digital image processing can be performed locally or remotely. A selection can then be made to arrive at an optimal solution based on user preferences (e.g., the cost incurred), system preferences for traffic optimization, or enhancing the user experience by reducing the time needed to complete the digital image processing.

For example, a lookup table (LUT) can be accessed to decide how the image processing for media content identification is allocated between the device and the server (block 378). Thus, in one aspect, a local processing mode is deemed appropriate. For example, a very slow connection is detected, a capable device is determined to be available, or this mode has been selected (block 380). An n×n-based histogram plus edge detection and scale-invariant feature transform (SIFT) are performed (block 382). The feature vector is sent to the server in order to conserve bandwidth (block 384).

In another aspect, a shared processing mode is deemed appropriate. For example, a medium-speed connection is detected, local and remote components are determined to be available for shared processing, or this mode has been selected (block 386). An n×n-based histogram plus edge detection is performed without performing SIFT (block 388). The result is sent to the server (block 390).

In a further aspect, a remote processing mode is deemed appropriate. For example, the local device is determined to be incapable of performing the processing, a fast connection is determined to be available, or this mode has been selected (block 392). Rather than being processed, the captured clip is sent to the server (block 394).
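The choice among the local, shared, and remote processing modes described above can be sketched as follows. This is only an illustration under stated assumptions: the bandwidth thresholds, the function name `select_mode`, and the rule that an explicit selection overrides everything else are hypothetical, since the description names the criteria (connection speed, device capability, selection) but gives no numbers or precedence.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = "local"    # histogram + edge detection + SIFT on the device (blocks 380-384)
    SHARED = "shared"  # histogram + edge detection on the device, no SIFT (blocks 386-390)
    REMOTE = "remote"  # captured clip uploaded unprocessed (blocks 392-394)


# Hypothetical thresholds in kbit/s; the description does not specify values.
SLOW_KBPS = 100
FAST_KBPS = 2000


def select_mode(link_kbps, device_capable, selected=None):
    """Pick a processing mode from link speed and device capability."""
    if selected is not None:
        return selected            # a mode that "has been selected" wins (assumed precedence)
    if not device_capable:
        return Mode.REMOTE         # device cannot do the processing itself
    if link_kbps < SLOW_KBPS:
        return Mode.LOCAL          # very slow link: send only the feature vector
    if link_kbps >= FAST_KBPS:
        return Mode.REMOTE         # fast link: simply upload the clip
    return Mode.SHARED             # medium link: split the work
```

A lookup table keyed on the same inputs would serve equally well; the conditional form is used here only because it makes the three cases easy to read.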

After the remote server has completed any remaining processing and performed matching against its catalog of media content, the mobile device receives a list of candidate media content matches (block 396). In one aspect, the constraints of presenting such matches on the mobile device are considered. In addition, assistance from the user (e.g., additional aiming of the camera, interaction with the user interface, etc.) may be needed to uniquely identify the media content. To that end, it can be determined that no match was found and that more media content needs to be captured (block 397). Alternatively or in addition, a limited number of matches (e.g., three) can be determined that are suitable for presentation on the user interface for user selection (block 398). Alternatively or in addition, a large number of matches may be identified. In response, the mobile device can capture more content or send a higher-resolution version of the captured content for the server to use in matching (block 399).
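The three outcomes for the candidate list (blocks 397 to 399) amount to a small decision on the number of matches returned. A minimal sketch, assuming a hypothetical `next_action` helper and string labels that do not appear in the description; the limit of three comes from the example given for block 398:

```python
def next_action(candidates, max_presentable=3):
    """Decide what the mobile device does with the server's candidate list."""
    if not candidates:
        return "capture_more"                       # block 397: no match found
    if len(candidates) <= max_presentable:
        return "present_for_selection"              # block 398: few enough to show the user
    return "capture_more_or_send_higher_resolution"  # block 399: too many matches
```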

In another exemplary aspect, in FIG. 4, a method 400 performs image analysis using a camera-enabled communication device, such as a smartphone. Frames from a low-resolution mode of the camera viewfinder can be stored (block 402). Alternatively, the image analysis can be performed on a higher-resolution snapshot or series of snapshots selected automatically or manually (block 404). An algorithm is executed to extract the portion of the viewfinder image that corresponds to an external display or monitor (e.g., a television monitor) presenting the media content (block 406). For convenience, this region may be referred to as a liquid crystal display (LCD), a display type common to e-books, televisions, and computers, although it should be appreciated that other technologies can be used consistent with aspects of the present invention.

In FIG. 5, an illustrative lower-resolution image 500 of a room includes a display 502 viewed from the side.

With further reference to FIG. 4, selecting the correct portion can involve, in part, performing a series of processes that are carried out entirely locally, distributed between the local and remote ends, or carried out entirely remotely (block 408).

For example, a Harris corner detector can be run to find all corners in a region of interest (ROI) (block 410). All permutations and combinations of the found corners can be examined until a set is established (block 412) such that:

(1) The content within the region has an average brightness significantly higher than that of the overall image, with the ratio denoted "l" (block 414).

(2) The lines connecting the four corners are uniform and have approximately the same hue in hue-saturation-value (HSV) space (block 416). For example, a check is made that the border pixels of the monitor/display have approximately or substantially the same color, with one or both pairs of opposing sides having approximately the same width. In one exemplary aspect, the border pixels can be determined to have an RGB match with one another within a threshold (e.g., 20%). The allowable border thickness can be based on a LUT. For example, when the area within the quadrilateral is 1/10 of the image, the border should be 1/30 of the horizontal width of the image along the x-axis. Thus, a television in a 640×480 image would be expected to have a border about 20 pixels wide.

(3) Regions are eliminated where the perspective of the four (4) points does not match the framing of an object at or below eye level (e.g., ceiling lights are eliminated) (block 418).
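Two of the checks above lend themselves to a short worked example: the brightness ratio of criterion (1) and the expected border width of criterion (2). The sketch below assumes a grayscale image stored as a list of rows and, for simplicity, an axis-aligned candidate region rather than a general quadrilateral; the function names are hypothetical.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def brightness_ratio(gray, box):
    """Ratio of mean brightness inside `box` to mean brightness of the whole image.

    gray: 2D list of luminance values (rows of pixels).
    box:  (x0, y0, x1, y1) with exclusive upper bounds; an axis-aligned
          stand-in for the candidate quadrilateral (an assumption).
    """
    x0, y0, x1, y1 = box
    inside = _mean(gray[y][x] for y in range(y0, y1) for x in range(x0, x1))
    overall = _mean(v for row in gray for v in row)
    if overall == 0:
        raise ValueError("image is entirely black; ratio undefined")
    return inside / overall


def expected_border_px(image_width, border_fraction=1 / 30):
    """Border width implied by the LUT example: 1/30 of the image width."""
    return image_width * border_fraction
```

For a 640-pixel-wide image, `expected_border_px(640)` gives roughly 21.3, consistent with the "about 20 pixels" figure in the text. How much greater than 1 the brightness ratio must be is not stated in the description and would have to be tuned.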

In FIG. 6, an illustrative lower-resolution image 600 of the ROI portion of the room, including a display 602 viewed from the side, has a plurality of detected corners 604, which are candidates for automatically defining the display 602 so that the media content 606 can be captured.

In FIG. 7, a set 700 of candidate pixels for the room, derived from corner detection, includes points defining outer and inner points 702 and 704 of a display 706, as well as image points 708 within media content 710 and extraneous points 712 outside the display 706, which need to be selectively eliminated.

With further reference to FIG. 4, now that the exact corners of the LCD display have been found, the smartphone can optionally correct for perspective if the perspective is greater than some threshold p (block 420).

For example, the threshold p can be based on the ratio of the lengths of opposing lateral sides. Consider a ratio p₁ indicating that the lateral sides are within 90% of each other. It can be determined that the matching algorithm is sufficiently robust to achieve a match without correcting the resulting distortion in the captured video image. As another example, consider a ratio p₂ indicating that the lateral sides are between 90% and 70% of each other. Correction may be required for the distortion due to perspective in this range. As a further example, consider a ratio p₃ indicating that the lateral sides differ so much in size that correction is not possible and it is doubtful that the correct quadrilateral has been found. The user is assumed not to attempt capture at such a non-orthogonal angle. It should also be appreciated that different thresholds can be used for vertical and horizontal perspective.
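The three perspective ranges can be expressed as a small classifier on the ratio of the shorter to the longer lateral side. The 90% and 70% boundaries come from the text; the function name, the return labels, and the treatment of the exact boundary values are assumptions made for illustration.

```python
def classify_perspective(left_len, right_len, ok=0.90, correctable=0.70):
    """Classify perspective distortion from the lengths of opposing lateral sides.

    Both lengths must be positive. Returns one of three hypothetical labels.
    """
    ratio = min(left_len, right_len) / max(left_len, right_len)
    if ratio >= ok:
        return "match_without_correction"   # range p1: sides within 90% of each other
    if ratio >= correctable:
        return "correct_perspective"        # range p2: between 90% and 70%
    return "reject_quadrilateral"           # range p3: too skewed to trust


# Example: a display whose right edge appears 80% as tall as its left edge
# falls in the correctable range.
print(classify_perspective(100, 80))
```

Separate `ok` and `correctable` values could be passed for the top/bottom pair to reflect the note that vertical and horizontal perspective may use different thresholds.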

In one aspect, any rotation of the image from a vertical plane relative to the camera's viewpoint can be corrected. A pair of parallel lines can be found (e.g., the top/bottom or left/right sides), and the entire image can be digitally rotated so that these two lines lie at 0 or 90 degrees relative to the image, whichever of the two angles is closer to the computed one.
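The rotation correction reduces to measuring the angle of one detected side and snapping it to the nearer of 0 or 90 degrees. A minimal sketch, assuming the side is given by two endpoints in image coordinates; `deskew_angle` is a hypothetical name, and the sign of the result follows the same convention as `math.atan2` applied to those coordinates.

```python
import math


def deskew_angle(x0, y0, x1, y1):
    """Rotation in degrees that brings the line to the nearest of 0 or 90 degrees."""
    # Treat the line as undirected: fold its angle into [0, 180].
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
    # 180 degrees is the same orientation as 0 for an undirected line.
    target = min((0.0, 90.0, 180.0), key=lambda t: abs(angle - t))
    return target - angle
```

Applying the returned angle to the whole image (with whatever image library is in use) would then align the chosen pair of sides with the image axes.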

A histogram of the image is created on a rectangular or square n×n grid, e.g., n=3 (block 422).
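One simple reading of the n×n step is to divide the image into n×n cells and summarize the color of each cell. The sketch below computes a mean RGB value per cell, which is an assumption: the description speaks of a histogram and of n² RGB values but does not say how each cell is reduced. The function name is hypothetical, and the image must be at least n pixels in each dimension.

```python
def grid_mean_rgb(pixels, n=3):
    """Mean (r, g, b) of each cell of an n x n grid, in row-major order.

    pixels: 2D list of (r, g, b) tuples, at least n rows and n columns.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(n):
        y0, y1 = gy * height // n, (gy + 1) * height // n
        for gx in range(n):
            x0, x1 = gx * width // n, (gx + 1) * width // n
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(tuple(sum(p[c] for p in block) / len(block) for c in range(3)))
    return cells
```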

The number of pixels in each region that lie on hard or soft edges is counted (block 424). Various definitions can be used for hard (sharp) and soft (blurry) edges. For example, a "hard edge" pixel is one for which neighboring pixels up to n pixels away (e.g., n=2 in all directions) have values much greater or much less than the pixel's own value, e.g., by a threshold >120. A pixel is on a "soft edge" if its value lies between those of two neighbors that have values different from each other. The sharpness of a change in an image can indicate a discontinuity in depth, a discontinuity in surface orientation, a change in material properties, or a variation in scene illumination.
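The hard/soft edge definitions can be turned into a per-pixel classifier. The sketch below works on a grayscale image stored as a list of rows. Several details are assumptions where the description leaves room: a single neighbor exceeding the threshold is enough for "hard", "soft" is tested only against the immediate left/right and up/down neighbor pairs, and a pixel that qualifies as hard is not also counted as soft.

```python
def classify_pixel(gray, x, y, reach=2, hard_threshold=120):
    """Return 'hard', 'soft', or 'none' for the pixel at column x, row y."""
    height, width = len(gray), len(gray[0])
    value = gray[y][x]
    # Hard edge: some neighbor within `reach` pixels differs by more than the threshold.
    for dy in range(-reach, reach + 1):
        for dx in range(-reach, reach + 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                if abs(gray[ny][nx] - value) > hard_threshold:
                    return "hard"
    # Soft edge: value lies strictly between two opposite neighbors that differ.
    for (ax, ay), (bx, by) in (((x - 1, y), (x + 1, y)), ((x, y - 1), (x, y + 1))):
        if 0 <= ax < width and 0 <= bx < width and 0 <= ay < height and 0 <= by < height:
            a, b = gray[ay][ax], gray[by][bx]
            if min(a, b) < value < max(a, b):
                return "soft"
    return "none"


def count_edges(gray):
    """Count hard-edge, soft-edge, and non-edge pixels in one region."""
    counts = {"hard": 0, "soft": 0, "none": 0}
    for y in range(len(gray)):
        for x in range(len(gray[0])):
            counts[classify_pixel(gray, x, y)] += 1
    return counts
```

Running `count_edges` on each grid cell separately yields the per-region counts that the next step places in the payload.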

These n² red-green-blue (RGB) values and the hard-plus-soft edge values are combined into a data payload and transmitted (block 426). By sending the server not only color information but also the number of hard/soft edge pixels, the server can use this information to scan frames from a movie library for blocks with similar characteristics. In short, sending color information alone may be insufficient: it may not be enough to know how much red, green, and blue a block has. When there is an excess of RGB in an image block, an improved filter can be used. For example, the candidate list can be narrowed further by also sending the information that the block has 45 pixels on hard edges and 39 pixels on soft edges. In the ideal case, with no transmission or processing limits, the entire block could be sent so that the server could subtract the two images frame by frame.
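A payload combining the per-cell color values and edge counts might look like the following. The JSON encoding, the field names, and the `build_payload` helper are all hypothetical; the description says only that the n² RGB values and the hard and soft edge values are merged into one data payload.

```python
import json
import math


def build_payload(cell_rgb, cell_edges):
    """Serialize per-cell colors and edge counts into a compact JSON string.

    cell_rgb:   list of (r, g, b) values, one per grid cell, row-major.
    cell_edges: list of (hard_count, soft_count), one per grid cell.
    """
    if len(cell_rgb) != len(cell_edges):
        raise ValueError("need one edge-count pair per grid cell")
    cells = [
        {"rgb": [round(c) for c in rgb], "hard": hard, "soft": soft}
        for rgb, (hard, soft) in zip(cell_rgb, cell_edges)
    ]
    return json.dumps({"n": math.isqrt(len(cells)), "cells": cells},
                      separators=(",", ":"))
```

A binary layout would be smaller on the wire; JSON is used here only to keep the structure visible.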

The media content (e.g., a movie, television, news article, radio broadcast, podcast program, etc.) is identified and formatted for continued presentation on the mobile device (block 428). If the user does not have sufficient rights to consume the content, a data rights subscription can be facilitated (block 430).

By virtue of the foregoing, the user is provided with a convenient way to continue consuming particular media content via a mobile device, without the laborious process of identifying, locating, and accessing that media content via the mobile device.

Referring to FIG. 8, illustrated is a system 800 for identifying visual media content. For example, system 800 can reside at least partially within user equipment (UE). It is to be appreciated that system 800 is represented as including functional blocks, which can be functional blocks that represent functions implemented by a computing platform, a processor, software, or a combination thereof (e.g., firmware). System 800 includes a logical grouping 802 of electrical components that can act in conjunction. For instance, logical grouping 802 can include an electrical component 804 for receiving an image from a camera of a mobile device. Moreover, logical grouping 802 can include an electrical component 806 for detecting a quadrilateral contained within the image. For another instance, logical grouping 802 can include an electrical component 808 for capturing visual media content contained within the quadrilateral for identifying the visual media content. Additionally, system 800 can include a memory 820 that retains instructions for executing functions associated with electrical components 804-808. While shown as being external to memory 820, it is to be understood that one or more of electrical components 804-808 can exist within memory 820.

In FIG. 9, an apparatus 902 for identifying visual media content is illustrated. Means 904 are provided for receiving an image from a camera of a mobile device. Means 906 are provided for detecting a quadrilateral contained within the image. Means 908 are provided for capturing visual media content contained within the quadrilateral for identifying the visual media content.

FIG. 10 is a block diagram of a particular mobile device 1000 including a histogram, corner detector, and Scale Invariant Feature Transform (SIFT) generator 1064. The mobile device 1000 can be implemented in a portable electronic device and includes a signal processor 1010, such as a digital signal processor (DSP), coupled to a memory 1032. The histogram, corner detector, and SIFT generator 1064 is included in the signal processor 1010. In an illustrative example, the corner detector and SIFT generator 1064 operates as described in accordance with FIGS. 1-7, or any combination thereof.

A camera interface 1068 is coupled to the signal processor 1010 and also coupled to a camera, such as a video camera 1070. The camera interface 1068 can be adapted to take multiple images of a scene in response to a single image capture command, such as a user "clicking" a shutter control or other image capture input, either automatically or in response to a signal generated by the DSP 1010. A display controller 1026 is coupled to the signal processor 1010 and to a display device 1028. A coder/decoder (CODEC) 1034 can also be coupled to the signal processor 1010. A speaker 1036 and a microphone 1038 can be coupled to the CODEC 1034. A wireless interface 1040 can be coupled to the signal processor 1010 and to a wireless antenna 1042.

The signal processor 1010 is adapted to detect corners in image data based on changes in intensity values between neighboring data points, as previously described. The signal processor 1010 is also adapted to generate image data 1046, such as a depth map or other form of depth data, derived from image data sets, as previously described. By using a depth-from-focus capability, certain groups of corners can be determined to lie at a particular depth. Corners can thus be eliminated on the basis that they lie at a foreground or background depth deemed extraneous to the candidate set of corners. In an exemplary aspect, in addition to using brightness and motion ROI maps, the camera can sweep the lens to find the focus level at which objects appear blurry or sharp. Based on this information, it can be determined whether no edge, a soft edge, or a hard edge is present. Corners at the same depth can be deemed coplanar. Alternatively, three-dimensional coordinates can be determined for the corners based in part on the depth information, in order to identify coplanar points that are not perpendicular to the camera.

The image data can include video data from the video camera 1070, image data from a wireless transmission via the antenna 1042, or data from other sources such as an external device coupled via a universal serial bus (USB) interface (not shown), as illustrative, non-limiting examples.

The display controller 1026 is configured to receive the processed image data and to provide it to the display device 1028. In addition, the memory 1032 can be configured to receive and store the processed image data, and the wireless interface 1040 can be configured to receive the processed image data for transmission via the antenna 1042.

In a particular embodiment, the signal processor 1010, the display controller 1026, the memory 1032, the CODEC 1034, the wireless interface 1040, and the camera interface 1068 are included in a system-in-package or system-on-chip device 1022. In a particular embodiment, an input device 1030 and a power supply 1044 are coupled to the system-on-chip device 1022. Moreover, in a particular embodiment, as illustrated in FIG. 10, the display device 1028, the input device 1030, the speaker 1036, the microphone 1038, the wireless antenna 1042, the video camera 1070, and the power supply 1044 are external to the system-on-chip device 1022. However, each of the display device 1028, the input device 1030, the speaker 1036, the microphone 1038, the wireless antenna 1042, the video camera 1070, and the power supply 1044 can be coupled to a component of the system-on-chip device 1022, such as an interface or a controller.

In an exemplary aspect, a mobile device can use multiple-input multiple-output (MIMO) cellular communication capability to perform identification and delivery of media content. In an exemplary aspect, a MIMO system employs multiple (N_T) transmit antennas and multiple (N_R) receive antennas for data transmission. A MIMO channel formed by the N_T transmit antennas and N_R receive antennas can be decomposed into N_S independent channels, also referred to as spatial channels, where N_S ≤ min{N_T, N_R}. Each of the N_S independent channels corresponds to a dimension. The MIMO system can provide improved performance (e.g., higher throughput and/or greater reliability) if the additional dimensionalities created by the multiple transmit and receive antennas are utilized.

A MIMO system can support time division duplex ("TDD") and frequency division duplex ("FDD"). In a TDD system, the forward and reverse link transmissions are on the same frequency region, so that the reciprocity principle allows the forward link channel to be estimated from the reverse link channel. This enables an access point to extract transmit beamforming gain on the forward link when multiple antennas are available at the access point.

The teachings herein can be incorporated into a node (e.g., a device) employing various components for communicating with at least one other node. FIG. 11 depicts several sample components that can be employed to facilitate communication between nodes. Specifically, FIG. 11 illustrates a wireless device 1110 (e.g., an access point) and a wireless device 1150 (e.g., an access terminal) of a MIMO system 1100. At the device 1110, traffic data for a number of data streams is provided from a data source 1112 to a transmit ("TX") data processor 1114.

In some aspects, each data stream is transmitted over a respective transmit antenna. The TX data processor 1114 can format, code, and interleave the traffic data for each data stream based on a particular coding scheme selected for that data stream to provide coded data.

The coded data for each data stream can be multiplexed with pilot data using OFDM techniques. The pilot data is typically a known data pattern that is processed in a known manner and can be used at the receiver system to estimate the channel response. The multiplexed pilot and coded data for each data stream is then modulated (i.e., symbol mapped) based on a particular modulation scheme (e.g., BPSK, QPSK, M-PSK, or M-QAM) selected for that data stream to provide modulation symbols. The data rate, coding, and modulation for each data stream can be determined by instructions performed by a processor 1130. A data memory 1132 can store program code, data, and other information used by the processor 1130 or other components of the device 1110.
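As an illustration of the symbol mapping step, the sketch below maps pairs of coded bits onto QPSK constellation points. The particular Gray-coded constellation and unit-energy scaling are common textbook choices assumed here for illustration; the description names QPSK only as one possible scheme and does not specify a bit-to-symbol assignment. The names `QPSK_MAP` and `map_qpsk` are hypothetical.

```python
import math

_SCALE = 1 / math.sqrt(2)  # normalizes every symbol to unit energy

# Gray-coded mapping: neighboring constellation points differ in exactly one bit.
QPSK_MAP = {
    (0, 0): complex(1, 1) * _SCALE,
    (0, 1): complex(-1, 1) * _SCALE,
    (1, 1): complex(-1, -1) * _SCALE,
    (1, 0): complex(1, -1) * _SCALE,
}


def map_qpsk(bits):
    """Map an even-length sequence of 0/1 bits to a list of QPSK symbols."""
    if len(bits) % 2:
        raise ValueError("QPSK consumes bits two at a time")
    return [QPSK_MAP[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]
```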

The modulation symbols for all data streams can be provided to a TX MIMO processor 1120, which can further process the modulation symbols (e.g., for OFDM). The TX MIMO processor 1120 then provides N_T modulation symbol streams to N_T transceivers ("XCVR") 1122a through 1122t, each of which has a transmitter (TMTR) and a receiver (RCVR). In some aspects, the TX MIMO processor 1120 can apply beamforming weights to the symbols of the data streams and to the antenna from which the symbol is being transmitted.

Each transceiver 1122a through 1122t receives and processes a respective symbol stream to provide one or more analog signals, and further conditions (e.g., amplifies, filters, and upconverts) the analog signals to provide a modulated signal suitable for transmission over the MIMO channel. The N_T modulated signals from transceivers 1122a through 1122t are then transmitted from N_T antennas 1124a through 1124t, respectively.

At the device 1150, the transmitted modulated signals are received by N_R antennas 1152a through 1152r, and the received signal from each antenna 1152a through 1152r is provided to a respective transceiver ("XCVR") 1154a through 1154r. Each transceiver 1154a through 1154r can condition (e.g., filter, amplify, and downconvert) a respective received signal, digitize the conditioned signal to provide samples, and further process the samples to provide a corresponding "received" symbol stream.

A receive ("RX") data processor 1160 then receives and processes the N_R received symbol streams from the N_R transceivers 1154a through 1154r based on a particular receiver processing technique to provide N_T "detected" symbol streams. The RX data processor 1160 then demodulates, deinterleaves, and decodes each detected symbol stream to recover the traffic data for the data stream. The processing by the RX data processor 1160 is complementary to that performed by the TX MIMO processor 1120 and the TX data processor 1114 at the device 1110.

A processor 1170 periodically determines which precoding matrix to use. The processor 1170 formulates a reverse link message comprising a matrix index portion and a rank value portion. A data memory 1172 can store program code, data, and other information used by the processor 1170 or other components of the device 1150.

The reverse link message can comprise various types of information regarding the communication link and/or the received data stream. The reverse link message is then processed by a TX data processor 1138, which also receives traffic data for a number of data streams from a data source 1136, modulated by a modulator 1180, conditioned by the transceivers 1154a through 1154r, and transmitted back to the device 1110.

At the device 1110, the modulated signals from the device 1150 are received by the antennas 1124a through 1124t, conditioned by the transceivers 1122a through 1122t, demodulated by a demodulator ("DEMOD") 1140, and processed by an RX data processor 1142 to extract the reverse link message transmitted by the device 1150. The processor 1130 then determines which precoding matrix to use for determining the beamforming weights and then processes the extracted message.

FIG. 11 also illustrates that the communication components can include one or more components that facilitate image data communication in the presence of interference. For example, an interference ("INTER.") control component 1190 can cooperate with the processor 1130 and/or other components of the device 1110 to send/receive signals to/from another device (e.g., device 1150). Similarly, an interference control component 1192 can cooperate with the processor 1170 and/or other components of the device 1150 to send/receive signals to/from another device (e.g., device 1110). It should be appreciated that for each device 1110 and 1150, the functionality of two or more of the described components can be provided by a single component. For example, a single processing component can provide the functionality of the interference control component 1190 and the processor 1130, and a single processing component can provide the functionality of the interference control component 1192 and the processor 1170.

With reference to FIG. 12, an exemplary computing environment 1200 for implementing various aspects of the claimed subject matter includes a computer 1212. The computer 1212 includes a processing unit 1214, a system memory 1216, and a system bus 1218. The system bus 1218 couples system components including, but not limited to, the system memory 1216 to the processing unit 1214. The processing unit 1214 can be any of various available processors. Dual microprocessors and other multiprocessor architectures can also be employed as the processing unit 1214.

The system bus 1218 can be any of several types of bus structure, including a memory bus or memory controller, a peripheral bus or external bus, and/or a local bus, using any variety of available bus architectures including, but not limited to, Industry Standard Architecture (ISA), Micro Channel Architecture (MCA), Extended ISA (EISA), Intelligent Drive Electronics (IDE), VESA Local Bus (VLB), Peripheral Component Interconnect (PCI), Card Bus, Universal Serial Bus (USB), Advanced Graphics Port (AGP), Personal Computer Memory Card International Association bus (PCMCIA), FireWire (IEEE 1394), and Small Computer Systems Interface (SCSI).

The system memory 1216 includes volatile memory 1220 and non-volatile memory 1222. The basic input/output system (BIOS), containing the basic routines to transfer information between elements within the computer 1212, such as during start-up, is stored in the non-volatile memory 1222. By way of illustration and not limitation, the non-volatile memory 1222 can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. The volatile memory 1220 includes random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The computer 1212 also includes removable/non-removable, volatile/non-volatile computer storage media. FIG. 12 illustrates, for example, disk storage 1224. Disk storage 1224 includes, but is not limited to, devices such as a magnetic disk drive, floppy disk drive, tape drive, Jaz drive, Zip drive, LS-100 drive, flash memory card, or memory stick. In addition, disk storage 1224 can include storage media separately or in combination with other storage media including, but not limited to, an optical disk drive such as a compact disk ROM drive (CD-ROM), CD recordable drive (CD-R drive), CD rewritable drive (CD-RW drive), or digital versatile disk ROM drive (DVD-ROM). To facilitate connection of the disk storage 1224 to the system bus 1218, a removable or non-removable interface is typically used, such as interface 1226.

It is to be appreciated that FIG. 12 describes software that acts as an intermediary between users and the basic computer resources described in the suitable operating environment 1200. Such software includes an operating system 1228. The operating system 1228, which can be stored on disk storage 1224, acts to control and allocate resources of the computer system 1212. System applications 1230 take advantage of the management of resources by the operating system 1228 through program modules 1232 and program data 1234 stored either in system memory 1216 or on disk storage 1224. It is to be appreciated that the claimed subject matter can be implemented with various operating systems or combinations of operating systems.

A user enters commands or information into the computer 1212 through input devices 1236. Input devices 1236 include, but are not limited to, a pointing device such as a mouse, a trackball, a stylus, a touch pad, a keyboard, a microphone, a joystick, a game pad, a satellite dish, a scanner, a TV tuner card, a digital camera, a digital video camera, a web camera, and the like. These and other input devices connect to the processing unit 1214 through the system bus 1218 via interface ports 1238. Interface ports 1238 include, for example, a serial port, a parallel port, a game port, and a universal serial bus (USB). Output devices 1240 use some of the same types of ports as input devices 1236. Thus, for example, a USB port may be used to provide input to computer 1212 and to output information from computer 1212 to an output device 1240. Output adapter 1242 is provided to illustrate that some output devices 1240, such as monitors, speakers, and printers, require special adapters. By way of illustration and not limitation, the output adapters 1242 include video and sound cards that provide a means of connection between the output device 1240 and the system bus 1218. It should be noted that other devices and/or systems of devices, such as remote computer 1244, provide both input and output capabilities.

Computer 1212 can operate in a networked environment using logical connections to one or more remote computers, such as remote computer 1244. The remote computer 1244 can be a personal computer, a server, a router, a network PC, a workstation, a microprocessor-based appliance, a peer device, or another common network node, and typically includes many or all of the elements described relative to computer 1212. For brevity, only a memory storage device 1246 is illustrated with remote computer 1244. Remote computer 1244 is logically connected to computer 1212 through a network interface 1248 and then physically connected via communication connection 1250. Network interface 1248 encompasses wired and/or wireless communication networks such as local-area networks (LAN) and wide-area networks (WAN). LAN technologies include Fiber Distributed Data Interface (FDDI), Copper Distributed Data Interface (CDDI), Ethernet, Token Ring, and the like. WAN technologies include, but are not limited to, point-to-point links, circuit-switching networks such as Integrated Services Digital Networks (ISDN) and variations thereon, packet-switching networks, and Digital Subscriber Lines (DSL).

Communication connection 1250 refers to the hardware/software employed to connect the network interface 1248 to the bus 1218. While communication connection 1250 is shown inside computer 1212 for illustrative clarity, it can also be external to computer 1212. The hardware/software necessary for connection to the network interface 1248 includes, for exemplary purposes only, internal and external technologies such as modems (including regular telephone-grade modems, cable modems, and DSL modems), ISDN adapters, and Ethernet cards.

Variations, modifications, and other implementations of what is described herein will occur to those of ordinary skill in the art without departing from the spirit and scope of the disclosure as claimed. Accordingly, the disclosure is to be defined not by the preceding illustrative description but by the spirit and scope of the appended claims.

It should be apparent that the teachings herein may be embodied in a wide variety of forms and that any specific structure or function disclosed herein is merely representative. Based on the teachings herein, one skilled in the art should appreciate that an aspect disclosed herein may be implemented independently of other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented or a method practiced using any number of the aspects set forth herein. In addition, an apparatus may be implemented or a method practiced using other structure or functionality in addition to, or instead of, one or more of the aspects set forth herein. As an example, many of the methods, devices, systems, and apparatuses described herein are described in the context of providing dynamic queries and recommendations in a mobile communication environment. One skilled in the art should appreciate that similar techniques could apply to other communication and non-communication environments as well.

As used in this disclosure, the terms "content" and "objects" are used to describe any type of application, multimedia file, image file, executable, program, web page, script, document, presentation, message, data, metadata, or any other type of media or information that may be rendered, processed, or executed on a device.

As used in this disclosure, the terms "component," "system," "module," and the like are intended to refer to a computer-related entity, either hardware, software, software in execution, firmware, middleware, or any combination thereof. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, or a computer. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer or distributed between two or more computers. Further, these components can execute from various computer-readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, such as in accordance with a signal having one or more data packets (e.g., data from one component interacting by way of the signal with another component in a local system or distributed system, or across a network such as the Internet with other systems). Additionally, as will be appreciated by one skilled in the art, components of the systems described herein may be rearranged or complemented by additional components in order to facilitate achieving the various aspects, goals, advantages, and so on described with regard thereto, and are not limited to the precise configurations set forth in a given figure.

Additionally, the various illustrative logics, logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Additionally, at least one processor may comprise one or more modules operable to perform one or more of the steps and/or actions described above.

Moreover, various aspects or features described herein may be implemented as a method, apparatus, or article of manufacture using standard programming or engineering techniques. Further, the steps or actions of a method or algorithm described in connection with the aspects disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. Additionally, in some aspects, the steps or actions of a method or algorithm may reside as at least one or any combination or set of codes or instructions on a machine-readable medium or computer-readable medium, which may be incorporated into a computer program product. Further, the term "article of manufacture" as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media. For example, computer-readable media can include, but are not limited to, magnetic storage devices (e.g., hard disk, floppy disk, magnetic strips, etc.), optical disks (e.g., compact disk (CD), digital versatile disk (DVD), etc.), smart cards, and flash memory devices (e.g., card, stick, key drive, etc.). Additionally, various storage media described herein can represent one or more devices or other machine-readable media for storing information. The term "machine-readable medium" can include, without being limited to, wireless channels and various other media capable of storing, containing, or carrying instructions or data.

Furthermore, various aspects are described herein in connection with a mobile device. A mobile device can also be called a system, a subscriber unit, a subscriber station, a mobile station, a mobile, a cellular device, a multi-mode device, a remote station, a remote terminal, an access terminal, a user terminal, a user agent, a user device, user equipment, or the like. A subscriber station can be a cellular telephone, a cordless telephone, a Session Initiation Protocol (SIP) phone, a wireless local loop (WLL) station, a personal digital assistant (PDA), a handheld device having wireless connection capability, or another processing device connected to a wireless modem or a similar mechanism that facilitates wireless communication with a processing device.

In addition to the foregoing, the word "exemplary" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word "exemplary" is intended to present concepts in a concrete fashion. Furthermore, as used in this application and the appended claims, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or." That is, unless specified otherwise or clear from context, "X employs A or B" is intended to mean any of the natural inclusive permutations. In this example, X could employ A, X could employ B, or X could employ both A and B, and the statement "X employs A or B" is satisfied under any of the foregoing instances. In addition, the articles "a" and "an" as used in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form.

As used herein, the terms to "infer" or "inference" refer generally to the process of reasoning about or inferring states of the system, environment, or user from a set of observations as captured via events or data. Inference can be employed to identify a specific context or action, or can generate a probability distribution over states, for example. The inference can be probabilistic, that is, the computation of a probability distribution over states of interest based on a consideration of data and events. Inference can also refer to techniques employed for composing higher-level events from a set of events or data. Such inference results in the construction of new events or actions from a set of observed events and/or stored event data, whether or not the events are correlated in close temporal proximity, and whether the events and data come from one or several event and data sources.
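The probabilistic form of inference described above can be illustrated with a small Bayesian update. This is a minimal sketch rather than the method of the disclosure: the state names, observation names, and probabilities are hypothetical, and the observations are assumed to be conditionally independent given the state.

```python
def posterior(prior, likelihood, observations):
    """Return a probability distribution over states after a series of observations.

    prior:        {state: P(state)}
    likelihood:   {state: {observation: P(observation | state)}}
    observations: iterable of observation labels (assumed independent given the state)
    """
    dist = dict(prior)
    for obs in observations:
        # Weight each state by how well it explains the observation.
        dist = {s: p * likelihood[s].get(obs, 0.0) for s, p in dist.items()}
        total = sum(dist.values())
        if total == 0.0:
            raise ValueError("observation %r is impossible under every state" % (obs,))
        # Renormalize so the values again sum to 1.
        dist = {s: p / total for s, p in dist.items()}
    return dist


# Hypothetical states and observations, chosen only for illustration.
prior = {"watching_display": 0.5, "not_watching": 0.5}
likelihood = {
    "watching_display": {"bright_rectangle": 0.8, "no_rectangle": 0.2},
    "not_watching": {"bright_rectangle": 0.2, "no_rectangle": 0.8},
}
result = posterior(prior, likelihood, ["bright_rectangle"])
```

Each additional observation sharpens or flattens the distribution, which matches the description of inference as computing a probability distribution over states of interest from data and events.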

100 ... Apparatus

102 ... Mobile device

104 ... Visual media content

106 ... Display

108 ... User

110 ... Camera

112 ... Digital image

114 ... Computing platform

116 ... Quadrilateral

118 ... Remote server

120 ... Database

122 ... Media content file

124 ... Identifying information

126 ... Media content

128 ... Air channel

130 ... User interface

200 ... Method for identifying visual media content

202 ... Receive an image from the camera of the mobile device

204 ... Detect a quadrilateral contained within the image

206 ... Capture visual media content contained within the quadrilateral in order to identify the visual media content

300 ... Method for capturing and identifying visual media content within a detected external display imaged by a camera

302 ... Aim the camera of the wireless mobile device

304 ... User selects a control for capturing image content

306 ... Determine intent/suitability for text capture

308 ... B/W, no motion

310 ... Determine intent/suitability for image capture

312 ... Color, no motion

314 ... Determine intent/suitability for video capture

316 ... Color, motion

317 ... Convert image capture output from RGB to YCbCr and process it for color and motion information

318 ... Find a quadrilateral image/video source in the low-resolution camera preview

320 ... User assists by tapping the viewfinder

322 ... Display candidate sources for user selection

324 ... Receive user selection / determine image/video source

326 ... User assists by aiming/focusing/tapping

328 ... Send the region of interest (ROI) to the camera for optimal resolution settings

330 ... Capture one or more frames from the region

332 ... Identify the media content of the captured frame(s)

334 ... Download media content

336 ... Use VGA resolution over "N" frames (n=1 means no motion)

338 ... Create an ROI map with 255 values at a user tap input, which can be blurred with decreasing values

340 ... Create a "bright" ROI map by thresholding max(RGB) for values > x (e.g., so that 20% of the pixels remain)

342 ... Create a "motion" ROI map from the difference between each frame and the previous "M" frames (e.g., M=3), recording the maximum delta to account for removal of signal interference

344 ... Crop the image from the viewfinder based on the weights of the ROI maps

346 ... Feed into a fast corner detector

348 ... Cluster corner points (CP) that are closer together than a threshold number of pixels (6)

350 ... Prune a CP if it lies within a complete N×N area inside the bright map

352 ... Prune a CP if the motion in an M×M area lies completely within the bright map

354 ... Identify quadrilateral candidates from the pruned CPs

356 ... Not convex (sum of angles is 360°)

358 ... Any interior angle > 110°

360 ... Video aspect ratio (4:3, 16:9)

362 ... Area ≥ 1/25 of the image

364 ... 2 equal adjacent angles

365 ... Associate quadrilateral candidates based on depth finding

366 ... Add the 4 corners of a candidate to the master list

368 ... Prune any quadrilateral that occupies 80% of another quadrilateral

370 ... Prune corners based on detection of the border shape and uniform color that characterize a monitor/display

371 ... Determine capabilities

372 ... Test the bandwidth of the connection to the server

374 ... Determine device performance constraints

376 ... Access user preferences or the cost of using bandwidth

378 ... Consult a look-up table (LUT) to determine how image processing for media content identification is allocated between the device and the server

380 ... Select local processing mode (slow connection, device with digital signal processing capability)

382 ... N×N-based histogram + edge detection & scale invariant feature transform

384 ... Send feature vector to the server

386 ... Select shared processing mode (medium connection, local and remote DSP)

388 ... N×N-based histogram + edge detection

390 ... Send results to the server

392 ... Select remote processing mode (fast connection, device without DSP capability)

394 ... Send the captured clip to the server

396 ... Receive a candidate list of media content matches

397 ... No match: capture more media content

398 ... Limited matches (e.g., 3): prompt the user to select

399 ... Large number of matches: upload more clips

400 ... Method for performing image analysis using a camera-enabled communication device

402 ... Store frames from the low-resolution mode of the camera viewfinder

404 ... Higher-resolution snapshot or snapshot sequence, selected automatically or manually

406 ... Extract the portion of the viewfinder image that corresponds to the external display/monitor presenting the media content

408 ... Select the correct portion for fully local, distributed local and remote, or fully remote execution

410 ... Harris corner detector finds all corners in the ROI

412 ... All permutations and combinations of the found corners, until a set is created

414 ... (1) The interior content has an average luminance much higher than the average luminance of the overall image

416 ... (2) The lines connecting the 4 corners are uniform and have approximately the same hue in HSV space

418 ... (3) Prune regions where the perspective of the 4 points is above eye level

420 ... Where accurate corner points have been found, optionally correct the perspective if it exceeds a threshold

422 ... Create a histogram of the image in a rectangular or square N×N grid

424 ... Count the number of pixels in each region that are hard edges or soft edges

426 ... Combine the N² RGB values and the hard edge + soft edge values into a data payload and send it

428 ... Media content is identified and formatted for continued presentation on the mobile device

430 ... Account for data rights subscription

500 ... Image

502 ... Display

600 ... Image

602 ... Display

604 ... Corner

606 ... Media content

700 ... Set of pixels

702 ... Outer point

704 ... Inner point

706 ... Display

708 ... Image point

710 ... Media content

712 ... Extraneous point

800 ... System

802 ... Logical grouping

804 ... Electrical component for receiving an image from a camera of a mobile device

806 ... Electrical component for detecting a quadrilateral contained within the image

808 ... Electrical component for capturing visual media content contained within the quadrilateral in order to identify the visual media content

820 ... Memory

902 ... Apparatus

904 ... Means for receiving an image from a camera of a mobile device

906 ... Means for detecting a quadrilateral contained within the image

908 ... Means for capturing visual media content contained within the quadrilateral in order to identify the visual media content

920 ... Memory

1000 ... Mobile device

1010 ... Digital signal processor (DSP)

1022 ... Device

1026 ... Display controller

1028 ... Display

1030 ... Input device

1032 ... Memory

1034 ... Coder/decoder (CODEC)

1036 ... Speaker

1038 ... Microphone

1040 ... Wireless interface

1042 ... Wireless antenna

1044 ... Power supply

1046 ... Image data

1064 ... Histogram, corner detector, and SIFT generator

1068 ... Camera interface

1070 ... Video camera

1100 ... MIMO system

1110 ... Wireless device

1112 ... Data source

1114 ... TX data processor

1120 ... TX MIMO processor

1122a ... Transceiver

1122t ... Transceiver

1124A ... Antenna

1124T ... Antenna

1130 ... Processor

1132 ... Memory

1136 ... Data source

1138 ... TX data processor

1140 ... Demodulator

1142 ... RX data processor

1152A ... Antenna

1152R ... Antenna

1154a ... Transceiver

1154r ... Transceiver

1160 ... RX data processor

1170 ... Processor

1172 ... Memory

1180 ... Modem

1190 ... Interference control

1192 ... Interference control

1200 ... Computing environment

1212 ... Computer

1214 ... Processing unit

1216 ... System memory

1218 ... Bus

1220 ... Volatile memory

1222 ... Non-volatile memory

1224 ... Disk storage

1226 ... Interface

1228 ... Operating system

1230 ... Applications

1232 ... Modules

1234 ... Data

1236 ... Input device

1238 ... Interface port

1240 ... Output device

1242 ... Output adapter

1244 ... Remote computer

1246 ... Memory storage device

1248 ... Network interface

1250 ... Communication connection
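Some of the geometric tests listed for quadrilateral candidates (blocks 354 through 364) can be sketched as follows. This is an illustrative reading, not the disclosed implementation: the function names and tolerances are hypothetical, the corners are assumed to be four distinct points given in drawing order, and the angle-sum test is interpreted as "the unsigned vertex angles add up to 360°", which rejects self-intersecting and concave shapes. The 110° angle criterion is left out because the list does not say whether it accepts or rejects a candidate.

```python
import math


def vertex_angles(quad):
    """Unsigned angle, in degrees, at each of the four vertices."""
    angles = []
    for i in range(4):
        px, py = quad[i - 1]
        cx, cy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        ax, ay = px - cx, py - cy
        bx, by = nx - cx, ny - cy
        cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    return angles


def shoelace_area(quad):
    """Polygon area from the shoelace formula."""
    total = 0.0
    for i in range(4):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % 4]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def is_display_candidate(quad, image_w, image_h, angle_tol=5.0, ratio_tol=0.15):
    """Apply the angle-sum, area, adjacent-angle, and aspect-ratio tests."""
    angles = vertex_angles(quad)
    if abs(sum(angles) - 360.0) > angle_tol:          # angle sum of 360 degrees
        return False
    if shoelace_area(quad) < image_w * image_h / 25.0:  # area >= 1/25 of image
        return False
    if not any(abs(angles[i] - angles[(i + 1) % 4]) <= angle_tol for i in range(4)):
        return False                                   # 2 equal adjacent angles
    sides = [math.hypot(quad[(i + 1) % 4][0] - quad[i][0],
                        quad[(i + 1) % 4][1] - quad[i][1]) for i in range(4)]
    w = (sides[0] + sides[2]) / 2.0                    # mean of opposite sides
    h = (sides[1] + sides[3]) / 2.0
    ratio = max(w, h) / min(w, h)
    return any(abs(ratio - r) / r <= ratio_tol for r in (4 / 3, 16 / 9))
```

Averaging opposite sides is only a rough stand-in for the true aspect ratio of a display seen in perspective, so the relative tolerance is deliberately loose.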

FIG. 1 illustrates a schematic diagram of a mobile device that identifies visual media content externally presented by a display for viewing by a user.

FIG. 2 illustrates a flow diagram of a method or sequence of operations for identifying visual media content.

FIG. 3A illustrates a flow diagram of a method or sequence of operations for capturing and identifying visual media content within a detected external display imaged by a camera.

FIG. 3B illustrates a flow diagram of a method or sequence of operations for finding a quadrilateral image/video source within an image.

FIG. 3C illustrates a flow diagram of a method or sequence of operations for identifying the media content of captured frames.

FIG. 4 illustrates an exemplary flow diagram of a method or sequence of operations for image analysis using a camera-enabled communication device.

FIG. 5 illustrates a graphical depiction of an illustrative low-resolution image of a room that includes a display viewed from the side.

FIG. 6 illustrates a graphical depiction of an illustrative low-resolution image of a region of interest (ROI) portion of the room, with a number of detected corners as candidates for automatically defining the display.

FIG. 7 illustrates a graphical depiction of a set of candidate clustered and pruned pixels derived from the image analysis.

FIG. 8 illustrates a schematic diagram of a system for identifying visual media content.

FIG. 9 illustrates a schematic diagram of an apparatus having means for identifying visual media content.

FIG. 10 is a block diagram of a mobile device including a histogram, corner detector, and scale invariant feature transform (SIFT) generator.

FIG. 11 illustrates a schematic diagram of communication components that may include one or more components that perform interference control operations.

FIG. 12 illustrates a schematic diagram of an exemplary computing environment.
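The allocation of image processing between device and server outlined for FIG. 3C (blocks 371 through 394 in the reference list) can be pictured as a small look-up table. The sketch below is an assumption-laden illustration: the bandwidth thresholds, names, and the fallback for combinations the figure does not list are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = "local"    # histogram + edge detection + SIFT on device, send feature vector
    SHARED = "shared"  # histogram + edge detection on device, send results
    REMOTE = "remote"  # send the captured clip to the server


def classify_link(kbps, slow_below=256, fast_from=2000):
    """Bucket a measured bandwidth; the thresholds are hypothetical."""
    if kbps < slow_below:
        return "slow"
    if kbps >= fast_from:
        return "fast"
    return "medium"


# The three combinations named in blocks 380, 386, and 392.
MODE_LUT = {
    ("slow", True): Mode.LOCAL,
    ("medium", True): Mode.SHARED,
    ("fast", False): Mode.REMOTE,
}


def select_mode(kbps, has_dsp):
    """Pick a processing mode from link speed and DSP availability."""
    mode = MODE_LUT.get((classify_link(kbps), has_dsp))
    if mode is None:
        # Not covered by the figure. Assumption: a device without DSP
        # capability offloads everything, and a DSP device shares the work.
        mode = Mode.SHARED if has_dsp else Mode.REMOTE
    return mode
```

A fuller table could also take user preferences and bandwidth cost (block 376) as keys; they are omitted here to keep the sketch short.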

Claims

A method for recognizing visual media content, comprising: receiving an image from a camera of a mobile device; detecting a quadrilateral included in the image; and capturing visual media content included in the quadrilateral for identification The visual media content.

The method of claim 1, wherein extracting the visual media content included in the quadrilateral to identify the visual media content further comprises: performing character recognition.

The method of claim 1, wherein extracting the visual media content contained within the quadrilateral to identify the visual media content further comprises performing image recognition for static visual media content.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: establishing a region of interest map based on the object contrast, the region of interest map identifying important details in the image.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: receiving a user input with respect to a portion of a viewfinder scene of the image.

The method of claim 1, wherein receiving the image from the camera of the mobile device further comprises: receiving a plurality of consecutive frames, and wherein the visual media content included in the quadrilateral is captured to identify the visual The media content further includes: performing video image recognition for the dynamic visual media content.

The method of claim 6, wherein detecting the quadrilateral included in the image further comprises: establishing a motion map by determining a difference between the consecutive plurality of frames.

The method of claim 7, further comprising: performing corner detection; and deleting a corner point in the motion map.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: creating a region of interest map; and cropping the image to include the region of interest map.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: establishing a bright map by detecting a portion having a brighter illumination.

The method of claim 10, wherein detecting the quadrilateral included in the image further comprises: performing corner detection; clustering the corner points; and deleting a cluster of corner points within the bright map.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: establishing a depth map by detecting a depth of focus of the plurality of portions in the image.

The method of claim 1, wherein detecting the quadrilateral included in the image further comprises: detecting a candidate quadrilateral shape satisfying an identification criterion of the selected four corner point clusters to identify a rectangular display A perspective view of the device.

The method of claim 13, wherein the recognizing the perspective view of the rectangular display device further comprises: deleting any candidate quadrilateral shape occupying a majority of the area of the other quadrilateral by determining whether the quadrilateral shape is large enough to be Contains all other candidate quadrilateral shapes.

The method of claim 13, wherein the identifying the perspective view of the rectangular display device further comprises: identifying a border of the rectangular display device.

The method of claim 15, wherein identifying the border of the rectangular display device further comprises: detecting a common border thickness of portions of the border on opposite sides.

The method of claim 16, wherein detecting the common border thickness of the portions of the border on opposite sides comprises detecting a thickness that is a percentage, for example about 10%, of an enclosing dimension of the rectangular display device.

The method of claim 15, wherein identifying the border of the rectangular display device further comprises: detecting a common color of a major portion of the border on opposite sides.

The method of claim 18, wherein detecting the common color of the major portion of the border on opposite sides further comprises detecting that at least a percentage of a plurality of pixels have the common color.
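The common-color check on opposite sides of the border might look like the following sketch, which assumes each side is sampled as a list of RGB image elements (pixels). The color quantization step and the 80% threshold are assumptions for illustration; the claim leaves the percentage unspecified.

```python
from collections import Counter


def dominant_color(pixels, step=32):
    """Return the most frequent quantized color and the fraction of pixels having it.

    step: hypothetical bucket width used so that near-identical shades count as one color.
    """
    counts = Counter(tuple(channel // step for channel in p) for p in pixels)
    color, n = counts.most_common(1)[0]
    return color, n / len(pixels)


def shares_common_color(side_a, side_b, min_fraction=0.8):
    """True if both sides are dominated by the same quantized color."""
    color_a, frac_a = dominant_color(side_a)
    color_b, frac_b = dominant_color(side_b)
    return color_a == color_b and frac_a >= min_fraction and frac_b >= min_fraction


left_border = [(10, 10, 12)] * 9 + [(200, 200, 200)]  # mostly dark, one glare pixel
right_border = [(15, 8, 20)] * 10                     # dark, slightly different shade
wall = [(200, 180, 160)] * 10                         # a light background strip
```

With these samples, `shares_common_color(left_border, right_border)` holds, while comparing `left_border` with `wall` does not.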

The method of claim 13, wherein identifying the perspective view of the rectangular display device further comprises: determining that the candidate quadrilateral shape satisfies more than one criterion, the criteria comprising being non-convex, having all internal angles greater than 110 degrees, having an area occupying a major portion of the image, having an aspect ratio approximately equal to a standard video aspect ratio, and having two adjacent angles of approximately the same angle.

The method of claim 1, further comprising: performing a histogram analysis, edge detection, and scale-invariant feature transform on a portion of the image within the selected quadrilateral to identify the corresponding media content.
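Of the three analyses named in this claim, the bar graph (histogram) analysis is the simplest to sketch. The example below compares normalized intensity histograms by histogram intersection; the function names, the bin count, and the use of intersection as the similarity measure are assumptions for illustration, not details taken from the claims.

```python
def gray_histogram(gray, bins=8):
    """Normalized intensity histogram of a grayscale crop (rows of ints 0-255)."""
    counts = [0] * bins
    total = 0
    for row in gray:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity between two normalized histograms, from 0 (disjoint) to 1 (identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


crop = [[0, 40], [130, 255]]
dark = [[0, 0], [0, 0]]
same = histogram_intersection(gray_histogram(crop), gray_histogram(crop))
different = histogram_intersection(gray_histogram(crop), gray_histogram(dark))
```

A crop compared with itself scores 1.0, while the mostly dissimilar dark crop scores 0.25 because only one quarter of the pixels fall in a shared bin.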

The method of claim 1, further comprising: determining an image processing constraint; and, in response to the image processing constraint, allocating image processing of the portion of the image within the quadrilateral between the mobile device and a remote server.

The method of claim 22, wherein the image processing constraint comprises: the ability of the mobile device to perform the image processing.

The method of claim 22, wherein the image processing constraint comprises, at least in part, a data transmission cost for transmitting via a transmission channel from the mobile device to the remote server.

The method of claim 22, wherein the image processing constraint comprises: a capability of a transmission channel from the mobile device to the remote server.

The method of claim 25, wherein allocating the image processing of the portion of the image comprises: in response to determining a low capability of the transmission channel, transmitting the image data comprising an image segment; in response to determining a medium capability of the transmission channel, transmitting the image data comprising the image segment after partial image processing; and in response to determining a high capability of the transmission channel, transmitting the image data comprising the image segment after all image processing.

The method of claim 1, further comprising: transmitting, to a remote server, image data obtained from a portion of the image within the quadrilateral; and receiving from the remote server a report of any match of the image data to a library of media content.

The method of claim 27, wherein receiving the report regarding the match from the remote server further comprises: determining that no match was identified; and repeating the receiving of an image from the camera of the mobile device, the detecting of a quadrilateral included in the image, and the capturing of visual media content contained within the quadrilateral to identify the visual media content, in order to obtain additional image data to be sent to the remote server.

The method of claim 27, wherein receiving the report regarding the match from the remote server further comprises: determining that the number of matches obtained from the report is of a size suitable for presentation on a user interface of the mobile device; and receiving a user selection of one media content from a list of media content derived from the report and presented on the user interface.

The method of claim 27, wherein receiving the report regarding the match from the remote server further comprises: determining that the number of matches obtained from the report has a size larger than a size suitable for presentation on a user interface of the mobile device; and, in response, sending an image segment to the remote server for additional image processing.
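The three outcomes of the match report described in the preceding claims (no match, a presentable number of matches, too many matches) could be dispatched as in this sketch. The function name, the action labels, and the `max_listable` limit are hypothetical and chosen only for illustration.

```python
def handle_match_report(matches, max_listable=5):
    """Decide the next step from the list of matches reported by the remote server.

    max_listable: hypothetical number of entries the user interface can present.
    Returns (action, items_to_present).
    """
    if not matches:
        # No match identified: capture again to obtain additional image data.
        return "recapture", []
    if len(matches) <= max_listable:
        # Few enough matches: present the list and wait for a user selection.
        return "present_list", list(matches)
    # Too many matches: send a further image segment for additional processing.
    return "send_more_image_data", []


none_found = handle_match_report([])
short_list = handle_match_report(["Program A", "Program B"])
too_many = handle_match_report(["Program %d" % i for i in range(12)])
```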

The method of claim 1, further comprising: receiving the matched media content for presentation by the mobile device.

The method of claim 31, further comprising: presenting the matched media content from a point identified by the captured visual media content.

The method of claim 31, further comprising: reformatting the matched media content for a user interface of the mobile device.

The method of claim 31, further comprising: transmitting an identifier of a user interface of the mobile device to prompt reformatting the matched media content.

The method of claim 31, further comprising: negotiating data management rights to present the matched media content.

The method of claim 1, further comprising: receiving audio captured by a microphone; and using a hash lookup function to assist in matching the image data with a library of media content.

A processor for recognizing visual media content, comprising: a first module for receiving an image from a camera of a mobile device; a second module for detecting a quadrilateral included in the image; and a third module for capturing visual media content included in the quadrilateral to identify the visual media content.

A computer program product for recognizing visual media content, comprising: a non-transitory computer readable storage medium, comprising: a first set of codes for causing a computer to receive an image from a camera of a mobile device; a second set of codes for causing the computer to detect a quadrilateral included in the image; and a third set of codes for causing the computer to capture visual media content contained in the quadrilateral to identify the visual media content.

An apparatus for recognizing visual media content, comprising: a component for receiving an image from a camera of a mobile device; a component for detecting a quadrilateral included in the image; and a component for capturing visual media content contained within the quadrilateral to identify the visual media content.

An apparatus for recognizing visual media content, comprising: a camera of a mobile device for generating an image; and a computing platform for detecting a quadrilateral included in the image received from the camera, and for capturing visual media content included in the quadrilateral to identify the visual media content.

The device as claimed in claim 40, wherein the computing platform is further configured to: capture the visual media content included in the quadrilateral to identify the visual media content by performing character recognition.

The device as recited in claim 40, wherein the computing platform is further for: capturing the visual media content contained within the quadrilateral to identify the visual media content by performing image recognition for the static visual media content.

The apparatus of claim 40, wherein the computing platform is further configured to: detect the quadrilateral included in the image by establishing a region of interest map based on object contrast, wherein the region of interest map identifies significant details in the image.

The apparatus of claim 40, wherein the computing platform is further configured to detect the quadrilateral included in the image by receiving a user input relative to a portion of a viewfinder depiction of the image.

The apparatus of claim 40, wherein the computing platform is further configured to: receive the image from the camera of the mobile device by receiving a plurality of consecutive frames, and wherein capturing the visual media content included in the quadrilateral to identify the visual media content further comprises performing video image recognition for dynamic visual media content.

The apparatus of claim 45, wherein the computing platform is further configured to: detect the quadrilateral included in the image by establishing a motion map by determining a difference between the plurality of consecutive frames.

The apparatus of claim 46, wherein the computing platform is further configured to: perform corner detection; and delete a corner point within the motion map.

The apparatus of claim 40, wherein the computing platform is further configured to detect the quadrilateral included in the image by establishing a region of interest map and cropping the image to include the region of interest map.

The apparatus of claim 40, wherein the computing platform is further configured to detect the quadrilateral included in the image by establishing a bright map by detecting a portion having a brighter illumination.

The apparatus of claim 49, wherein the computing platform is further configured to detect the quadrilateral included in the image by: performing corner detection; clustering corner points; and deleting a cluster of corner points within the bright map.

The apparatus of claim 40, wherein the computing platform is further configured to: detect the quadrilateral included in the image by establishing a depth map by detecting a depth of focus of a plurality of portions in the image.

The apparatus of claim 40, wherein the computing platform is further configured to: detect the quadrilateral included in the image by detecting, from four selected corner point clusters, a candidate quadrilateral shape that satisfies an identification criterion for identifying a perspective view of a rectangular display device.

The apparatus of claim 52, wherein the computing platform is further configured to identify the perspective view of the rectangular display device by deleting any candidate quadrilateral shape occupying a majority of the area of another quadrilateral, by determining whether the quadrilateral shape is large enough to contain all other candidate quadrilateral shapes.

The device as claimed in claim 52, wherein the computing platform is further configured to recognize the perspective view of the rectangular display device by recognizing a border of the rectangular display device.

The apparatus of claim 54, wherein the computing platform is further configured to: identify the border of the rectangular display device by detecting a common border thickness of portions of the border on opposite sides.

The apparatus of claim 55, wherein the computing platform is further configured to detect the common border thickness of the portions of the border on opposite sides by detecting a thickness that is a percentage, for example about 10%, of an enclosing dimension of the rectangular display device.

The apparatus of claim 54, wherein the computing platform is further configured to: identify the border of the rectangular display device by detecting a common color of a major portion of the border on opposite sides.

The apparatus of claim 57, wherein the computing platform is further configured to detect the common color of the major portion of the border on opposite sides by detecting that at least a percentage of a plurality of pixels have the common color.

The apparatus of claim 52, wherein the computing platform is further configured to: identify the perspective view of the rectangular display device by determining that the candidate quadrilateral shape satisfies more than one criterion, the criteria comprising being non-convex, having all internal angles greater than 110 degrees, having an area occupying a major portion of the image, having an aspect ratio approximately equal to a standard video aspect ratio, and having two adjacent angles of approximately the same angle.

The apparatus of claim 40, wherein the computing platform is further configured to: perform a histogram analysis, edge detection, and scale-invariant feature transform on a portion of the image within the selected quadrilateral to identify the corresponding media content.

The apparatus of claim 40, wherein the computing platform is further configured to: determine an image processing constraint; and, in response to the image processing constraint, allocate image processing of the portion of the image within the quadrilateral between the mobile device and a remote server.

The apparatus of claim 61, wherein the image processing constraint comprises: an ability of the mobile device to perform the image processing.

The apparatus of claim 61, wherein the image processing constraint comprises, at least in part, a data transmission cost for transmitting via a transmission channel from the mobile device to the remote server.

The apparatus of claim 61, wherein the image processing constraint comprises: a capability of a transmission channel from the mobile device to the remote server.

The apparatus of claim 64, wherein the transmitter is further configured to: in response to determining a low capability of the transmission channel, transmit the image data comprising an image segment; in response to determining a medium capability of the transmission channel, transmit the image data comprising the image segment after partial image processing; and in response to determining a high capability of the transmission channel, transmit the image data comprising the image segment after all image processing.

The apparatus of claim 40, further comprising: a transmitter for transmitting image data obtained from a portion of the image within the quadrilateral to a remote server; and a receiver for receiving from the remote server a report of any match of the image data to a library of media content.

The apparatus of claim 66, wherein the computing platform is further configured to receive from the remote server the report of any match by: determining that no match was identified; and repeating the receiving of an image from the camera of the mobile device, the detecting of a quadrilateral included in the image, and the capturing of visual media content contained within the quadrilateral to identify the visual media content, in order to obtain additional image data to be sent to the remote server.

The apparatus of claim 66, wherein the computing platform is further configured to receive, via the receiver, the report of any match from the remote server by: determining that the number of matches obtained from the report is of a size suitable for presentation on a user interface of the mobile device; and receiving a user selection of one media content from a list of media content derived from the report and presented on the user interface.

The apparatus of claim 67, wherein the computing platform is further configured to receive, via the receiver, the report of any match from the remote server by: determining that the number of matches obtained from the report has a size larger than a size suitable for presentation on a user interface of the mobile device; and, in response, sending an image segment to the remote server for additional image processing.

The device as recited in claim 60, wherein the receiver is further configured to: receive the matched media content for presentation by the mobile device.

The apparatus of claim 70, wherein the computing platform is further for presenting the matched media content from a point identified by the captured visual media content.

The device as claimed in claim 70, wherein the computing platform is further configured to: reformat the matched media content for a user interface of the mobile device.

The device of claim 70, wherein the transmitter is further configured to: send an identifier of a user interface of the mobile device to prompt reformatting the matched media content.

The apparatus of claim 70, wherein the computing platform is further configured to: negotiate data management rights with the receiver via the transmitter to present the matched media content.

The apparatus of claim 60, further comprising: a microphone for capturing audio, wherein the computing platform is further configured to: use a hash lookup function to assist in matching the image data with a library of media content.

A method comprising: capturing a digital image using a camera of a mobile communication device; determining a capability constraint of at least one of the mobile communication device, an air interface from the mobile communication device to a remote network, and a network server; based on the capability constraint, allocating image processing of the digital image between the mobile communication device and the network server; and receiving a result of image recognition.

The method of claim 76, wherein determining the capability constraint further comprises: determining a data transmission capability of the air interface.

The method of claim 76, wherein determining the capability constraint further comprises determining an image processing capability of the mobile communication device.

The method of claim 76, wherein allocating the image processing of the digital image between the mobile communication device and the network server based on the capability constraint further comprises: selecting one of local processing, shared processing, and remote processing.
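A selection among local, shared, and remote processing might be sketched as follows. The claims do not state which constraint leads to which mode, so the policy below, the `Capability` fields, and the `fast_link_kbps` threshold are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Capability:
    """Hypothetical summary of the capability constraints considered."""
    device_can_process: bool  # device is able to run full recognition itself
    link_kbps: float          # measured throughput of the link to the server


def select_processing(cap, fast_link_kbps=2000.0):
    """Pick one of 'local', 'shared', or 'remote' (assumed policy, not from the claims)."""
    if cap.device_can_process:
        return "local"   # device handles all image processing
    if cap.link_kbps >= fast_link_kbps:
        return "remote"  # link is fast enough to ship the image for server-side processing
    return "shared"      # device pre-processes to shrink the data, server finishes


capable_phone = select_processing(Capability(True, 300.0))
fast_link = select_processing(Capability(False, 5000.0))
slow_link = select_processing(Capability(False, 300.0))
```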