TW200951836A - Content identification method


Info

Publication number
TW200951836A
Authority
TW
Taiwan
Prior art keywords
information signal
data set
data
resolution
identity
Application number
TW098105677A
Other languages
Chinese (zh)
Inventor
Marijn Christian Damstra
Mehmet Utku Celik
Original Assignee
Koninkl Philips Electronics Nv
Application filed by Koninkl Philips Electronics Nv
Publication of TW200951836A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/487 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00086 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy
    • G11B20/00094 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving measures which result in a restriction to authorised record carriers
    • G11B20/00123 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving measures which result in a restriction to authorised record carriers the record carrier being identified by recognising some of its unique characteristics, e.g. a unique defect pattern serving as a physical signature of the record carrier
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10546 Audio or video recording specifically adapted for audio data

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Library & Information Science
  • Multimedia
  • Data Mining & Analysis
  • Databases & Information Systems
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Information Transfer Between Computers

Abstract

Disclosed is a method for identifying content within a received information signal. The method comprises receiving (S6) an information signal; generating (S8) a first data set by selecting a subset of data (7) using filter data; and processing (S9) said information signal with reference to said first data set to determine whether said information signal comprises particular content. Advantageously, the method further comprises processing (S12) the information signal with reference to a second data set (6) to determine whether said information signal comprises particular content. The invention allows content sharing sites such as YouTube to check whether uploaded files violate copyright rules where the upload includes content that was previously downloaded by the client. By keeping a list of the content that each client has downloaded, the server has an indication of a subset of content that is likely to be included in the next upload by the same client. As this subset is relatively small, a finer fingerprint resolution may be applied to it.

Description

VI. Description of the Invention:

[Technical Field]

The present invention relates to a method and apparatus for identifying content within a received information signal.

[Prior Art]

Many computers are now connected to the Internet via high-speed connections such as cable or DSL. Faster connections allow users to download files at much greater speeds than were previously possible, making the download of large files relatively quick. With the development of fast connections, the Internet has become a convenient mechanism for users to access media content such as video and music files.

Content sharing websites such as YouTube allow a community of users to each upload their own content, thereby giving other members of the community access to that content. Such websites have, in a relatively short time, become a popular way for users to distribute and consume media content, resulting in a huge body of data that is growing at an ever-increasing rate.

Among uploaded content, it is common to find content that a user has obtained without authorization from a source such as a content owner's website. Redistributing content without proper authorization, attribution and payment generally contravenes the relevant copyright laws and deprives a media owner of revenue to which it is properly entitled. This can also be a problem for the operators of content sharing websites, who may be liable for distributing media files in breach of copyright law even when they are unaware of the copyright status of the files being distributed.

One solution requires copyright owners to request that such content be removed from the website. Given the huge and growing body of content on public content sharing websites, it is impractical to expect a copyright owner to find and report every instance of content that infringes its copyright. A second, more fundamental problem is that this solution is reactive by nature: even if the content is successfully removed following a request, it remains available from the website, and therefore to the community, for a period of time.

A more proactive solution is to filter uploaded content automatically using fingerprints. With this approach, a fingerprint is stored for each multimedia content item that is subject to copyright protection. Each fingerprint is based on the properties of the content item. When a new content item is uploaded, its fingerprint can be determined and compared with the fingerprints of the content items that are subject to copyright protection. Depending on the licence associated with any matching content item in the database, the website can reject the upload, or arrange appropriate attribution and compensation. One such service is called MediaHedge and is available from Philips BU Content Identification.

Typical fingerprint systems store fingerprints of a particular resolution and can identify contiguous segments of content of a particular length (approximately 10 seconds). A problem arises when two or more content items are aggressively edited and mixed into a composition, a process referred to as co-creation. When the segments in the composition are shorter than the fingerprint resolution in use, a typical fingerprint system cannot determine the source of the material.

One potential solution to this problem is to maintain a database of fingerprints with a finer resolution, which allows shorter segments to be identified. Although finer-resolution fingerprints help identification, they also increase the search time and the likelihood of false matches.

[Summary of the Invention]

It is an object of the present invention to obviate or mitigate one or more of the problems set out above.

According to a first aspect of the present invention, there is provided a method for identifying content in a received information signal. The method comprises receiving an information signal; generating a first data set by selecting a subset of data using filter data; and processing the information signal with reference to the first data set to determine whether the information signal comprises particular content.

In this way, the information signal is processed with reference to a potentially reduced data set based on the filter data, thereby reducing processing time and improving efficiency. This allows the first data set to comprise data of a higher resolution than would otherwise be practical, increasing the likelihood of identifying the presence of particular content in an information signal.

The method may further comprise processing the information signal with reference to a second data set to determine whether the information signal comprises particular content. Processing the information signal with reference to the second data set may depend on the outcome of the processing with reference to the first data set.

Processing the information signal with reference to both the first and second data sets allows the information signal to be processed with reference to data sets of different resolutions. Accordingly, the first data set may be a relatively small high-resolution data set and the second data set a larger low-resolution data set. In this way, whether particular content is contained in an information signal can be determined by processing the information signal first with reference to a smaller, selected high-resolution data set and then with reference to a larger low-resolution data set.

The filter data may be based on an identity associated with a source of the received information signal. The filter data may be arranged to select content associated with that identity, for example content downloaded by the identity. The filter data may be stored in a profile associated with an identity. The filter data may be stored in the profile when content is downloaded by the identity.

The filter data may be selected from the group consisting of the identity's IP address, geographic data and media type preferences.

The first data set may comprise a plurality of fingerprints of a first resolution, and the second data set may comprise a plurality of fingerprints of a second resolution. Preferably, the first resolution is relatively high and the second resolution is relatively low.

In this way, embodiments of the invention allow a relatively small number of computationally expensive high-resolution fingerprints to be processed together with a larger number of low-resolution, computationally cheaper fingerprints.

Processing with reference to the first data set may comprise generating a high-resolution fingerprint of the received information signal and comparing it with the fingerprints of the first data set. Processing with reference to the second data set may comprise generating a low-resolution fingerprint of the received information signal and comparing it with the fingerprints of the second data set.

The method may further comprise performing a first at least one action if the information signal comprises the particular content, and performing a second at least one action if it does not. The first at least one action may comprise any of preventing upload of the information signal, restricting distribution of the uploaded information signal, arranging appropriate attribution, or arranging appropriate compensation. The second at least one action may comprise allowing upload of the information signal and storing it in a database.

According to a second aspect of the present invention, there is provided a method for providing a requested information signal. The method comprises receiving a request for an information signal; determining an identity associated with the request; generating and storing a high-resolution fingerprint of the requested information signal in response to the request; updating a profile associated with the identity in response to the request; and transmitting the information signal.

Updating a profile may comprise storing in the profile a reference to the requested information signal.

Determining an identity associated with the request may comprise determining an identity of the source of the request.

The invention also provides a computer program for carrying out the methods described above. Such a computer program may be carried on a suitable carrier medium. The carrier medium may be a tangible carrier medium such as a floppy disk, hard disk, CD or DVD, or an intangible carrier medium such as a communication signal.

A further aspect of the invention provides apparatus for carrying out the methods described above.

[Embodiments]

Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings.

Figure 1 shows a plurality of computers connected to the Internet 1. Two user computers 2, 3 can be seen connected to the Internet 1, as is a server computer 4. The user computers 2, 3 and the server computer 4 can transfer data between one another via the Internet 1 using methods that will be apparent to those of ordinary skill in the art.

The server computer 4 has access to a content database 5, which stores a plurality of files that can be downloaded to the user computers 2, 3. The server computer 4 is configured to receive files from the user computers 2, 3 and to store them in the content database 5. The server computer 4 also has access to a low-resolution fingerprint database 6, which stores low-resolution fingerprints of all content stored in the content database 5. A fingerprint of a particular content item is data generated by processing that content item and is based on the characteristics of that content item. Low-resolution fingerprints are generated from relatively large segments of content (for example, 5 or 10 seconds of a video file).
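
The notion of fingerprint resolution described above can be illustrated with a toy Python sketch. It is only an illustration under stated assumptions: content is modelled as a list of byte-string "frames", a fingerprint is an exact SHA-256 hash over a sliding window of frames, and the names `fingerprints`, `reference` and `mashup` are hypothetical. Real fingerprinting systems use robust perceptual features rather than exact hashes, and the patent does not prescribe any particular algorithm.

```python
import hashlib


def fingerprints(frames, window):
    """Return the set of toy fingerprints of `frames` at a given resolution.

    One fingerprint is computed per sliding window of `window` frames.
    Content shorter than `window` frames yields no fingerprints at all.
    """
    return {
        hashlib.sha256(b"|".join(frames[i:i + window])).hexdigest()
        for i in range(len(frames) - window + 1)
    }


# Hypothetical content: a protected reference item and unrelated material.
reference = [f"ref-{i}".encode() for i in range(20)]
other = [f"new-{i}".encode() for i in range(20)]

# A composition that embeds only a 3-frame excerpt of the reference item.
mashup = other[:6] + reference[8:11] + other[6:12]

# Coarse resolution (5 frames per fingerprint) misses the short excerpt;
# fine resolution (1 frame per fingerprint) finds it.
coarse_hit = bool(fingerprints(mashup, 5) & fingerprints(reference, 5))
fine_hit = bool(fingerprints(mashup, 1) & fingerprints(reference, 1))
print(coarse_hit, fine_hit)
```

In this toy setting the 3-frame excerpt is shorter than the 5-frame window, so no coarse fingerprint of the composition can match the reference item, while the frame-level fingerprints do match. The finer resolution comes at the cost of many more fingerprints to store and compare.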

The stored low-resolution fingerprints can then be used to identify whether a received file contains a particular content item, by generating a low-resolution fingerprint of the received file and comparing it with each low-resolution fingerprint stored in the low-resolution fingerprint database 6. A fingerprint can be used to identify content segments that are at least as large as the resolution of the fingerprint. For example, if a set of low-resolution fingerprints has a resolution of 5 seconds, identifying particular content in a video file using those fingerprints may require the video file to contain at least 5 seconds of that content.

The server computer 4 also has access to a high-resolution fingerprint database 7. High-resolution fingerprints are generated to correspond to relatively small segments of content (for example, one frame of a video file). Each high-resolution fingerprint can therefore be used to identify particular content from smaller segments than would be possible with a low-resolution fingerprint. For example, if high-resolution fingerprints of the type described are used, identifying particular content in a video file requires the video file to contain only one frame of that content.

The server computer 4 also has access to a user profile database 8, which holds a profile for each user of the user computers 2, 3 who accesses the server computer 4. Before communicating with the server computer 4, a user of one of the user computers 2, 3 first registers their identity with the server computer 4 using methods well known to those skilled in the art. This registration process generates a user profile for the user, which is stored in the user profile database 8. Although the registration process may require a user to actively create a user profile, alternative techniques, including cookie-based or IP-address-based profiling, may also be used.

Figure 2 is a flowchart showing the processing carried out when a file is downloaded from the server computer 4 to the user computer 2. It will be appreciated that the method is the same if the file is downloaded to the user computer 3. The processing shown in the flowchart of Figure 2 is carried out by the server computer 4.

At step S1, a request for a file is received from a user of the user computer 2. At step S2, the server computer 4 determines the identity of the user making the request. This can be achieved, for example, by requiring the user to provide a user name and password before they can download a file. Once the identity of the user has been determined, processing passes to step S3, where the corresponding user profile in the user profile database 8 is updated so that it indicates that the user has downloaded the requested file. Processing then passes to step S4, where the server computer 4 generates a high-resolution fingerprint of the requested file and stores it in the high-resolution fingerprint database 7. Processing then passes to step S5, where the server computer 4 sends the requested file to the user. In this way, the server computer 4 has access to a high-resolution fingerprint of every file downloaded from the content database 5.

Figure 3 is a flowchart showing a method for controlling the upload of a file by a user from the user computer 2 to the content database 5 associated with the server computer 4. It will be appreciated that the method is the same if the file is uploaded from the user computer 3. The processing shown in the flowchart of Figure 3 is carried out by the server computer 4.

At step S6, a file for upload is received from the user computer 2. At step S7, a high-resolution fingerprint of the uploaded file is generated. At step S8, a subset of the high-resolution fingerprints in the high-resolution fingerprint database 7 is selected on the basis of filter data. In this embodiment, the filter data is a list of the files previously downloaded by the uploading user, obtained from that user's profile in the user profile database 8. At step S9, the generated high-resolution fingerprint is compared with each high-resolution fingerprint in the subset selected at step S8. If the fingerprint of the upload matches one or more fingerprints in the subset, the uploaded file contains content from one of the files that the user previously downloaded from the content database 5. The file is therefore rejected and processing ends at step S10. If the generated high-resolution fingerprint does not match any fingerprint in the subset, processing passes to step S11. The upload of a particular file may be prohibited for a number of reasons; for example, it may be prohibited if it would contravene the relevant copyright laws. In the present case, any content contained in the content database 5 is considered to be restricted, and upload is therefore prevented at step S10.

At step S11, a low-resolution fingerprint of the uploaded file is generated. At step S12, the generated low-resolution fingerprint is compared with all of the low-resolution fingerprints in the low-resolution fingerprint database 6. If it matches one or more of them, the uploaded file contains content from one of the files stored in the content database 5. The file is therefore rejected and processing ends at step S13. If it does not match any fingerprint in the low-resolution fingerprint database 6, the upload is allowed, the uploaded file is stored in the content database 5, and its low-resolution fingerprint is stored in the low-resolution fingerprint database 6.

It can be seen from the foregoing that one check is made on the basis of a selected subset of high-resolution fingerprints. The use of high-resolution fingerprints allows relatively small portions of particular content to be identified, while the use of a subset reduces the time taken to carry out the search. The use of low-resolution fingerprints provides a second tier of determination as to whether particular content is included in the uploaded content. Although the low-resolution fingerprints can detect only relatively large portions of particular content, their use provides a fallback that can find copyright-protected content which was not downloaded from the content database 5 by the uploading user and which therefore has no high-resolution fingerprint in the selected subset.

The foregoing relates to an embodiment in which the selected subset of detailed fingerprints is chosen on the basis of the uploading user's download history stored in the user profile database 8. It will be appreciated that other criteria can be used. Indeed, the invention may use one or any combination of criteria including, but not limited to, IP address, demographic information and preferred media types.

Furthermore, the foregoing relates to an embodiment in which an uploaded file is rejected if it contains content from a file stored in the content database 5. It will be appreciated that in alternative embodiments of the invention the system may take a number of actions depending on the requirements of the copyright owner. Potential actions include, but are not limited to, arranging correct attribution, arranging any necessary payment, or simply allowing or rejecting the upload.

Figure 4 is a decision tree showing the decisions taken to identify the source of an uploaded file in accordance with a generalized embodiment of the invention. A file $C_i$ is received at the server computer 4. At decision 9, it is determined whether the following holds:

$F_H(C_i) \in D_H$

that is, whether the high-resolution fingerprint of the file $C_i$ is a member of the set of high-resolution fingerprints stored in the high-resolution fingerprint database 7. If the high-resolution fingerprint of $C_i$ is contained in the high-resolution fingerprint database 7, the file $C_i$ has been downloaded from the content database 5 and is a member of the set of downloaded files. The licence for the file $C_i$ therefore needs to be checked to determine an appropriate course of action, for example arranging appropriate attribution.

If the high-resolution fingerprint of $C_i$ is not contained in the high-resolution fingerprint database 7, then at decision 10 it is determined whether the following holds:

$F_L(C_i) \in D_L$

that is, whether the low-resolution fingerprint of $C_i$ is a member of the set of low-resolution fingerprints stored in the low-resolution fingerprint database 6. If the low-resolution fingerprint of $C_i$ is contained in the low-resolution fingerprint database 6, the file is contained in the content database 5. The licence for the file $C_i$ therefore needs to be checked to determine an appropriate course of action.

If the low-resolution fingerprint is not contained in the low-resolution fingerprint database 6, $C_i$ is not contained in the content database 5 and is therefore assumed to be new content.

The low- and high-resolution fingerprints can be derived from content in a variety of ways well known to those skilled in the art. For example, a particular embodiment that processes audio-video content may compute the low-resolution fingerprints as fixed-length binary strings, each representing a group of frames corresponding to 5 seconds of video, while the high-resolution fingerprints may be further fixed-length binary strings derived from individual frames. An embodiment configured in this way would allow the system to identify individual frames from content that a client has previously downloaded from the server, while requiring 5 seconds of content to correctly identify material that the client has not previously downloaded from the server.

It will be appreciated that the embodiments of the invention described above can be used to control the upload of any type of file to a database. For example, embodiments of the invention can be used with audio, video, image, photo and multimedia files.

It will be appreciated that the computer network of Figure 1 is merely illustrative. In many embodiments of the invention the network may comprise more than two user computers 2, 3.
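
As a hedged illustration of the embodiment described above, the download bookkeeping of Figure 2 (steps S1 to S5) and the two-tier upload check of Figure 3 (steps S6 to S13) can be sketched in Python. Everything here is an assumption made for illustration: the `Server` class, its method names, the use of in-memory dictionaries for databases 5 to 8, the frame counts `HIGH_RES` and `LOW_RES`, and the exact-hash toy fingerprint, which stands in for whatever robust fingerprint a real system would use. Authentication at step S2 is assumed to have happened already.

```python
import hashlib
from dataclasses import dataclass, field

HIGH_RES = 1  # frames per high-resolution fingerprint (assumed)
LOW_RES = 5   # frames per low-resolution fingerprint (assumed stand-in for "5 seconds")


def fingerprints(frames, window):
    """Toy fingerprints: one exact hash per sliding window of `window` frames."""
    return {
        hashlib.sha256(b"|".join(frames[i:i + window])).hexdigest()
        for i in range(len(frames) - window + 1)
    }


@dataclass
class Server:
    """Hypothetical stand-in for server computer 4 and databases 5 to 8."""
    content: dict = field(default_factory=dict)   # database 5: file id -> frames
    low_res: dict = field(default_factory=dict)   # database 6: file id -> fingerprints
    high_res: dict = field(default_factory=dict)  # database 7: file id -> fingerprints
    profiles: dict = field(default_factory=dict)  # database 8: user -> downloaded ids

    def store(self, file_id, frames):
        """Store a file and its low-resolution fingerprints."""
        self.content[file_id] = frames
        self.low_res[file_id] = fingerprints(frames, LOW_RES)

    def download(self, user, file_id):
        """Figure 2: record the download, fingerprint the file, send it."""
        frames = self.content[file_id]                            # S1, S2 assumed done
        self.profiles.setdefault(user, set()).add(file_id)        # S3
        self.high_res[file_id] = fingerprints(frames, HIGH_RES)   # S4
        return frames                                             # S5

    def upload(self, user, file_id, frames):
        """Figure 3: check the user's download subset first, then everything."""
        upload_high = fingerprints(frames, HIGH_RES)              # S7
        downloaded = self.profiles.get(user, set())               # S8: filter data
        if any(upload_high & self.high_res[f] for f in downloaded):   # S9
            return "rejected: matches a previous download"        # S10
        upload_low = fingerprints(frames, LOW_RES)                # S11
        if any(upload_low & fp for fp in self.low_res.values()):  # S12
            return "rejected: matches stored content"             # S13
        self.store(file_id, frames)
        return "accepted"


server = Server()
film = [f"film-{i}".encode() for i in range(30)]
home = [f"home-{i}".encode() for i in range(30)]
server.store("film", film)
server.download("alice", "film")

# A composition containing only a 2-frame excerpt of the film.
mashup = home[:10] + film[4:6] + home[10:20]

r1 = server.upload("alice", "a1", mashup)     # alice downloaded the film
r2 = server.upload("bob", "b1", mashup)       # bob did not
r3 = server.upload("bob", "b2", film[:10])    # long excerpt, caught at low resolution
print(r1, r2, r3, sep="\n")
```

In this toy run the short excerpt is caught only for the user whose profile lists the film, because only that user's upload is compared against frame-level fingerprints. The design choice being illustrated is that the expensive fine-grained comparison is restricted to a small, user-specific subset, while the coarse comparison covers the whole database.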

買Ή厚」廣泛地理解以涵蓋任何 用於儲存資料使得所儲存之眘斜 > 子炙貝枓了破操取的機構。即,雖 然s亥術語涵蓋正規化資料庫 早s理糸統,其亦可包含諸如一 種由許多作業系統提供的檔 儲存系統。 储存糸統之類型的任何資料 ’應理解在不脫離附 出多種修飾。 上文已描述本發明之實施例。然而 加申請專利範圍下可對所述實施例作 【圖式簡單說明】 圖1為一種電腦網路之概要圖; 138157.doc •14· 200951836 圖2為一種根據本發明之一實施例從圖1之伺服器下載一 檔案之方法的流程圖; 圖3為一種根據本發明之一實施例控制檔案上傳至圖^之 伺服器之方法的流程圖;及 圖4為一根據本發明之一實施例決定上傳樓案 決策樹。 之來源的 【主要元件符號說明】 Φ 1 2 3 4 5 6 7 8 網際網路 使用者電腦 使用者電腦 伺服器電腦 内文資料庫 低解析度指紋資料庫 高解析度指紋資料庫 使用者設定檔 參 138157.doc 15·"Buy thick" is widely understood to cover any mechanism used to store data so that the stored cautiousness > That is, although the terminology covers a regularized database, it may also include a file storage system such as that provided by many operating systems. Any information on the type of storage system ‘ should be understood without departing from the various modifications. Embodiments of the invention have been described above. However, in the scope of the patent application, the embodiment may be briefly described. FIG. 1 is a schematic diagram of a computer network; 138157.doc • 14· 200951836 FIG. 2 is a diagram of an embodiment of the present invention. 1 is a flow chart of a method for downloading a file by a server; FIG. 3 is a flow chart of a method for controlling uploading of a file to a server according to an embodiment of the present invention; and FIG. 4 is an implementation according to the present invention. The example decides to upload the decision tree for the project. Source [Key component symbol description] Φ 1 2 3 4 5 6 7 8 Internet user computer user computer server computer internal database low resolution fingerprint database high resolution fingerprint database user profile Reference 138157.doc 15·

Claims (1)

VII. Claims:

1. A method for identifying content in a received information signal, the method comprising:
receiving (S6) an information signal;
generating (S8) a first data set by selecting a subset of data (7) using filter data; and
processing (S9) the information signal with reference to the first data set to determine whether the information signal includes particular content.

2. The method of claim 1, further comprising:
processing (S12) the information signal with reference to a second data set (6) to determine whether the information signal includes particular content.

3. The method of claim 1, further comprising:
depending on a result of the processing (S9) with reference to the first data set, processing (S12) the information signal with reference to a second data set (6) to determine whether the information signal includes particular content.

4. The method of claim 1, 2 or 3, wherein the filter data is based on an identity associated with a source of the received information signal.

5. The method of claim 4, wherein the filter data is arranged to select content associated with the identity.

6. The method of claim 4, wherein the filter data is stored in a profile (8) associated with the identity.

7. The method of claim 4, wherein the filter data is stored when content is downloaded by the identity.

8. The method of claim 4, wherein the filter data is selected from the group consisting of an IP address of the identity, geographic data and media type preferences.

9. The method of claim 2, wherein:
the first data set comprises a plurality of fingerprints of a first resolution; and
the second data set (6) comprises a plurality of fingerprints of a second resolution.

10. The method of claim 2, wherein:
the first data set comprises relatively high-resolution fingerprints; and
the second data set (6) comprises a plurality of relatively low-resolution fingerprints.

11. The method of claim 10, wherein:
the processing with reference to the first data set comprises generating a high-resolution fingerprint of the received information signal and comparing it with the first data set; and
the processing with reference to the second data set comprises generating a low-resolution fingerprint of the received information signal and comparing it with the second data set.

12. The method of claim 1, further comprising:
preventing upload of the information signal if the information signal includes the particular content; and
allowing upload of the information signal if the information signal does not include the particular content.

13. A method for providing a requested information signal, comprising:
receiving (S1) a request for an information signal;
determining (S2) an identity associated with the request;
generating (S4) a high-resolution fingerprint of the requested information signal and storing the generated high-resolution fingerprint in response to the request;
updating a profile (8) associated with the identity in response to the request; and
transmitting (S5) the information signal.

14. The method of claim 13, wherein updating the profile (8) comprises storing in the profile a reference to the requested information signal.

15. The method of claim 13 or 14, wherein determining an identity associated with the request comprises determining an identity of a source of the request.

16. A computer program configured to control a computer to carry out a method according to any preceding claim.

17. A carrier medium carrying a computer program according to claim 16.

18. A computer apparatus for identifying content in a received information signal, the computer apparatus comprising:
a memory storing processor-readable instructions; and
a processor configured to read and execute instructions stored in the program memory;
wherein the processor-readable instructions comprise instructions controlling the processor to carry out a method according to any one of claims 1 to 15.

19. An apparatus for identifying content in a received information signal, the apparatus comprising:
a receiver arranged to receive an information signal;
a processor arranged to:
generate a first data set by selecting a subset of data using filter data; and
process the information signal with reference to the first data set to determine whether the information signal includes particular content.

20. An apparatus for providing a requested information signal, the apparatus comprising:
a receiver arranged to receive a request for an information signal;
a processor arranged to:
determine an identity associated with the request;
generate, in response to the request, a high-resolution fingerprint of the requested information signal;
update, in response to the request, a profile associated with the identity; and
transmit the information signal.
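Purely as an illustration, the interplay between the download method of claims 13 and 14 and the upload method of claims 1, 3 and 12 might be sketched as follows. Everything named here is hypothetical: the `Server` class, its method names and the `toy_fp` helper are assumptions made for the example, and a truncated SHA-256 digest merely stands in for a real media fingerprint, which would be derived from the perceptual content of the signal. The sketch also assumes that a match against either data set prevents the upload.

```python
import hashlib


def toy_fp(data: bytes, hex_chars: int) -> str:
    """Stand-in fingerprint: a truncated SHA-256 digest (not a real media fingerprint)."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


class Server:
    def __init__(self):
        self.content_db = {}  # content id -> bytes (content database 5)
        self.low_db = set()   # low-resolution fingerprints (database 6)
        self.high_db = {}     # content id -> high-resolution fingerprint (database 7)
        self.profiles = {}    # identity -> ids of downloaded content (profile 8)

    def add_content(self, content_id, data):
        """Register content together with its low-resolution fingerprint."""
        self.content_db[content_id] = data
        self.low_db.add(toy_fp(data, 8))

    def download(self, identity, content_id):
        """Claims 13 and 14: fingerprint, update the profile, transmit."""
        data = self.content_db[content_id]
        # S4: generate and store the high-resolution fingerprint.
        self.high_db[content_id] = toy_fp(data, 64)
        # Store a reference to the requested content in the profile.
        self.profiles.setdefault(identity, set()).add(content_id)
        # S5: transmit the information signal.
        return data

    def upload_allowed(self, identity, data):
        """Claims 1, 3 and 12: return False when the upload should be prevented."""
        downloaded = self.profiles.get(identity, set())
        # S8: the profile acts as filter data selecting a subset of database 7.
        first_set = {fp for cid, fp in self.high_db.items() if cid in downloaded}
        # S9: process the signal with reference to the first data set.
        if toy_fp(data, 64) in first_set:
            return False
        # S12: no match, so fall back to the second data set.
        if toy_fp(data, 8) in self.low_db:
            return False
        return True
```

Filtering database 7 by the uploader's profile keeps the first comparison small, since only content that this identity has actually downloaded is considered before the broader low-resolution lookup is attempted.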
TW098105677A 2008-02-26 2009-02-23 Content identification method TW200951836A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP08151925 2008-02-26

Publications (1)

Publication Number Publication Date
TW200951836A true TW200951836A (en) 2009-12-16

Family

ID=40934217

Family Applications (1)

Application Number Title Priority Date Filing Date
TW098105677A TW200951836A (en) 2008-02-26 2009-02-23 Content identification method

Country Status (2)

Country Link
TW (1) TW200951836A (en)
WO (1) WO2009107049A2 (en)

Also Published As

Publication number Publication date
WO2009107049A2 (en) 2009-09-03
WO2009107049A3 (en) 2009-12-10
