TWI550540B

TWI550540B - Advertising trading system

Info

Publication number: TWI550540B
Application number: TW104144752A
Authority: TW
Inventors: Kuan Lan Wang
Original assignee: Wang Kung Lan
Priority date: 2014-03-31
Filing date: 2014-03-31
Publication date: 2016-09-21
Also published as: TW201614567A

Description

Advertising transaction system

The present invention relates to an advertisement transaction system, and more particularly to an advertisement transaction system suitable for any electronic medium capable of transmitting audio.

Speech processing technology includes speech recognition and speaker recognition. Speech recognition identifies the words in speech and is currently used mostly in automated human-machine interfaces. Speaker recognition identifies who is speaking; a common form is voiceprint recognition, which is mainly used to identify individuals in surveillance or evidence gathering.

Commercial advertising, meanwhile, is everywhere, carried by electronic media such as in-store music, telephone, television, radio, and websites, all with the aim of increasing sales. How to make advertising achieve its marketing purpose more effectively and quickly is therefore a challenge for existing techniques for transacting through electronic media.

The object of the present invention is to provide an advertisement transaction system that addresses the shortcomings of the prior art.

In some embodiments, the advertisement transaction system of the present invention has an audio management server and an account management server that communicate with a plurality of advertiser users. The account management server creates account data for the advertiser users and forwards each advertiser user's audio data to the audio management server. Each advertiser user uploads to the audio management server a sound source to be played on an electronic medium. The audio management server processes each audio segment through an audio conversion procedure into voiceprint data, which corresponds to a trajectory map containing time and frequency, and processes the voiceprint data into predetermined trajectory data that retains the main trajectory features and removes background noise. At the request of the account management server, the audio management server creates a URL for the product information corresponding to each audio segment.

In some embodiments, the audio management server processes the audio by cutting the audio segment into small segments and applying a Fourier transform and a wavelet transform to the partially overlapping small segments, obtaining several frequency peaks for each instant of the period. From the frequency peaks at each instant it plots a two-dimensional trajectory map whose axes are time and frequency, and converts the two-dimensional trajectory map into a binarized sparse matrix.

In some embodiments, the audio management server processes the sparse matrix with a clustering step that produces the predetermined trajectory data. The clustering step uses a density-based clustering algorithm that removes background noise by defining the maximum radius of a neighborhood and the minimum number of points within that neighborhood.

In some embodiments, the processing by the audio management server further includes multi-resolution processing to produce trajectory data to be compared with a reduced data volume.

In some embodiments, the advertiser user uploads to the audio management server a sound source to be played on a store sound system, a telephone, a television, a radio broadcast, or a website.

In some embodiments, the advertisement transaction system further includes an account management server that creates account data for a plurality of advertiser users and forwards each advertiser user's audio data to the audio management server.

In some embodiments, the advertisement transaction system further includes an account management server that creates account data for a plurality of user terminals and forwards each user terminal's audio data to the audio management server.

In some embodiments, the advertisement transaction system further includes a customer service server that operates together with a payment server. On request from a user terminal, the customer service server sends that terminal a content message carrying a product URL, through which the terminal sends a purchase request message for the product to the payment server.

The effect of the present invention is that, by building a voiceprint database and comparing trajectory data, noise can be removed to avoid misidentification, and the source of a captured audio segment can be matched quickly and accurately. The system therefore has commercial value and is applicable to any electronic medium capable of transmitting audio.

1‧‧‧User terminal

200‧‧‧Communication network

30‧‧‧Sample database

31‧‧‧Customer service server

32‧‧‧Account management server

33‧‧‧Audio management server

331‧‧‧Conversion module

332‧‧‧Clustering module

333‧‧‧Comparison module

34‧‧‧Payment server

300‧‧‧Transaction system

S11~S26‧‧‧Messages

301~308‧‧‧Steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which:
FIG. 1 is a system diagram showing that the preferred embodiment of the voiceprint-based transaction method of the present invention is applied to a user terminal and a transaction system;
FIG. 2 is a schematic diagram showing the communication between the user terminal and the transaction system in the voiceprint-based transaction method of the present invention;
FIG. 3 is a flowchart of the preferred embodiment of the voiceprint-based transaction method of the present invention;
FIG. 4 is a schematic diagram showing that this embodiment sets each 32-millisecond audio frame as one time unit and applies the short-time Fourier transform to frames that overlap by 50%;
FIG. 5 is a schematic diagram showing the peak value corresponding to each frequency at a given instant;
FIGS. 6a and 6b are schematic diagrams showing the two-dimensional trajectory map before and after background noise points are removed;
FIGS. 7a and 7b are schematic diagrams showing sparse matrices of different levels;
FIG. 8 is a schematic diagram showing the binarized sparse matrix stored as a matrix of integer values;
FIG. 9 is a schematic diagram showing the binarized sparse matrix stored as an array of integer values; and
FIGS. 10a to 10c are schematic diagrams showing the client-side trajectory data, the server-side trajectory data, and the result of comparing the two.

Referring to FIG. 1, the preferred embodiment of the voiceprint-based transaction method of the present invention is applied to a user terminal 1 and a transaction system 300. Preferably, the transaction system 300 is an advertisement transaction system that communicates with at least one advertiser user (not shown). The advertiser user can upload to the transaction system 300 a sound source to be played on an electronic medium, to be processed by the voiceprint-based transaction method of the present invention. The sound source is the soundtrack or voice played in an advertisement on an electronic medium, and the electronic medium is any medium capable of transmitting audio, such as a store sound system, telephone, television, radio broadcast, or website.

In addition, the transaction system 300 can communicate with a user terminal 1, and the user terminal 1 can capture an audio segment covering some period of the sound source and send it to the transaction system 300. For example, the user terminal 1 may be (but is not limited to) a smartphone, which records a short audio segment (e.g., 5 seconds) of the soundtrack or voice of an advertisement being played and sends that audio segment to the transaction system 300.

The transaction system 300 includes a customer service server 31, an account management server 32, an audio management server 33, and a payment server 34. The user terminal 1, the customer service server 31, the account management server 32, the audio management server 33, and the payment server 34 exchange messages with one another over a communication network 200. The communication network 200 may use any wireless and/or wired communication architecture; any network architecture capable of carrying voice data falls within the scope of the present invention.

Referring to FIG. 2, the flow of the voiceprint-based transaction method of the present invention is described below.

The customer service server 31 mainly performs the following steps. It sends the account management server 32 a request message S11 for an account for an advertiser user of the service. It receives from the account management server 32 a message S14 with the advertiser user's registration result. It then uploads to the account management server 32 a message S15 containing the URL paired with the audio segment and the classification information. It receives from the account management server 32 an acceptance message S18 for the audio-segment URL pairing. It receives from the user terminal 1 a request message S23 asking for the content that carries the product URL. It sends the user terminal 1 a content message S24 carrying the product URL.

The account management server 32 creates account data for a plurality of advertiser users and forwards each advertiser user's audio data to the audio management server 33 so that it can build predetermined trajectory data. It also creates account data for a plurality of user terminals 1 and forwards each user terminal's audio data to the audio management server 33 so that it can build trajectory data to be compared (its role is described later).

The account management server 32 mainly performs the following steps. It receives from the customer service server 31 the request message S11 for an account for an advertiser user of the service. It sends the audio management server 33 a request message S12 to provide storage space for the advertiser user. It receives a request message S13 from the user terminal 1 to create a registered account. It sends the customer service server 31 a message S14 with the registration result of the user terminal 1. It receives from the customer service server 31 the message S15 with the URL paired with the audio segment. It sends the audio management server 33 a request message S16 to store the advertiser user's audio segment on the audio management server 33 and pre-process it for later recognition. It receives from the audio management server 33 a request message S17 to build the lookup-table URL index for the audio segment. It sends the customer service server 31 the acceptance message S18 for the audio-segment URL pairing. It receives from the user terminal 1 a message S19 containing an audio segment. It forwards the audio segment in a message S20 to the audio management server 33. It receives from the audio management server 33 a message S21 with the product URL matched to the audio segment. It forwards a message S22 containing the product URL to the user terminal 1.

The audio management server 33 mainly performs the following steps. It receives from the account management server 32 the request message S12 to provide storage space for the advertiser user. It receives from the account management server 32 the request message S16 to store the audio segment and pre-process it for later recognition. It sends the account management server 32 a request message S17 to build the lookup-table URL index for the audio segment. It receives from the account management server 32 the message S20 containing an audio segment and looks up the corresponding product by matching the features of the audio segment. It sends the account management server 32 a message S21 with the product URL matched to the audio segment.

Then, the user terminal 1 sends a product purchase request message S25 to the payment server 34. The payment server 34 sends the user terminal 1 a message S26 with the payment and delivery details for the product.
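
The message sequence S11 to S26 described above can be summarized as a small table. The sketch below is only an illustration: the tuple layout, the short node names, and the helper `messages_between` are assumptions made here, while the message identifiers and directions follow the text.

```python
# Message flow S11-S26 as (id, sender, receiver, purpose).
# Node names are illustrative labels, not identifiers from the patent.
MESSAGES = [
    ("S11", "customer_service", "account", "request advertiser account"),
    ("S12", "account", "audio", "request storage space for advertiser"),
    ("S13", "terminal", "account", "request a registered account"),
    ("S14", "account", "customer_service", "registration result"),
    ("S15", "customer_service", "account", "URL paired with audio segment"),
    ("S16", "account", "audio", "store and pre-process audio segment"),
    ("S17", "audio", "account", "request lookup-table URL index"),
    ("S18", "account", "customer_service", "accept URL pairing"),
    ("S19", "terminal", "account", "recorded audio segment"),
    ("S20", "account", "audio", "forward audio segment"),
    ("S21", "audio", "account", "product URL matching the segment"),
    ("S22", "account", "terminal", "forward product URL"),
    ("S23", "terminal", "customer_service", "request product content"),
    ("S24", "customer_service", "terminal", "product URL content"),
    ("S25", "terminal", "payment", "purchase request"),
    ("S26", "payment", "terminal", "payment and delivery details"),
]


def messages_between(a, b):
    """Return the ids of all messages exchanged between nodes a and b."""
    return [m[0] for m in MESSAGES if {m[1], m[2]} == {a, b}]
```

Laid out this way, it is easy to see that the account management server relays every exchange between the user terminal and the audio management server, and that the payment server only appears in the last two messages.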

In this embodiment, the audio management server 33 includes a conversion module 331, a clustering module 332, a comparison module 333, and a sample database 30. The conversion module 331 processes the audio segment through an audio conversion procedure into voiceprint data. The clustering module 332 clusters the voiceprint data to produce predetermined trajectory data that retains the main trajectory features and removes background noise. The comparison module 333 uses the trajectory data to be compared as an index and checks whether it is similar to the predetermined trajectory data pre-stored in the sample database 30; if so, it outputs the corresponding information content to the user terminal 1 via the account management server 32. A transaction module 334 acts on a message containing that information content sent by the user terminal 1 (e.g., the message S21 containing the product URL matched to the audio segment); the user terminal 1 can then send a product purchase request (e.g., message S25) to the payment server 34 to carry out the transaction for that product URL.

Referring to FIG. 3 together with FIG. 2, the preparation procedure of the audio management server 33 is described below.

The audio management server 33 obtains the advertisement audio (step 301), for example an advertisement music file uploaded by an advertiser user via the account management server 32. Then, at the request of the account management server 32, the audio management server 33 stores the audio segment uploaded by the advertiser user and, through the conversion module 331, processes the advertisement audio into Fourier transform data (step 302). In this embodiment, the audio segment is cut into small segments and a short-time Fourier transform (STFT) is applied to the partially overlapping small segments to obtain the Fourier transform data.

Referring to FIG. 4, each 32-millisecond audio frame is set as one time unit, and the short-time Fourier transform is applied to frames that overlap by 50%.
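
The framing scheme can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `frame_starts` and the 8 kHz sample rate are assumptions, while the 32 ms frame length and 50% overlap come from the text.

```python
def frame_starts(n_samples, sample_rate, frame_ms=32, overlap=0.5):
    """Return the start index of every full frame in a signal.

    Frames are frame_ms long and consecutive frames overlap by the
    given fraction, so the hop is 16 ms for the values in the text.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(frame_len * (1 - overlap))
    if n_samples < frame_len:
        return []
    return list(range(0, n_samples - frame_len + 1, hop))


SAMPLE_RATE = 8000  # assumed; the source does not state a sample rate
starts_30s = frame_starts(30 * SAMPLE_RATE, SAMPLE_RATE)
```

Under these assumptions a 30-second clip yields 1874 frames, one every 16 ms.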

Next, the conversion module 331 processes the Fourier transform data into wavelet transform data (step 303) and uses the wavelet transform data to obtain a peak group (step 304). The peak group consists of the several frequency peaks of the wavelet transform data at each instant of the period.

Referring to FIG. 5, this embodiment plots, from the frequency peaks at each instant of the period, a two-dimensional graph whose axes are time and frequency. The inherent multi-scale nature of wavelet analysis is then used to obtain the peak value corresponding to each frequency at different instants.
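
To illustrate what "several frequency peaks per instant" means, the sketch below finds the strongest spectral peaks of a single frame. It is a simplification under stated assumptions: it uses a naive discrete Fourier transform only and skips the wavelet step described in the text, and the names `dft_magnitudes` and `top_peaks` and the limit `max_peaks` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame (bins 0 .. N/2), naive O(N^2) DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def top_peaks(magnitudes, max_peaks=3):
    """Indices of the strongest local maxima, returned in ascending order."""
    peaks = [
        i for i in range(1, len(magnitudes) - 1)
        if magnitudes[i] > magnitudes[i - 1] and magnitudes[i] >= magnitudes[i + 1]
    ]
    peaks.sort(key=lambda i: magnitudes[i], reverse=True)
    return sorted(peaks[:max_peaks])


# A 32-sample test frame containing two tones, at bins 4 and 9.
frame = [math.sin(2 * math.pi * 4 * i / 32) + 0.5 * math.sin(2 * math.pi * 9 * i / 32)
         for i in range(32)]
peaks = top_peaks(dft_magnitudes(frame), max_peaks=2)
```

Repeating this for every frame gives one set of peak frequencies per instant, which is the raw material for the time-frequency trajectory map.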

Then, the conversion module 331 converts the two-dimensional trajectory map into a binary sparse matrix M (step 305). The clustering module 332 processes the binary sparse matrix M into density-based spatial clustering data (step 306), which is then processed into multi-resolution matrix data M1, M2 (step 307). Finally, the matrix data M, M1, M2 are output (step 308) and stored in the sample database 30.
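
The binarization in step 305 can be pictured with a small sketch. The helpers `binarize` and `nonzero_points` are illustrative names, and the layout (one row per time frame, one column per frequency bin) is an assumption chosen to match the "rows x columns" sizes quoted later in the text.

```python
def binarize(peaks_per_frame, n_bins):
    """Build a 0/1 matrix with one row per frame and one column per frequency bin.

    peaks_per_frame[t] lists the frequency-bin indices that are peaks at frame t.
    """
    matrix = []
    for peaks in peaks_per_frame:
        row = [0] * n_bins
        for f in peaks:
            row[f] = 1
        matrix.append(row)
    return matrix


def nonzero_points(matrix):
    """List the (time, frequency) coordinates of every 1 in the matrix."""
    return [(t, f) for t, row in enumerate(matrix) for f, v in enumerate(row) if v]


m = binarize([[1, 3], [3], []], n_bins=4)
points = nonzero_points(m)
```

Because only a few bins per frame are peaks, most entries are zero, which is why the matrix is sparse.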

Referring to FIGS. 6a and 6b, this embodiment uses a density-based clustering algorithm, defining the maximum radius of a neighborhood (Eps) and the minimum number of points in the neighborhood (MinPts). In this way the background noise shown in FIG. 6a is removed, yielding the clustered two-dimensional trajectory map of FIG. 6b.
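
The text does not name a specific algorithm, but Eps and MinPts are the two parameters of DBSCAN, so the sketch below assumes DBSCAN. It is a plain, unoptimized version for illustration; the function name `dbscan_denoise` and the sample values of `eps` and `min_pts` are assumptions.

```python
def dbscan_denoise(points, eps, min_pts):
    """Keep only points that DBSCAN assigns to a cluster; drop noise points.

    points is a list of (time, frequency) pairs. A point counts itself
    among its neighbors, as in the usual DBSCAN definition.
    """
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points)
                if (x - xi) ** 2 + (y - yi) ** 2 <= eps * eps]

    NOISE = -1
    labels = [None] * len(points)  # None = unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nb)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, keep expanding
    return [p for p, label in zip(points, labels) if label != NOISE]


trajectory = [(0, 10), (1, 10), (2, 11), (3, 11), (4, 12)]
stray = [(20, 50), (40, 3)]
kept = dbscan_denoise(trajectory + stray, eps=1.5, min_pts=3)
```

Points that lie along a continuous trajectory have enough close neighbors to survive, while isolated points are treated as background noise and discarded.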

Referring to FIGS. 7a and 7b, the inherent multi-scale nature of wavelet analysis allows different resolutions to be set, yielding sparse matrices of different levels. FIG. 7b has a lower level, and therefore a lower resolution, than FIG. 7a.
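
The idea of a lower-resolution copy of the matrix can be shown with a much simpler stand-in for the wavelet-based approach in the text: halving both dimensions by OR-ing each 2 x 2 block. This substitution is an assumption made purely for illustration, and `downsample_or` is a hypothetical name.

```python
def downsample_or(matrix):
    """Halve both dimensions of a 0/1 matrix.

    Each output cell is 1 if any cell of the corresponding 2 x 2 input
    block is 1. A trailing odd row or column is dropped.
    """
    rows, cols = len(matrix) // 2, len(matrix[0]) // 2
    return [
        [int(any(matrix[2 * r + dr][2 * c + dc] for dr in (0, 1) for dc in (0, 1)))
         for c in range(cols)]
        for r in range(rows)
    ]


m = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
m1 = downsample_or(m)
```

Applying the function again to `m1` would give a second, coarser level, mirroring the M, M1, M2 hierarchy.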

FIG. 8 shows the format of the predetermined trajectory data stored in the sample database 30, in which the binarized sparse matrix is stored as a matrix of integer values. FIG. 9 shows the binarized sparse matrix stored as an array of integer values.
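
Storing a binary matrix as integers amounts to bit packing. The sketch below packs one 256-bit row into eight 32-bit integers. The bit order (least significant bit first) and the names `pack_row` and `unpack_row` are assumptions; the source only states that integer values are stored.

```python
WORD_BITS = 32  # one integer element holds 32 binary cells


def pack_row(bits):
    """Pack a list of 0/1 values into 32-bit integers, least significant bit first."""
    words = []
    for start in range(0, len(bits), WORD_BITS):
        word = 0
        for offset, bit in enumerate(bits[start:start + WORD_BITS]):
            if bit:
                word |= 1 << offset
        words.append(word)
    return words


def unpack_row(words, n_bits):
    """Inverse of pack_row: recover the first n_bits 0/1 values."""
    return [(words[i // WORD_BITS] >> (i % WORD_BITS)) & 1 for i in range(n_bits)]


row = [0] * 256
row[0] = row[33] = row[255] = 1
packed = pack_row(row)
```

A 256-column row therefore needs 8 integer elements, and the packing is lossless because `unpack_row` restores the original row.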

The data can be reduced in size starting from the original binarized matrix M. For example, on the server side, every 16 milliseconds of a 30-second advertisement audio yields a group of 32-bit data. Treating each 32-bit group as one integer element, the original matrix for 30 seconds has 8 x 1874 integer elements. From it, two lower-level matrices are obtained, M1 (4 x 936 integer elements) and M2 (2 x 468 integer elements), with sizes of 15 KB and 3.7 KB respectively. Overall, only 18.7 KB of space in the sample database 30 is needed for storage.

This embodiment uses the Compute Unified Device Architecture (CUDA), so that the 24 GB of memory on four CUDA cards can hold about 1.2 million audio records. On the client side, for each 5-second recording of advertisement audio, the original matrix has 624 rows x 256 columns. Assuming each integer element is stored in 32 bits, the original matrix M has 624 x 8 integer elements, and the two lower-level matrices M1 (4 x 312 integer elements) and M2 (2 x 156 integer elements) are 5 KB and 1.28 KB respectively. Using M1 and M2 as the index for querying the server therefore requires a packet of only 6.3 KB, which reduces both the amount of data transmitted and the load on the server.
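
The storage figures can be checked with a few lines of arithmetic. The sketch assumes 4 bytes per 32-bit integer element and decimal units (1 KB = 1000 bytes); the helper `size_bytes` is a name introduced here.

```python
BYTES_PER_ELEMENT = 4  # one 32-bit integer element


def size_bytes(rows, cols):
    """Storage needed for a rows x cols matrix of 32-bit integer elements."""
    return rows * cols * BYTES_PER_ELEMENT


# Server side: 30 seconds of advertisement audio.
server_m1 = size_bytes(4, 936)
server_m2 = size_bytes(2, 468)
server_total = server_m1 + server_m2

# Client side: 5 seconds of recorded audio.
client_m1 = size_bytes(4, 312)
client_m2 = size_bytes(2, 156)
client_total = client_m1 + client_m2

# Memory needed to hold 1.2 million server-side records, in GB.
total_gb = 1_200_000 * server_total / 1e9
```

Under these assumptions the server-side record is 14,976 + 3,744 = 18,720 bytes, matching the quoted 15 KB, 3.7 KB, and 18.7 KB. The client-side index is 4,992 + 1,248 = 6,240 bytes, close to the quoted 5 KB, 1.28 KB, and 6.3 KB. Holding 1.2 million server-side records takes about 22.5 GB.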

Thus, when the audio management server 33 receives trajectory data produced by processing similar to steps 301 to 308, the comparison module 333 uses this to-be-compared feature data as an index and checks whether it is similar to the feature data pre-stored in the sample database 30. If it is, the comparison module outputs the corresponding information content.
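
The text does not say how similarity is measured. As one possible reading, the sketch below slides a short query trajectory along a stored reference trajectory and scores each time offset by the fraction of query points that coincide with reference points. The function `best_match`, the scoring rule, and the sample data are all assumptions for illustration.

```python
def best_match(query, reference):
    """Find the time offset at which the query best overlaps the reference.

    Both arguments are lists of (time, frequency) points. Returns
    (score, offset), where score is the fraction of query points that
    land on a reference point at that offset.
    """
    ref = set(reference)
    last_t = max(t for t, _ in ref)
    best_score, best_offset = 0.0, 0
    for offset in range(last_t + 1):
        hits = sum((t + offset, f) in ref for t, f in query)
        score = hits / len(query)
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset


ref_freqs = [3, 7, 7, 2, 9, 4, 4, 8, 1, 6, 5, 5, 0, 9, 2]
reference = list(enumerate(ref_freqs))
# The query is a 4-frame excerpt that starts at frame 5 of the reference.
query = list(enumerate(ref_freqs[5:9]))
score, offset = best_match(query, reference)
```

A server could then treat any stored record whose best score exceeds some chosen threshold as a match and return the product URL associated with it.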

Referring to FIGS. 10a to 10c, FIG. 10a shows trajectory data from advertisement audio captured and processed on the client side, FIG. 10b shows trajectory data of processed advertisement audio pre-stored on the server side, and FIG. 10c shows the comparison of the two, in which the green points mark where they match.

It should be added that, in other embodiments, the user terminal 1 may receive one or more sound sources within a short time, producing multiple trajectory bundles. One or more of these bundles can be compared against all stored trajectories to list the one or more advertisement sound sources that match, which is also an application of the present invention. In addition, advertisers can evaluate advertising effectiveness from the responses of multiple user terminals 1.

In summary, the effect of the present invention is that, by building the voiceprint database 30 and comparing trajectory data, noise can be removed to avoid misidentification, and the source of a captured audio segment can be matched quickly and accurately. The system has commercial value and is applicable to any electronic medium capable of transmitting audio, so the object of the present invention is indeed achieved.

The above is only the preferred embodiment of the present invention and does not limit the scope of its implementation; simple equivalent changes and modifications made according to the claims and the specification of the present invention all remain within the scope covered by this patent.

Claims

An advertisement transaction system having an audio management server and an account management server that communicate with a plurality of advertiser users, wherein: the account management server creates account data for the advertiser users and forwards each advertiser user's audio data to the audio management server; each advertiser user uploads to the audio management server a sound source to be played on an electronic medium; the audio management server processes an audio segment through an audio conversion procedure into voiceprint data, the voiceprint data corresponding to a trajectory map containing time and frequency, and processes the voiceprint data into predetermined trajectory data that retains the main trajectory features and removes background noise; and the audio management server, at the request of the account management server, creates a URL for the product information corresponding to the audio segment.

The advertisement transaction system of claim 1, wherein the audio management server comprises: a conversion module that processes the audio segment through an audio conversion procedure into voiceprint data, the voiceprint data corresponding to a trajectory map containing time and frequency; a clustering module that processes the voiceprint data into predetermined trajectory data that retains the main trajectory features and removes background noise; a comparison module that checks whether trajectory data to be compared is similar to the predetermined trajectory data and, if so, outputs corresponding information content; and a transaction module that carries out a corresponding transaction according to the information content.

The advertisement transaction system of claim 2, wherein the audio conversion procedure executed by the conversion module cuts the audio segment into small segments, applies a Fourier transform and a wavelet transform to the partially overlapping small segments to obtain a plurality of frequency peaks for each instant of the period, plots from the frequency peaks at each instant a two-dimensional trajectory map whose axes are time and frequency, and converts the two-dimensional trajectory map into a binarized sparse matrix.

The advertisement transaction system of claim 3, wherein the clustering module produces the predetermined trajectory data by clustering the sparse matrix, the clustering using a density-based clustering algorithm that removes background noise by defining a maximum radius of a neighborhood and a minimum number of points in the neighborhood.

The advertisement transaction system of any one of claims 2 to 4, wherein the conversion module further performs multi-resolution processing to produce trajectory data to be compared with a reduced data volume.

The advertisement transaction system of any one of claims 1 to 4, wherein each advertiser user uploads to the audio management server a sound source to be played on a store sound system, a telephone, a television, a radio broadcast, or a website.

The advertisement transaction system of any one of claims 1 to 4, wherein the account management server further creates account data for a plurality of user terminals and forwards each user terminal's audio data to the audio management server.

The advertisement transaction system of claim 7, further comprising a customer service server operating together with a payment server, wherein the customer service server, on request from each user terminal, sends a content message carrying a product URL to that user terminal, and each user terminal sends a purchase request message for the product to the payment server through the product URL.