TWI783084B

TWI783084B - Method and system of weight-based usage model for dynamic speech recognition channel selection

Info

Publication number: TWI783084B
Application number: TW107142335A
Authority: TW
Inventors: 詹佳燕; 魏慶麟; 林浩廷
Original assignee: 中華電信股份有限公司
Priority date: 2018-11-27
Filing date: 2018-11-27
Publication date: 2022-11-11
Also published as: TW202020858A

Abstract

The disclosure provides a method and system of weight-based usage model for dynamic speech recognition channel selection. The system includes a speech recognition channel setting module, a speech recognition channel database, a speech recognition channel module, a speech recognition channel weighting calculating module, a speech recognition channel capacity calculating module, a speech recognition channel decision module, and a speech recognition channel usage calculating module.

Description

Method and system for applying weight usage model to dynamically select speech recognition channel

The present invention relates to a speech recognition method and system, and in particular to a method and system that apply a weight usage model to dynamically select a speech recognition channel.

In the era of artificial intelligence, speech recognition has become an indispensable recognition technology in daily life. Speech recognition is domain-specific, so no one-size-fits-all solution exists, and its high training cost puts it beyond the budget of many enterprise and individual users.

A recent related patent, "Speech recognition system using multiple grammar networks" (Republic of China Patent No. I394926, application date: 2000/06/21), proposes a multilingual speech recognition method with the following steps: receiving a sound frame; providing a plurality of speech models of different languages; feeding the sound frame into the speech models to generate a plurality of speech state scores corresponding to a plurality of speech states; selecting a plurality of the highest state scores from those speech state scores; and generating a correction value from those state scores. The method eliminates multilingual bias dynamically and frame by frame.

A recent related patent, "Multilingual speech recognition device and method" (Republic of China Patent No. I579829, 2017/04/21), proposes a multilingual speech recognition method that may include the following steps: receiving a sound frame; providing a plurality of speech models of different languages; feeding the sound frame into the speech models to generate a plurality of speech state scores corresponding to a plurality of speech states; selecting a plurality of the highest state scores from those speech state scores; and generating a correction value from those state scores. The method eliminates multilingual bias dynamically and frame by frame.

The above patents, which are close to the present invention, still have the following shortcomings: (1) they do not provide dynamic switching based on the user's speech recognition needs; and (2) they do not dynamically update speech recognition channel usage, which would improve the accuracy of the next calculation of the user's speech recognition channel selection.

A recent related US patent, No. US9569174 B2, titled "Methods and systems for managing speech recognition in a multi-speech system environment" (application date: 2017/02/17), proposes a method and system for managing speech in a multi-speech system environment, in which a speech recognition system is selected according to the user's voice, gestures, and gaze. The patent describes modules that record the user's actions, voice, and gestures, and a processor that determines which speech recognition system to use from those records.

The above patent, which is close to the present invention, still has the following shortcomings: (1) it does not consider speech recognition channel weights or the user's speech recognition channel usage; and (2) it does not consider speech recognition channel capacity, so it cannot dynamically allocate speech recognition channel resources according to channel capacity.

It can thus be seen that the conventional approaches above still have many shortcomings and are in need of improvement. In view of these shortcomings, the inventors sought to improve and innovate and, after years of dedicated research, successfully developed the present method and system for applying a weight usage model to dynamically select a speech recognition channel.

The present invention provides a one-size-fits-all speech recognition channel selection solution that achieves the above object, namely a method and system for applying a weight usage model to dynamically select a speech recognition channel. The system includes: (1) a speech recognition channel setting module, for setting and retrieving channels and channel weights; (2) a speech recognition channel module, for receiving user speech data and channel strategy data, switching channels, and returning recognition result data; (3) a speech recognition channel weight calculation module, for calculating the weight values of the speech recognition channels; (4) a speech recognition channel capacity calculation module, for dynamically calculating the speech recognition channel capacity; (5) a speech recognition channel decision module, for deciding and ranking the user's speech recognition channel choices; and (6) a speech recognition channel usage calculation module, for dynamically updating channel weights according to the user's channel usage. Together, these modules improve the dynamic allocation of speech recognition channel resources, and the precision of that allocation, while taking user service into account.

To make the above features and advantages of the present invention clearer, embodiments are described in detail below with reference to the accompanying drawings.

In the era of artificial intelligence, speech recognition has become an indispensable recognition technology in daily life. Speech recognition is domain-specific, so no one-size-fits-all solution exists, and its high training cost puts it beyond the budget of many enterprise and individual users. The present method dynamically switches speech recognition channels according to the user's speech recognition channel settings, the speech recognition channel capacity, and the user's speech recognition channel usage.

Please refer to FIG. 1, which is the architecture diagram of the method and system of the present invention for applying a weight usage model to dynamically select a speech recognition channel. The method improves the allocation of speech recognition channel resources. The system includes: a speech recognition channel setting module 1, which sets the speech recognition channels and channel weights and retrieves the channel setting data; a speech recognition channel database 2, which stores the speech recognition channel data; a speech recognition channel module 3, which receives user speech data and channel strategy data, switches channels, and returns recognition result data; a speech recognition channel weight calculation module 4, which calculates the channel weights and transmits the calculated channel weight values; a speech recognition channel capacity calculation module 5, which calculates channel capacity and passes on channel data; a speech recognition channel decision module 6, which calculates the channel decision results and ranks them; and a speech recognition channel usage calculation module 7, which calculates the user's channel usage and updates the channel weight values.

Please refer to FIG. 2, which is a component diagram showing how the method and system of the present invention build the weight usage model.

First, the channel setting component 11 of the speech recognition channel setting module 1 sets the speech recognition channel strategies (Method: M_j, where j = 1, 2, …, N is the speech recognition channel code) for each user (User: U_i, where i = 1, 2, …, M is the user code). Table 1 records whether each user has selected each speech recognition channel strategy: 0 means the strategy is not selected and 1 means it is selected. The channel setting component 11 writes the speech recognition channel setting data into the speech recognition channel database 2, with contents as shown in Table 1:

Table 1

Next, the channel weight setting component 12 of the speech recognition channel setting module 1 sets the speech recognition channel strategy weights (Weight: WM_j, where j = 1, 2, …, N is the speech recognition channel code) for each user (User: U_i, where i = 1, 2, …, M is the user code). The channel weight setting component 12 writes the user's speech recognition channel weight settings (WM_{1,UM} denotes the weight that the M-th user assigns to speech recognition channel strategy 1) into the speech recognition channel database 2, with contents as shown in Table 2:

Table 2
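As a minimal sketch of how the settings of Tables 1 and 2 might be held in memory, the Python below keeps a 0/1 selection flag and a weight for each (user, strategy) pair. The class name `ChannelSettings`, its fields and methods, and the sample weight are illustrative assumptions; the patent does not specify a storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelSettings:
    """Hypothetical in-memory stand-in for the speech recognition channel database 2."""
    # selected[(user, strategy)] is 1 if the user chose the strategy, else 0 (Table 1).
    selected: dict = field(default_factory=dict)
    # weight[(user, strategy)] is the user-assigned strategy weight (Table 2).
    weight: dict = field(default_factory=dict)

    def set_channel(self, user: int, strategy: int, chosen: bool) -> None:
        self.selected[(user, strategy)] = 1 if chosen else 0

    def set_weight(self, user: int, strategy: int, value: float) -> None:
        if value < 0:
            raise ValueError("weight must be non-negative")
        self.weight[(user, strategy)] = value

    def settings_for(self, user: int) -> dict:
        """Return {strategy: (selected_flag, weight)} for one user."""
        return {
            s: (flag, self.weight.get((u, s), 0.0))
            for (u, s), flag in self.selected.items()
            if u == user
        }


settings = ChannelSettings()
settings.set_channel(user=1, strategy=1, chosen=True)
settings.set_channel(user=1, strategy=2, chosen=False)
settings.set_weight(user=1, strategy=1, value=0.6)  # made-up value, not from the patent
user1 = settings.settings_for(1)
```

Keying both tables by the same (user, strategy) pair keeps the selection flag and the weight aligned when they are later read together for one user.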

When speech data is sent to the system, the channel strategy component 31 of the speech recognition channel module 3 receives it and extracts the information of the user who sent it. The channel setting access component 13 of the speech recognition channel setting module 1 then retrieves the speech recognition channel setting data and the speech recognition channel weight setting data from the speech recognition channel database 2 and passes them to the weight data transmission component 41 of the speech recognition channel weight calculation module 4. The weight calculation component 42 calculates the user's speech recognition channel weights using the following formulas:

(1)

(2)

Formula (1) is the calculation for user 1 using speech recognition channel strategy 1, where W'_{M1,U1} is the weight calculation result for user 1 and speech recognition channel strategy 1. Formula (2) is the calculation for user 1 using speech recognition channel strategy N, where W'_{MN,U1} is the weight calculation result for user 1 and speech recognition channel strategy N.

The weight data transmission component 41 transmits the weight calculation result of each speech recognition channel to the channel data transmission component 51 of the speech recognition channel capacity calculation module 5. The capacity calculation component 52 calculates the speech recognition channel capacity in real time using the following formulas:

(3) …

(4)

Formula (3) gives the channel capacity of speech recognition channel 1, where C_{CH1} is the total number of channels of speech recognition channel 1 and C_{CH1,R} is the number of its channels that remain available. Formula (4) gives the channel capacity of speech recognition channel N, where C_{CHN} is the total number of channels of speech recognition channel N and C_{CHN,R} is the number of its channels that remain available.
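The capacity formulas themselves did not survive in this text. The worked example later in the description (4 channels with 1 remaining giving 0.25, 4 channels with 3 remaining giving 0.75, and 2 channels with 2 remaining giving 1) is consistent with capacity being the fraction of channels still available. Under that assumption, and using Cap_{CHj} as a placeholder name that is not taken from the source, formulas (3) and (4) would plausibly read:

```latex
Cap_{CH1} = \frac{C_{CH1,R}}{C_{CH1}}, \qquad \ldots, \qquad Cap_{CHN} = \frac{C_{CHN,R}}{C_{CHN}}
```

This is a reconstruction from the example values only, not the patent's printed formula.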

The channel data transmission component 51 obtains real-time channel usage data from the channel usage update component 71 of the speech recognition channel usage calculation module 7, and the channel usage update component 71 obtains the channel usage data from the speech recognition channel database 2, as shown in Table 3:

Table 3

The channel usage update component 71 transmits the data in Table 3 to the channel usage weight calculation component 72, which calculates the channel usage weights using the following formulas:

(5) …

(6)

Formula (5) gives the usage weight of user 1 for speech recognition channel strategy 1, where P_{M1,U1} is the probability that user 1 uses speech recognition channel strategy 1. Formula (6) gives the usage weight of user 1 for speech recognition channel strategy N, where P_{MN,U1} is the probability that user 1 uses speech recognition channel strategy N.

The channel data transmission component 51 transmits the speech recognition channel weight results, the real-time speech recognition channel capacity results, and the channel usage weight results to the decision sorting component 61 of the speech recognition channel decision module 6. The decision sorting component 61 first passes the data to the decision calculation component 62, which calculates the decision results using the following formulas:

(7)

(8)

Formula (7) gives the decision result of speech recognition channel 1 for user 1, and formula (8) gives the decision result of speech recognition channel N for user 1. The decision sorting component 61 sorts the decision results of channels 1 through N and outputs them as an array in descending order, so that the highest value comes first. D' denotes the decision result value array, where CM is the number of elements in the decision result array; the lowest-ranked element is represented by 1 and the highest-ranked by CM, i.e. D' = {CM, CM-1, …, 1}. A decision result of 0 is not placed in the result array. The decision sorting component 61 transmits the result array to the channel switching component 32 of the speech recognition channel module 3, which switches channels according to the result array and outputs the recognition result data.
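The sorting and ranking rule described above can be sketched in Python as follows. The function name `rank_decisions` and the dictionary input are illustrative assumptions. The sample values 0.0975 and 0.39 come from the worked example later in the description, but the mapping of those values to strategies 1 and 2 (and of the zero result to strategy 3) is assumed here for illustration.

```python
def rank_decisions(decisions: dict) -> tuple:
    """Sort non-zero decision results in descending order and assign ranks.

    decisions maps a channel strategy id to its decision value.
    Returns (ordered_ids, ordered_values, ranks), where ranks follows the
    D' = {CM, CM-1, ..., 1} convention: the top entry gets CM, the last gets 1.
    """
    # Decision results equal to 0 are left out of the result array.
    nonzero = [(cid, val) for cid, val in decisions.items() if val != 0]
    nonzero.sort(key=lambda item: item[1], reverse=True)
    cm = len(nonzero)
    ordered_ids = [cid for cid, _ in nonzero]
    ordered_values = [val for _, val in nonzero]
    ranks = list(range(cm, 0, -1))
    return ordered_ids, ordered_values, ranks


# Strategy-to-value assignment below is an assumption made for illustration.
ids, values, ranks = rank_decisions({1: 0.0975, 2: 0.39, 3: 0.0})
print(ids, values, ranks)  # [2, 1] [0.39, 0.0975] [2, 1]
```

Returning the ordered ids alongside the ranks lets a caller both switch to the top-ranked channel and feed the rank array into the usage update.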

The decision sorting component 61 transmits the decision result value array to the channel usage update component 71 of the speech recognition channel usage calculation module 7. The channel usage update component 71 retrieves the channel usage records (Table 3) from the speech recognition channel database 2 and passes them to the channel usage weight calculation component 72, which updates the usage using the following formulas:

(9) …

(10)

Formula (9) gives P'_{M1,U1}, the updated usage value of speech recognition channel 1 for user 1, and formula (10) gives P'_{MN,U1}, the updated usage value of speech recognition channel N for user 1. The channel usage weight calculation component 72 updates the data of Table 3 to the values shown in Table 4, and the channel usage update component 71 stores the results of Table 4 in the speech recognition channel database 2.

Table 4

Please refer to FIG. 3, which is the system flowchart of the method and system of the present invention for applying a weight usage model to dynamically select a speech recognition channel. In step 101, the channel strategy component 31 instructs the channel setting access component 13 to retrieve the speech recognition channel setting data from the speech recognition channel database 2. Step 102 determines whether speech recognition channel setting data already exists. If not, step 103 is executed first, in which the channel setting component 11 sets the speech recognition channel data; in step 104 the channel weight setting component 12 sets the speech recognition channel weight data; and in step 105 the speech recognition channel data and weight data are written into the speech recognition channel database 2. If so, the channel strategy component 31 transmits the speech recognition channel data to the weight data transmission component 41, which passes it to the weight calculation component 42 for step 106, calculating the speech recognition channel weights.

The calculation result is transmitted through the weight data transmission component 41 to the channel data transmission component 51 and then to the capacity calculation component 52, which executes step 107, calculating the speech recognition channel capacity. The channel data transmission component 51 transmits the speech recognition channel usage data to the channel usage update component 71 and then to the channel usage weight calculation component 72, which executes step 108, calculating the speech recognition channel usage. The channel data transmission component 51 then transmits the weight results (formulas (1) and (2)), the capacity results (formulas (3) and (4)), and the usage results (formulas (5) and (6)) to the decision sorting component 61, which passes them to the decision calculation component 62 for step 109, calculating the speech recognition channel decision. The decision results (formulas (7) and (8)) are sorted by the decision sorting component 61, which sends the decision result values to the channel usage update component 71 for step 110, updating the speech recognition channel usage (formulas (9) and (10)). The decision sorting component 61 also sends the sorted result to the channel switching component 32, which executes step 111, switching the speech recognition channel, and finally the speech recognition result data is output.
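The control flow of steps 101 to 111 can be sketched as a single Python function. Everything named here (`handle_request`, the `steps` dictionary of callables, the stub values) is hypothetical scaffolding. The sketch assumes that the flow continues to step 106 after steps 103 to 105, and the `decide` stub uses a simple product only as a placeholder, since the patent's formulas (7) and (8) are not reproduced in this text.

```python
def handle_request(user, db, steps):
    """Control-flow sketch of steps 101-111; `steps` holds hypothetical callables."""
    settings = db.get(user)                                  # step 101: fetch settings
    if settings is None:                                     # step 102: settings exist?
        settings = steps["configure"](user)                  # steps 103-104: set channels, weights
        db[user] = settings                                  # step 105: write to database
    weights = steps["weights"](settings)                     # step 106: channel weights
    capacity = steps["capacity"]()                           # step 107: channel capacity
    usage = steps["usage"](user)                             # step 108: channel usage
    decisions = steps["decide"](weights, capacity, usage)    # step 109: decision values
    ranked = sorted(
        (c for c, v in decisions.items() if v != 0),
        key=lambda c: decisions[c],
        reverse=True,
    )
    steps["update_usage"](user, ranked)                      # step 110: update usage
    return steps["switch"](ranked)                           # step 111: switch channel


stub_steps = {
    "configure": lambda user: {1: 1.0},
    "weights": lambda settings: dict(settings),
    "capacity": lambda: {1: 0.5},
    "usage": lambda user: {1: 1.0},
    # Placeholder combination only; not the patent's formulas (7)-(8).
    "decide": lambda w, c, u: {k: w[k] * c.get(k, 0) * u.get(k, 0) for k in w},
    "update_usage": lambda user, ranked: None,
    "switch": lambda ranked: ranked[0] if ranked else None,
}
db = {}
chosen = handle_request("user-1", db, stub_steps)
```

Passing the per-step behavior in as callables keeps the sketch about ordering and branching, which is what the flowchart specifies, and leaves the calculations themselves open.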

Take 3 users and 3 speech recognition channel strategies as an example, where the total number of channels for each strategy is: strategy 1 = 4 channels, strategy 2 = 4 channels, and strategy 3 = 2 channels. First, the speech recognition channel data must be set: the user sets it through the channel setting component 11, as shown in Table 5, and the settings are written into the speech recognition channel database 2:

Table 5

Next, the user sets the speech recognition channel strategy weights through the channel weight setting component 12, as shown in Table 6, and the settings are written into the speech recognition channel database 2:

Table 6

When user 1 sends speech data to the system, the channel strategy component 31 of the speech recognition channel module 3 receives it, retrieves user 1's speech recognition channel setting data and speech recognition channel weight setting data, and passes them to the weight data transmission component 41 of the speech recognition channel weight calculation module 4. The weight calculation component 42 calculates user 1's speech recognition channel weights, with the following results:

The weight data transmission component 41 transmits user 1's speech recognition channel weight calculation results to the channel data transmission component 51 of the speech recognition channel capacity calculation module 5. The capacity calculation component 52 calculates the speech recognition channel capacity in real time. This example assumes that strategy 1 has 4 channels in total with 1 channel unused, strategy 2 has 4 channels in total with 3 channels unused, and strategy 3 has 2 channels in total with none in use. The real-time speech recognition channel capacities for strategies 1, 2, and 3 are, in that order:

= 0.25

= 0.75

= 1
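The three capacity values can be reproduced with a short Python check. It assumes that capacity is the number of remaining channels divided by the total number of channels, which matches the example figures but is not confirmed by a surviving formula; the function name `channel_capacity` is hypothetical.

```python
def channel_capacity(total: int, remaining: int) -> float:
    """Fraction of a strategy's channels that are still available (assumed form)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= remaining <= total:
        raise ValueError("remaining must be between 0 and total")
    return remaining / total


# (total channels, remaining channels) per strategy, as stated in the example.
example = {1: (4, 1), 2: (4, 3), 3: (2, 2)}
capacities = {s: channel_capacity(t, r) for s, (t, r) in example.items()}
print(capacities)  # {1: 0.25, 2: 0.75, 3: 1.0}
```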

The channel data transmission component 51 obtains real-time channel usage data from the channel usage update component 71 of the speech recognition channel usage calculation module 7, and the channel usage update component 71 obtains the channel usage data from the speech recognition channel database 2, as shown in Table 7:

Table 7

The channel usage update component 71 transmits the data in Table 3 to the channel usage weight calculation component 72 to calculate the channel usage weight, and the calculation formula is as follows:

The channel data transmission component 51 transmits user 1's speech recognition channel weight results, real-time speech recognition channel capacity results, and channel usage weight results to the decision sorting component 61 of the speech recognition channel decision module 6. The decision sorting component 61 first passes the data to the decision calculation component 62 to calculate the decision results, which are as follows:

= 0.0975

= 0.39

The decision sorting component 61 sorts the decision results of the three speech recognition channel strategies and outputs them as an array in descending order, {0.39, 0.0975}; a decision result of 0 is not listed in the result array. The decision result value array is D' = {2, 1}. The decision sorting component 61 transmits the sorted result to the channel switching component 32 of the speech recognition channel module 3, which switches channels according to the sorted result and outputs the recognition result data.

The decision sorting component 61 transmits the decision result value array to the channel usage update component 71 of the speech recognition channel usage calculation module 7. The channel usage update component 71 retrieves the channel usage records (Table 7) from the speech recognition channel database and passes them to the channel usage weight calculation component 72 to update the usage. The decision result value array is {2, 1}, which contains two non-zero options, so CM = 2. The updated values are as follows:

= 3.33

= 2

= 1.67

The updated channel usage data above is written into the speech recognition channel database 2 by the channel usage update component 71. The updated data is shown in Table 8:

Table 8

Features and Effects

本發明所提供之一種權重用量模型應用於動態選擇語音辨識通道的方法與系統，與其他習用技術相互比較時，更具有下列之優點：A weight usage model provided by the present invention is applied to the method and system for dynamically selecting speech recognition channels. When compared with other conventional technologies, it has the following advantages:

1. 本發明之方法係依據使用者通道設定資料、語音辨識通道容量及通道使用量權重決策語音辨識通道，提升語音辨識通道切換的精確度，於使用者通道服務滿意度考量下有效提升語音辨識通道資源配置。1. The method of the present invention is based on the user channel setting data, voice recognition channel capacity and channel usage weight to determine the voice recognition channel, improve the accuracy of voice recognition channel switching, and effectively improve voice recognition under the consideration of user channel service satisfaction Channel resource configuration.

2. 本發明之語音辨識通道設定模組用以設定使用者選用的語音辨識通道策略，使用者可依據需求設定多個語音辨識通道，提供使用者多元選擇方案。2. The voice recognition channel setting module of the present invention is used to set the voice recognition channel strategy selected by the user. The user can set multiple voice recognition channels according to the needs, providing users with multiple options.

3. 本發明之語音辨識通道模組用以依據語音辨識通道決策結果動態切換通道。3. The voice recognition channel module of the present invention is used to dynamically switch channels according to the decision result of the voice recognition channel.

4. 本發明之語音辨識通道權重運算模組用以依據使用者的通道權重設定值計算語音辨識通道的權重。4. The speech recognition channel weight calculation module of the present invention is used to calculate the weight of the speech recognition channel according to the user's channel weight setting value.

5. 本發明之語音辨識通道容量運算模組用以動態計算語音辨識通道容量，提升語音辨識通道容量，可提升語音辨識通道資源配置。5. The voice recognition channel capacity calculation module of the present invention is used to dynamically calculate the voice recognition channel capacity, increase the voice recognition channel capacity, and improve the resource allocation of the voice recognition channel.

6. 本發明之語音辨識通道決策模組，依據語音辨識通道權重運算、語音辨識通道容量運算模組及語音辨識通道使用量運算模組進行語音辨識通道決策計算與排序，於考量使用者需求與通道容量下，提升語音辨識通道動態資源分配。6. The voice recognition channel decision-making module of the present invention performs voice recognition channel decision-making calculation and sorting according to the voice recognition channel weight calculation, voice recognition channel capacity calculation module and voice recognition channel usage calculation module, considering user needs and Under the channel capacity, the dynamic resource allocation of the speech recognition channel is improved.

7. 本發明之語音辨識通道使用量運算模組，係用以計算使用者通道使用量權重，提供動態更新使用者通道使用量權重，提升使用者下一次使用語音通道辨識的精確度。7. The speech recognition channel usage calculation module of the present invention is used to calculate the weight of the user channel usage, provide dynamic update of the user channel usage weight, and improve the accuracy of the user's voice channel recognition next time.

8. 本發明係有關於一種權重用量模型應用於動態選擇語音辨識通道的方法與系統，特別是應用語音辨識通道權重運算模組、語音辨識通道容量運算模組、語音辨識通道使用量運算模組及語音辨識通道決策模組，利用此方法依據使用者語音辨識通道設定與語音辨識通道容量及語音辨識通道使用量權重提供動態語音辨識通道選擇，提升切換語音辨識通道的精確度並有效提升語音辨識通道資源調配。8. The present invention relates to a method and system for applying a weight usage model to dynamically select a voice recognition channel, especially the application of a voice recognition channel weight calculation module, a voice recognition channel capacity calculation module, and a voice recognition channel usage calculation module And voice recognition channel decision-making module, using this method to provide dynamic voice recognition channel selection according to user voice recognition channel settings, voice recognition channel capacity and voice recognition channel usage weight, improve the accuracy of switching voice recognition channels and effectively improve voice recognition Channel resource allocation.

本發明可獲致的功效如下：（1）經語音辨識通道權重運算模組、語音辨識通道容量運算模組及語音辨識通道使用量運算模組，依據使用者語音辨識通道權重及語音辨識通道容量動態計算權重，於考量使用者通道服務滿意度，動態配置語音辨識通道；（2）經語音辨識通道決策模組，依據決策運算結果切換使用者的語音辨識通道，有效提升語音辨識通道資源調配（3）經語音辨識通道使用量運算模組動態更新通道使用量權重，提升切換語音辨識通道的精確度。The effects obtained by the present invention are as follows: (1) Through the voice recognition channel weight calculation module, the voice recognition channel capacity calculation module and the voice recognition channel usage calculation module, according to the user's voice recognition channel weight and voice recognition channel capacity dynamic Calculate the weight to dynamically configure the speech recognition channel in consideration of the service satisfaction of the user channel; (2) The speech recognition channel decision module switches the user's speech recognition channel according to the decision calculation result, effectively improving the resource allocation of the speech recognition channel (3 ) through the voice recognition channel usage calculation module to dynamically update the channel usage weight to improve the accuracy of switching voice recognition channels.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. A person having ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

1: Speech recognition channel setting module

11: Channel setting element

12: Channel weight setting element

13: Channel setting access element

2: Speech recognition channel database

3: Speech recognition channel module

31: Channel strategy element

32: Channel switching element

4: Speech recognition channel weight calculation module

41: Weight data transmission element

42: Weight calculation element

5: Speech recognition channel capacity calculation module

51: Channel data transmission element

52: Capacity calculation element

6: Speech recognition channel decision module

61: Decision sorting element

62: Decision calculation element

7: Speech recognition channel usage calculation module

71: Channel usage update element

72: Channel usage weight calculation element

101~111: Flow chart of the method and system for dynamically selecting a speech recognition channel using a weight usage model

FIG. 1 is a system architecture diagram of the method and system for dynamically selecting a speech recognition channel using a weight usage model. FIG. 2 is an element diagram of the method and system for dynamically selecting a speech recognition channel using a weight usage model. FIG. 3 is a flow chart of the method and system for dynamically selecting a speech recognition channel using a weight usage model.

1: Speech recognition channel setting module

2: Speech recognition channel database

3: Speech recognition channel module

4: Speech recognition channel weight calculation module

5: Speech recognition channel capacity calculation module

6: Speech recognition channel decision module

7: Speech recognition channel usage calculation module

Claims

A system for dynamically selecting a speech recognition channel using a weight usage model, comprising:
a speech recognition channel setting module, configured to set a user's selection strategy for a plurality of speech recognition channels and a strategy weight for each speech recognition channel, wherein M_{b,Ua} is set to 1 if the a-th user selects the b-th speech recognition channel strategy and to 0 otherwise, and the strategy weight set by the a-th user for the b-th speech recognition channel strategy is denoted W_{Mb,Ua};
a speech recognition channel database, configured to store the selection strategy and the strategy weight of each speech recognition channel;
a speech recognition channel module, configured to receive the user's speech data and to receive the user's selection strategy and strategy weight for each speech recognition channel, as retrieved by the speech recognition channel setting module from the speech recognition channel database;
a speech recognition channel weight calculation module, configured to calculate a channel weight for each speech recognition channel from the selection strategy and the strategy weight of that channel, and to transmit the channel weight of each speech recognition channel, wherein the channel weight corresponding to the a-th user and the b-th speech recognition channel strategy is W'_{Mb,Ua}, and W'_{Mb,Ua} = [formula image not reproduced in the source text], where N is the number of the speech recognition channels;
a speech recognition channel capacity calculation module, configured to receive the channel weight of each speech recognition channel and to calculate the channel capacity of each speech recognition channel accordingly, wherein the channel capacity of the k-th speech recognition channel is expressed as [formula image not reproduced in the source text], where C_{CHk} is the total number of available channels of the k-th speech recognition channel and C_{CHk,R} is the number of remaining available channels of the k-th speech recognition channel;
a speech recognition channel usage calculation module, configured to obtain the real-time usage of each speech recognition channel and to calculate the channel usage weight of each speech recognition channel accordingly, wherein the real-time usage corresponding to the a-th user and the k-th speech recognition channel is denoted P_{Mk,Ua}, and the channel usage weight corresponding to the a-th user and the k-th speech recognition channel strategy is expressed as [formula image not reproduced in the source text];
a speech recognition channel decision module, configured to calculate a decision result for each speech recognition channel from the channel capacity, the channel weight, and the channel usage weight of that channel, and to sort the speech recognition channels accordingly to generate a result array, wherein the speech recognition channel decision module provides the result array to the speech recognition channel usage calculation module and the speech recognition channel module, the decision result of the a-th user for the k-th speech recognition channel is expressed as D_{Ua,CHk} = W'_{Mk,Ua} × [remainder of formula not reproduced in the source text], and the result array corresponding to the a-th user is expressed as {D'_{U1,CH1}, ..., D'_{U1,CHN}}, wherein every element of the result array is nonzero and the number of elements is CM;
wherein the speech recognition channel usage calculation module calculates a usage update value for each speech recognition channel, the usage update value of the k-th speech recognition channel for the a-th user being expressed as [formula image not reproduced in the source text]; and
wherein the speech recognition channel module switches among the speech recognition channels according to the result array and returns the speech recognition result.

The system of claim 1, wherein the speech recognition channel setting module comprises: a channel setting element, configured to let the user set the selection strategy for each of the speech recognition channels and to write it into the speech recognition channel database; a channel weight setting element, configured to let the user set the strategy weight of each speech recognition channel and to write it into the speech recognition channel database; and a channel setting access element, configured to retrieve and transmit the selection strategy and the strategy weight of each speech recognition channel.

The system of claim 1, wherein the speech recognition channel module comprises: a channel strategy element, configured to receive the user's speech data and to receive, from the speech recognition channel setting module, the user's selection strategy and strategy weight for each of the speech recognition channels; and a channel switching element, configured to receive the result array from the speech recognition channel decision module and to switch among the speech recognition channels accordingly.

The system of claim 1, wherein the speech recognition channel weight calculation module comprises: a weight calculation element, configured to calculate the channel weight of each of the speech recognition channels; and a weight data transmission element, configured to transmit the channel weight of each speech recognition channel.

The system of claim 1, wherein the speech recognition channel capacity calculation module comprises: a channel data transmission element, responsible for receiving the channel weight of each speech recognition channel and forwarding it to the speech recognition channel usage calculation module and the speech recognition channel decision module; and a capacity calculation element, configured to dynamically calculate the channel capacity of each speech recognition channel.

The system of claim 1, wherein the speech recognition channel decision module comprises: a decision calculation element, configured to calculate the decision result of each speech recognition channel from the channel capacity, the channel weight, and the channel usage weight of that channel; and a decision sorting element, configured to sort the decision results of the speech recognition channels to generate the result array and to send the result array to the speech recognition channel module and the speech recognition channel usage calculation module.

The system of claim 1, wherein the speech recognition channel usage calculation module comprises: a channel usage weight calculation element, configured to calculate the channel usage weight of each speech recognition channel; and a channel usage update element, configured to dynamically receive the channel usage weight of each speech recognition channel from the channel usage weight calculation element and to update the speech recognition channel database accordingly.

A method for dynamically selecting a speech recognition channel using a weight usage model, comprising the following steps:
A. a channel setting element sets a user's selection strategy for a plurality of speech recognition channels;
B. a channel weight setting element sets the strategy weight of each speech recognition channel, wherein M_{b,Ua} is set to 1 if the a-th user selects the b-th speech recognition channel strategy and to 0 otherwise, and the strategy weight set by the a-th user for the b-th speech recognition channel strategy is denoted W_{Mb,Ua};
C. the user's speech data is transmitted to a channel strategy element;
D. the channel strategy element calls a channel setting access element to retrieve the selection strategy and the strategy weight of each speech recognition channel;
E. the channel setting access element obtains the selection strategy and the strategy weight of each speech recognition channel from a speech recognition channel database;
F. the selection strategy and the strategy weight of each speech recognition channel are transmitted to a weight data transmission element, and a weight calculation element then calculates the channel weight of each speech recognition channel, wherein the channel weight corresponding to the a-th user and the b-th speech recognition channel strategy is W'_{Mb,Ua}, and W'_{Mb,Ua} = [formula image not reproduced in the source text], where N is the number of the speech recognition channels;
G. the channel weight of each speech recognition channel is transmitted to a channel data transmission element;
H. the channel data transmission element calls a capacity calculation element to dynamically calculate, in real time, the channel capacity of each speech recognition channel, wherein the channel capacity of the k-th speech recognition channel is expressed as [formula image not reproduced in the source text], where C_{CHk} is the total number of available channels of the k-th speech recognition channel and C_{CHk,R} is the number of remaining available channels of the k-th speech recognition channel;
I. the channel data transmission element calls a channel usage update element to obtain the real-time usage of each speech recognition channel and to calculate the channel usage weight of each speech recognition channel, wherein the real-time usage corresponding to the a-th user and the k-th speech recognition channel is denoted P_{Mk,Ua}, and the channel usage weight corresponding to the a-th user and the k-th speech recognition channel strategy is expressed as [formula image not reproduced in the source text];
J. the channel data transmission element transmits the channel capacity, the channel weight, and the channel usage weight of each speech recognition channel to a decision sorting element;
K. the decision result of each speech recognition channel is calculated, wherein the decision result of the a-th user for the k-th speech recognition channel is expressed as [formula image not reproduced in the source text], and the result array corresponding to the a-th user is expressed as {D'_{U1,CH1}, ..., D'_{U1,CHN}}, wherein every element of the result array is nonzero and the number of elements is CM;
L. the decision sorting element transmits the decision result of each speech recognition channel to the channel usage update element, which updates the channel usage weight of each speech recognition channel according to its real-time usage and writes the update to the speech recognition channel database, wherein the usage update value of the k-th speech recognition channel for the a-th user is expressed as [formula image not reproduced in the source text];
M. the decision sorting element transmits the decision result of each speech recognition channel to a channel switching element; and
N. the channel switching element switches among the speech recognition channels and returns the speech recognition result.
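Steps M and N can be pictured with a short Python sketch. It assumes, as one possible reading, that switching means trying the channels in ranked order until one returns a result, and it uses a simple counter as a stand-in for the usage update, whose formula is not given in this text. The recognizer callables, the function name, and the sample scores are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A recognizer takes raw audio and returns text, or None if it fails.
Recognizer = Callable[[bytes], Optional[str]]


def recognize_with_switching(
    audio: bytes,
    result_array: List[Tuple[str, float]],
    recognizers: Dict[str, Recognizer],
    usage: Dict[str, int],
) -> Optional[Tuple[str, str]]:
    """Try channels in ranked order; return (channel, text) for the first success."""
    for channel, _score in result_array:
        recognizer = recognizers.get(channel)
        if recognizer is None:
            continue  # ranked channel has no recognizer in this sketch
        text = recognizer(audio)
        if text is not None:
            # Assumed usage update: count one more request on this channel.
            usage[channel] = usage.get(channel, 0) + 1
            return channel, text
    return None  # no channel produced a result


# Stand-in recognizers: CH4 fails (returns None), CH2 succeeds.
recognizers: Dict[str, Recognizer] = {
    "CH4": lambda audio: None,
    "CH2": lambda audio: "hello",
}
usage_counts: Dict[str, int] = {}
ranked = [("CH4", 0.13), ("CH2", 0.06)]
outcome = recognize_with_switching(b"\x00\x01", ranked, recognizers, usage_counts)
```

In this sketch the top-ranked channel fails, so the request falls through to the next channel in the result array, and only the channel that actually served the request has its usage counter increased.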