TWI752682B - Method for updating speech recognition system through air

Info

Publication number: TWI752682B
Application number: TW109136375A
Authority: TW
Inventors: 陳信宏; 廖元甫; 王逸如; 黃紹華; 姚秉志; 葉政育; 陳又碩; 鍾耀興; 黃彥鈞; 黃啟榮; 沈立得; 古甯允
Original assignee: 國立陽明交通大學
Priority date: 2020-10-21
Filing date: 2020-10-21
Publication date: 2022-01-11
Also published as: TW202217797A

Abstract

The present invention provides a method for updating a speech recognition system over the air. Client ASR servers connect to a central ASR cloud server via the Internet. A new version of the ASR system is stored on the central ASR cloud server, where the client ASR servers can select and download it for use.

Description

Method for updating speech recognition system in cloud

The present invention relates to a method for updating a speech recognition system, and in particular to a method for updating a speech recognition system via the cloud.

When a conventional cloud automatic speech recognition (ASR) system needs to be updated, a technician must carry a USB flash drive into the machine room that controls the system and perform the update on site, which costs considerable labor and time.

Because a cloud ASR system already resides in the cloud, updating it from the cloud is more convenient. In this approach, the developer of the cloud ASR system designs the update mechanism directly and provides it to customers: the developer places the new version of the ASR system in the cloud, and each customer's cloud ASR system selects the new version over the network for use.

The object of the present invention is to provide a method for updating a speech recognition system via the cloud, in which a client ASR server is connected to a central ASR cloud server over a network and can select a new version of the ASR system. The method is described below.

A client ASR server serves as the system that provides cloud automatic speech recognition, and a central ASR cloud server is set up and connected to the client ASR server over a network.

The new version of the ASR system is placed on the central ASR cloud server, and the client ASR server selects it over the network for use.

In the new version, the ASR system analyzes speech in the following order: audio preprocessing, speech feature extraction, acoustic model, and language model. The acoustic model and the language model are the main subject of the cloud update.

1: Client ASR server

2: Client ASR server

3: Client ASR server

4: Central ASR cloud server

21: Audio preprocessing

22: Speech feature extraction

23: Acoustic model

24: Language model

31: Speech recognition execution program

32: Decide which version to use based on the configuration file

41: Step

42: Step

43: Step

44: Step

45: Step

46: Step

47: Step

48: Step

49: Step

50: Step

A: Version

B: Version

C: Version

FIG. 1 is a diagram of the basic architecture of the present invention.

FIG. 2 is a schematic diagram of the steps by which the ASR system of the present invention analyzes speech.

FIG. 3 is a flow chart of version selection in the cloud ASR system of the present invention.

FIG. 4 is a flow chart of how the ASR system of the present invention updates its version through cloud communication.

FIG. 1 illustrates the basic architecture of the present invention. Client ASR servers 1, 2, and 3 are all systems that provide cloud automatic speech recognition, and all are connected to the central ASR cloud server 4 over a network. The central ASR cloud server 4 is designed directly by the developer of the cloud ASR system and provided for use by client ASR servers 1, 2, and 3. The developer places the new version of the ASR system on the central ASR cloud server 4, and each customer's cloud ASR system selects the new version over the network for use.

FIG. 2 illustrates the steps by which the ASR system analyzes speech, in order: audio preprocessing 21, speech feature extraction 22, acoustic model 23, and language model 24. The acoustic model 23 and the language model 24 are the main subject of the cloud update; the developer concentrates on these two components so that a cloud update is simple, lightweight, and fast.
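The four stages of FIG. 2 can be sketched as a small pipeline in which only the two models are swappable. This is an illustrative sketch, not the patented implementation: the names `ModelBundle`, `preprocess`, `extract_features`, and `recognize` are hypothetical, and the stage bodies are toy placeholders (DC-offset removal and fixed-size framing) standing in for real signal processing and real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class ModelBundle:
    """The updatable part of the system: acoustic model 23 and language model 24."""
    version: str
    acoustic_model: Callable[[List[List[float]]], List[str]]
    language_model: Callable[[List[str]], str]


def preprocess(samples: Sequence[float]) -> List[float]:
    """Stage 21 (placeholder): remove the DC offset from the raw samples."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def extract_features(samples: Sequence[float], frame_size: int = 4) -> List[List[float]]:
    """Stage 22 (placeholder): cut the signal into fixed-size frames."""
    return [list(samples[i:i + frame_size]) for i in range(0, len(samples), frame_size)]


def recognize(samples: Sequence[float], bundle: ModelBundle) -> str:
    """Run stages 21 -> 22 -> 23 -> 24 using the models of the given bundle."""
    features = extract_features(preprocess(samples))
    units = bundle.acoustic_model(features)
    return bundle.language_model(units)


# Toy stand-ins for real models, for illustration only.
bundle_a = ModelBundle(
    version="A",
    acoustic_model=lambda frames: ["u"] * len(frames),
    language_model=lambda units: " ".join(units),
)
```

Because stages 21 and 22 do not depend on the bundle, replacing `bundle_a` with a bundle for another version changes the recognizer's behavior without touching the rest of the code, which is one way to read why the update can stay lightweight.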

FIG. 3 shows how the client ASR servers 1, 2, 3, and so on select a version. The speech recognition system first runs the speech recognition execution program 31, and then decides which version to use according to the description in its configuration file 32. If the configuration file specifies version A, the program is directed to the acoustic model and language model of version A; if it specifies version B, to those of version B. A slot is reserved for version C in case a cloud version update is needed in the future.
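The version selection of FIG. 3 amounts to a lookup driven by the configuration file. The sketch below assumes a layout the patent does not specify: a JSON configuration file with a `version` key, and one directory per version under a models root containing `acoustic_model` and `language_model` entries. The function name `resolve_model_paths` is likewise hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def resolve_model_paths(config_path: Path, models_root: Path) -> Dict[str, Path]:
    """Step 32: read the configured version and return that version's model paths.

    Assumed layout (not from the patent):
        config.json            -> {"version": "A"}
        <models_root>/A/...    -> models of version A
        <models_root>/B/...    -> models of version B
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    version = config["version"]
    version_dir = models_root / version
    if not version_dir.is_dir():
        raise FileNotFoundError(f"no models installed for version {version!r}")
    return {
        "acoustic_model": version_dir / "acoustic_model",
        "language_model": version_dir / "language_model",
    }
```

Under this layout, the reserved slot for version C is simply a directory that does not exist yet; installing it and changing one key in the configuration file redirects the execution program.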

FIG. 4 illustrates the cloud update exchange between the client ASR servers 1, 2, 3, and so on and the central ASR cloud server 4. A client ASR server typically queries the central ASR cloud server 4 for its new version on its own initiative at, for example, 2 a.m. (step 41), and the central ASR cloud server 4 replies with its new version (step 42). The client ASR server compares this with the version in its configuration file (step 43). If they are the same, no cloud update is performed. If they differ, the client ASR server requests the download of the new version from the central ASR cloud server 4 (step 44).
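Steps 41 to 44 reduce to a query, a comparison, and a conditional request. The sketch below keeps the network out of the picture by taking the two server interactions as callables; `run_update_check`, `query_latest`, and `request_download` are hypothetical names, and the patent does not specify the transport or message format.

```python
from typing import Callable


def run_update_check(
    installed_version: str,
    query_latest: Callable[[], str],
    request_download: Callable[[str], None],
) -> bool:
    """Perform steps 41-44 once; return True if a download was requested.

    query_latest     -- asks the central server for its newest version
                        (steps 41 and 42)
    request_download -- asks the central server to send that version (step 44)
    """
    latest = query_latest()
    if latest == installed_version:   # step 43: same version, nothing to do
        return False
    request_download(latest)          # step 44: versions differ
    return True
```

The comparison is a plain equality test, matching the text: the client updates whenever the versions differ, not only when the server's version is newer. A scheduler such as a nightly timer would call this function at the chosen hour.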

The central ASR cloud server 4 calculates the MD5 (Message-Digest Algorithm 5) value of the new acoustic and language models, which have already been packaged into a ZIP archive (step 45), then transfers the archive to the client ASR server and informs it of the MD5 value (step 46).
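On the server side, step 45 can be sketched with the Python standard library. The directory layout and the function names `package_models` and `md5_of_file` are assumptions for illustration; the patent only states that the models are packaged as a ZIP archive and that an MD5 value is computed for it. The sketch also assumes the archive is written outside the model directory.

```python
import hashlib
import zipfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_models(model_dir: Path, zip_path: Path) -> str:
    """Pack every file under model_dir into zip_path and return the archive's MD5.

    zip_path must lie outside model_dir, otherwise the archive would
    try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(model_dir.rglob("*")):
            if file.is_file():
                # Store paths relative to model_dir so the archive unpacks cleanly.
                archive.write(file, arcname=file.relative_to(model_dir).as_posix())
    return md5_of_file(zip_path)
```

The returned digest is the value the server would send alongside the archive in step 46.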

The client ASR server computes the MD5 value of the downloaded ZIP archive (step 47) and compares it with the MD5 value in the response message (step 48). Step 48 verifies that the downloaded ZIP archive is complete: identical MD5 values indicate that the archive is intact.
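The client-side check of steps 47 and 48 can be sketched as follows. The function name `verify_download` is hypothetical, and the sketch assumes the server reports the MD5 value as a hexadecimal string, which the patent does not state.

```python
import hashlib
from pathlib import Path


def verify_download(zip_path: Path, expected_md5: str) -> bool:
    """Steps 47-48: hash the downloaded archive and compare with the reported value."""
    digest = hashlib.md5()
    with zip_path.open("rb") as handle:
        # Read in 1 MiB chunks so large model archives need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    # Normalize case and surrounding whitespace before comparing hex strings.
    return digest.hexdigest() == expected_md5.strip().lower()
```

A matching digest shows that the archive was not truncated or corrupted in transit, which is the purpose the text gives for step 48. MD5 is suited to detecting accidental corruption of this kind rather than deliberate tampering.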

Finally, the client ASR server decompresses the ZIP archive (step 49) and points the description in the configuration file to the new version (step 50). It then restarts the entire system, which completes the cloud update.
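Steps 49 and 50 can be sketched as an extract followed by a configuration rewrite. As before, the JSON configuration with a `version` key, the per-version directory under a models root, and the name `install_version` are assumptions for illustration. The restart itself is left to the caller, since how the service is restarted depends on the deployment.

```python
import json
import zipfile
from pathlib import Path


def install_version(zip_path: Path, models_root: Path, config_path: Path, version: str) -> Path:
    """Unpack a verified archive and point the configuration file at it."""
    # Step 49: decompress into a directory named after the new version.
    target = models_root / version
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)

    # Step 50: update only the version entry, keeping any other settings.
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config["version"] = version

    # Write to a temporary file first, then swap it in, so a crash
    # mid-write does not leave a half-written configuration file.
    temp_path = config_path.with_suffix(".tmp")
    temp_path.write_text(json.dumps(config), encoding="utf-8")
    temp_path.replace(config_path)
    return target
```

Because the previous version's directory is left in place, reverting would only require pointing the configuration file back at it and restarting; the patent does not describe rollback, so this is an observation about the sketch rather than about the claimed method.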

The spirit and scope of the present invention are defined by the following claims and are not limited to the above embodiments.

Claims

A method for updating a speech recognition system in the cloud, comprising the following steps:

(1) setting at least one client ASR server as a system for providing automatic speech recognition in the cloud, and setting up a central ASR cloud server connected to the at least one client ASR server via a network;

(2) placing a new version of the automatic speech recognition system on the central ASR cloud server, the at least one client ASR server choosing to download the new version of the automatic speech recognition system via the network for use;

(3) the at least one client ASR server actively inquiring about the new version on the central ASR cloud server;

(4) the central ASR cloud server replying with its new version;

(5) the at least one client ASR server comparing the reply with the version in a configuration file of the client ASR server, wherein if the version is the same as the new version, the cloud update is not performed;

(6) if the version is different from the new version, the at least one client ASR server requesting the central ASR cloud server to download the new version to complete the update;

(7) the central ASR cloud server packing the new version into a ZIP (compressed file), calculating an MD5 (Message-Digest Algorithm) value, then downloading the ZIP to the at least one client ASR server and informing it of the MD5 value;

(8) the at least one client ASR server performing an MD5 operation on the downloaded ZIP and comparing the result with the MD5 value in the response, wherein identical MD5 values indicate the integrity of the ZIP file;

(9) the at least one client ASR server decompressing the ZIP, pointing the description in a configuration file to the new version, and then restarting the entire system to complete the cloud update.

The method for updating a speech recognition system in the cloud according to claim 1, wherein, in the new version, the automatic speech recognition system analyzes speech in the order of audio preprocessing, speech feature extraction, an acoustic model, and a language model, and wherein the acoustic model and the language model are the main subject of the cloud update.