JPH0540494A

JPH0540494A - Composite voice tester

Info

Publication number: JPH0540494A
Application number: JP3196591A
Authority: JP
Inventors: Jun Kametani; 潤亀谷; Hisae Hashimoto; 久恵橋本
Original assignee: NEC Corp; NEC Engineering Ltd
Current assignee: NEC Corp; NEC Engineering Ltd
Priority date: 1991-08-06
Filing date: 1991-08-06
Publication date: 1993-02-19

Abstract

PURPOSE: To decide whether synthesized speech is acceptable, based on the result of matching the feature pattern sequence of input synthesized speech against the pattern sequence of a standard synthesized speech registered in advance. CONSTITUTION: The tester comprises a voice input unit 1 that digitizes the input synthesized speech, a voice analysis unit 2 that extracts a feature pattern sequence from the input speech, a pattern memory unit 3 that stores a standard feature pattern sequence in advance, a pattern matching unit 4 that executes DP matching between the input and standard feature pattern sequences, a result determination unit 5 that compares the similarity obtained from matching with a prescribed value to make the decision, and an overall control unit 6 that controls each constituent unit and communicates with a host.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an automatic test apparatus for speech synthesizers.

[0002]

[Prior Art] Conventionally, in the inspection and testing of a speech synthesizer, the final test consists of an inspector actually listening to the synthesized speech output for prescribed words and phrases and confirming that the synthesized speech contains no errors.

[0003]

[Problems to Be Solved by the Invention] However, this conventional inspection method requires the inspector to attend to the device under test for the entire inspection, which increases the inspection man-hours.

[0004] Furthermore, although automatic diagnostic software has been introduced for testing the data memory that stores the segment data of the speech synthesizer, the signal processor that performs the synthesis, and similar components, the final confirmation of the synthesized speech output alone still requires manual work, and this has prevented the throughput of the overall inspection process from improving.

[0005] The present invention has been made in view of the above circumstances, and its object is therefore to provide a novel synthetic speech tester capable of solving the above problems inherent in the prior art.

[0006]

[Means for Solving the Problems] To achieve the above object, the synthetic speech tester according to the present invention comprises: a voice input unit that digitizes an input synthesized speech signal and determines its start and end points; a voice analysis unit that extracts an acoustic feature pattern sequence from the digitized synthesized speech; a pattern memory unit that stores and registers feature pattern sequences; a pattern matching unit that performs pattern matching between a feature pattern sequence registered in advance and the feature pattern sequence extracted from the input synthesized speech; a result determination unit that determines the validity of the input synthesized speech from the similarity between the pattern sequences obtained as a result of the matching; and an overall control unit that controls each constituent unit of the invention.
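As an illustration of how a voice input unit might determine the start and end points of a digitized utterance, the following minimal sketch uses a short-time energy threshold. The text does not say which detection method is used, so the energy criterion, the function names `frame_energies` and `detect_endpoints`, and the `frame_len` and `threshold` values are all assumptions.

```python
from typing import List, Optional, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame (hypothetical helper)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_endpoints(
    samples: List[float], frame_len: int = 4, threshold: float = 0.01
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the utterance, end exclusive.

    A frame counts as speech when its energy exceeds `threshold`
    (an assumed criterion). Returns None if no frame qualifies.
    """
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Usage: 8 samples of silence, 8 samples of signal, 8 samples of silence.
silence = [0.0] * 8
tone = [0.5, -0.5] * 4
signal = silence + tone + silence
print(detect_endpoints(signal))
```

A practical detector would usually add hangover time and noise-floor estimation; those are omitted here to keep the sketch short.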

[0007]

[Embodiment] Next, a preferred embodiment of the present invention will be described in detail with reference to the drawings.

[0008] FIG. 1 is a block diagram showing an embodiment of the present invention.

[0009] Referring to FIG. 1, synthesized speech output from the speech synthesizer under test (not shown) is input to the voice input unit 1 through the microphone 7, where the synthesized speech is digitized and its start and end points are detected. The synthesized speech digitized by the voice input unit 1 is sent to the voice analysis unit 2 and converted into a feature pattern sequence by acoustic analysis such as mel-cepstrum analysis. The pattern memory unit 3 is a memory that stores feature pattern sequences extracted in advance by the voice analysis unit 2. The pattern matching unit 4 executes DP matching between the feature pattern sequence of the input synthesized speech obtained by the voice analysis unit 2 and the feature pattern sequence registered in the pattern memory unit 3. The similarity between the pattern sequences obtained by this DP matching is compared with a prescribed similarity in the result determination unit 5; input synthesized speech showing a similarity at or above the prescribed value is judged as passing, and the overall control unit 6 is notified of the result.
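The DP matching step can be sketched as a classic dynamic time warping recurrence between two feature pattern sequences. This is a minimal illustration, not the circuit of the embodiment: the Euclidean frame distance, the symmetric step pattern, the normalization by combined sequence length, and the mapping from distance to a similarity in (0, 1] are all assumptions, as are the function names.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature frames (assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dp_matching_distance(template: List[Frame], observed: List[Frame]) -> float:
    """Accumulated DP (dynamic time warping) distance, normalized by n + m."""
    n, m = len(template), len(observed)
    if n == 0 or m == 0:
        raise ValueError("both feature pattern sequences must be non-empty")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(template[i - 1], observed[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def similarity(template: List[Frame], observed: List[Frame]) -> float:
    """Map distance to a similarity in (0, 1]; 1.0 means a perfect warped match."""
    return 1.0 / (1.0 + dp_matching_distance(template, observed))


# Usage: a time-stretched copy of the template still matches perfectly.
template = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [0.0], [1.0], [2.0], [2.0]]
different = [[5.0], [5.0], [5.0]]
print(similarity(template, stretched), similarity(template, different))
```

Because the warping path absorbs differences in timing, the score depends mainly on the spectral content of the two utterances rather than on their exact durations.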

[0010] Based on instructions from the host 8, the overall control unit 6 registers feature pattern sequences in the pattern memory unit 3, specifies the feature pattern sequence that the pattern matching unit 4 uses as the matching template, controls the operation sequence of each constituent unit, and so on.

[0011] The operation of this embodiment is briefly described below.

[0012] When the output synthesized speech of a speech synthesizer is to be inspected with this embodiment, the feature pattern sequence of the standard synthesized speech must be registered in advance. To do so, standard synthesized speech is first input to the voice input unit 1 through the microphone 7 in units of words or phrases; after digitization and determination of the start and end points, it is converted into a feature pattern sequence in the voice analysis unit 2 and stored in the pattern memory unit 3. At this time, the overall control unit 6 receives from the host 8 the phrase number corresponding to this standard synthesized speech and registers it in the pattern memory unit 3 together with the feature pattern sequence.
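The registration step can be pictured as a small store keyed by phrase number. The sketch below is illustrative only: the class name `PatternMemory`, its methods, and the in-memory dictionary are assumptions, and the text does not describe how the pattern memory unit 3 is actually organized.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FeatureSequence = List[List[float]]


@dataclass
class PatternMemory:
    """Hypothetical stand-in for a pattern memory keyed by phrase number."""

    templates: Dict[int, FeatureSequence] = field(default_factory=dict)

    def register(self, phrase_number: int, features: FeatureSequence) -> None:
        """Store a standard feature pattern sequence under its phrase number."""
        if not features:
            raise ValueError("cannot register an empty feature pattern sequence")
        # Copy the frames so later changes by the caller do not alter the template.
        self.templates[phrase_number] = [list(frame) for frame in features]

    def lookup(self, phrase_number: int) -> FeatureSequence:
        """Return the template for a phrase number; raises KeyError if absent."""
        return self.templates[phrase_number]

    def phrase_numbers(self) -> List[int]:
        """Phrase numbers that currently have a registered template."""
        return sorted(self.templates)


# Usage: the host supplies phrase number 12 with the analyzed standard utterance.
memory = PatternMemory()
memory.register(12, [[0.1, 0.2], [0.3, 0.4]])
print(memory.phrase_numbers())
```

Keying by phrase number lets the controller select the matching template later with a single lookup when the host names the phrase under test.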

[0013] When the output synthesized speech of the speech synthesizer is inspected, the overall control unit 6 first receives from the host 8 the phrase number of the synthesized speech that will be input and specifies it to the pattern matching unit 4. Next, the synthesized speech input to the voice input unit 1 through the microphone 7 is digitized and its start and end points are determined; the voice analysis unit 2 then converts it into a feature pattern sequence and transfers it to the pattern matching unit 4. The pattern matching unit 4 executes DP matching between the feature pattern sequence corresponding to the phrase number specified by the overall control unit 6 and the feature pattern sequence sent from the voice analysis unit 2, calculates the similarity between the pattern sequences, and sends the result to the result determination unit 5.

[0014] The result determination unit 5 compares the received similarity with the similarity prescribed in advance by the overall control unit 6, and sends a pass judgment to the overall control unit 6 if the similarity is greater than the prescribed value, or a fail judgment if it is smaller. The overall control unit 6 reports the received pass/fail judgment to the host 8 together with the corresponding phrase number, and then waits for the next instruction from the host 8.
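The decision and reporting step reduces to a threshold comparison followed by a message that pairs the verdict with the phrase number. The sketch below is a hedged illustration: the names `PhraseResult`, `judge`, and `report_for_host` are hypothetical, the numeric values are made up for the example, and treating a similarity exactly equal to the prescribed value as a pass is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhraseResult:
    """Hypothetical message returned to the host for one tested phrase."""

    phrase_number: int
    passed: bool


def judge(similarity: float, prescribed: float) -> bool:
    """Pass when the similarity reaches the prescribed value.

    Equality is treated as a pass here, which is an assumption.
    """
    return similarity >= prescribed


def report_for_host(phrase_number: int, similarity: float, prescribed: float) -> PhraseResult:
    """Attach the pass/fail judgment to the phrase number it belongs to."""
    return PhraseResult(phrase_number, judge(similarity, prescribed))


# Usage with illustrative values only.
print(report_for_host(12, 0.97, 0.90))
print(report_for_host(13, 0.42, 0.90))
```

Keeping the prescribed value as a parameter mirrors the description, in which the controller sets it in advance rather than the determination step fixing it.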

[0015]

[Effects of the Invention] As described above, according to the present invention, the acoustic feature pattern sequence of a standard synthesized speech is registered in memory in advance, pattern matching is performed between it and the feature pattern sequence obtained by analyzing the corresponding input synthesized speech, and the validity of the synthesized speech is verified from the magnitude of the resulting similarity. The speech synthesizer can therefore be inspected automatically without the intervention of an inspector, which reduces the inspection man-hours.

[0016] Furthermore, according to the present invention, given the high reproducibility and stability characteristic of speech synthesis, correct synthesized speech yields a very high similarity, so the validity of the synthesized speech output, that is, whether the device under test is good or defective, can be determined with high accuracy.

[Brief description of drawings]

[FIG. 1] A block diagram showing an embodiment of the present invention.

[Explanation of symbols]

1 ... Voice input unit
2 ... Voice analysis unit
3 ... Pattern memory unit
4 ... Pattern matching unit
5 ... Result determination unit
6 ... Overall control unit
7 ... Microphone
8 ... Host

Claims

[Claims]

1. A synthetic speech tester comprising: means for digitizing synthesized speech input from a microphone; means for extracting an acoustic feature pattern from the digitized speech signal; storage means for storing the acoustic feature pattern; means for performing pattern matching between a feature pattern stored in advance and the feature pattern obtained from the input speech; means for determining the validity of the input synthesized speech from the similarity resulting from the pattern matching; and control means for controlling each unit, wherein a feature pattern of correct synthesized speech is registered in advance and pattern matching is performed against the synthesized speech output from the speech synthesizer under test, whereby the device under test is inspected automatically.

2. The synthetic speech tester according to claim 1, wherein the control means further receives from the host a phrase number corresponding to a standard synthesized speech and registers the phrase number in the storage means together with the feature pattern sequence.