KR100484665B1

KR100484665B1 - Voice Synthesis Service System and Control Method Thereof

Info

Publication number: KR100484665B1
Application number: KR10-2003-0000280A
Authority: KR
Inventors: 강동규
Original assignee: (주) 코아보이스; 정보통신연구진흥원
Priority date: 2003-01-03
Filing date: 2003-01-03
Publication date: 2005-04-22
Also published as: KR20040062763A

Abstract

The present invention relates to a voice synthesis service system and a control method thereof. The system comprises a service control unit, which receives a sentence to be synthesized together with a voice synthesis service request, outputs the input sentence along with a condition requesting synthesis in syllable units or sentence units, and receives and outputs the sound synthesized according to that condition; and a voice synthesis engine, which, when a sentence is input from the service control unit, reads the synthesis request condition, synthesizes accordingly, and outputs the result to the service control unit. When a sentence is synthesized, a specific word can be marked to be read syllable by syllable, and only the marked word is pronounced one syllable at a time so that it is easy to understand. In addition, pre-stored voice data is used to generate natural synthesized sound in both sentence units and syllable units, which increases the efficiency of the system.

Description

Voice Synthesis Service System and Control Method Thereof

The present invention relates to a voice synthesis service system and a control method thereof, and in particular to a voice synthesis service system and control method that improve service quality by raising the performance of the synthesis engine so that a specific word in a sentence can be understood more accurately, for example when a listener wants to hear a synthesized sentence again.

Recently, many information service providers have been introducing voice synthesis engines to make their service systems more intelligent, diverse, and efficient.

FIG. 1 is an exemplary view showing the configuration of a conventional voice synthesis service system. As shown there, when the synthesized sound is unclear and the listener wants to hear it again, a commonly used voice synthesis engine synthesizes the speech again from the same sentence that was used for the previous synthesis.

However, this method makes the listener repeatedly hear the same unclear synthesized sound as before. In addition, although quality stays high when frequently used words or sentences are synthesized, intelligibility drops markedly across diverse domains, and the drop is especially pronounced for proper nouns such as product names, personal names, and addresses. In other words, because there is no way to utter a key word in a sentence more clearly, the conventional approach is difficult to apply to services that use many proper nouns.

Accordingly, the present invention was devised to solve the above problems of the prior art. Its object is to provide a system and method in which, when a sentence is sent to the synthesis engine, a specific word is marked so that it is read syllable by syllable; the synthesis engine then pronounces only the marked word one syllable at a time so that it is easy to understand, and the engine can generate natural synthesized sound in both sentence units and syllable units.
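The marking idea can be illustrated with a minimal sketch. The patent states that a word is marked but does not define a notation, so the `<syl>...</syl>` tag, the function name `split_marked`, and the mode labels below are hypothetical choices made only for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical markup: the source does not specify how a word is marked,
# so <syl>...</syl> is an assumption made for this sketch.
MARK = re.compile(r"<syl>(.*?)</syl>")


def split_marked(sentence: str) -> List[Tuple[str, str]]:
    """Split a sentence into (mode, text) segments.

    mode is "sentence" for ordinary text and "syllable" for marked words
    that should be read one syllable at a time.
    """
    segments: List[Tuple[str, str]] = []
    pos = 0
    for match in MARK.finditer(sentence):
        plain = sentence[pos:match.start()].strip()
        if plain:
            segments.append(("sentence", plain))
        segments.append(("syllable", match.group(1)))
        pos = match.end()
    tail = sentence[pos:].strip()
    if tail:
        segments.append(("sentence", tail))
    return segments
```

With this notation, only a proper noun such as a company name would be wrapped in the tag, and the rest of the sentence would still be synthesized in sentence units.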

To achieve this object, the voice synthesis service system of the present invention comprises: a service scenario control unit that receives a sentence to be synthesized together with a voice synthesis service request and, according to the selected synthesis requirement, outputs the input sentence and a condition requesting synthesis in syllable units or sentence units; a sentence-unit synthesis unit that, when reading the synthesis request condition shows that sentence-unit synthesis is requested, reads the corresponding sentence-unit voice data stored in a sentence-unit synthesis database, performs sentence-unit synthesis, and outputs the result; and a syllable-unit synthesis unit that, when reading the synthesis request condition shows that syllable-unit synthesis is requested, reads the corresponding syllable-unit voice data stored in a syllable-unit synthesis database, performs syllable-unit synthesis, and outputs the result.

The control method comprises: a synthesis selection step in which, when a sentence to be synthesized is input as a service command together with a voice synthesis service request, a synthesis request condition of syllable units or sentence units is selected; a synthesis request step of outputting the input sentence and condition data corresponding to the synthesis request condition selected in the synthesis selection step; a synthesis performing step of reading the synthesis request condition from the synthesis request step and performing the corresponding synthesis operation; and a synthesized sound output step of outputting the sound synthesized in the synthesis performing step to the outside.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 2 is an exemplary view showing the configuration of a voice synthesis system to which the present invention is applied. Referring to FIG. 2, the system consists of a service control unit (100) and a voice synthesis engine (200). The service control unit (100) includes a service scenario control unit (110), which receives a sentence to be synthesized together with a voice synthesis service request and, according to the selected synthesis requirement, outputs the input sentence and a condition requesting synthesis in syllable units or sentence units, and a voice output unit (120), which outputs to the outside the sound finally synthesized according to the synthesis condition of the service scenario control unit (110). The voice synthesis engine (200) includes a sentence-unit synthesis unit (210), which, when reading the synthesis request condition shows that sentence-unit synthesis is requested, reads the corresponding sentence-unit voice data stored in a sentence-unit synthesis database (220), performs sentence-unit synthesis, and outputs the result, and a syllable-unit synthesis unit (230), which, when reading the synthesis request condition shows that syllable-unit synthesis is requested, reads the corresponding syllable-unit voice data stored in a syllable-unit synthesis database (240), performs syllable-unit synthesis, and outputs the result.
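The engine side of FIG. 2 can be sketched as a small set of classes. This is an illustrative model under stated assumptions, not the patented implementation: the class names, the use of in-memory dictionaries for databases 220 and 240, the `bytes` stand-in for audio, whole-sentence lookup for sentence-unit synthesis, and treating each non-space character as one syllable (which holds for precomposed Hangul, where one character is one syllable block) are all choices made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SentenceUnitSynthesizer:
    """Stands in for unit 210; db stands in for database 220."""
    db: Dict[str, bytes] = field(default_factory=dict)

    def synthesize(self, sentence: str) -> bytes:
        # Assumption: the database holds pre-stored audio keyed by sentence.
        return self.db[sentence]


@dataclass
class SyllableUnitSynthesizer:
    """Stands in for unit 230; db stands in for database 240."""
    db: Dict[str, bytes] = field(default_factory=dict)

    def synthesize(self, text: str) -> bytes:
        # Assumption: one non-space character is one syllable.
        return b"".join(self.db[ch] for ch in text if not ch.isspace())


@dataclass
class VoiceSynthesisEngine:
    """Stands in for engine 200, dispatching on the condition data."""
    sentence_unit: SentenceUnitSynthesizer
    syllable_unit: SyllableUnitSynthesizer

    def synthesize(self, condition: str, text: str) -> bytes:
        if condition == "sentence":
            return self.sentence_unit.synthesize(text)
        return self.syllable_unit.synthesize(text)
```

Keeping the two synthesizers behind one `synthesize` call mirrors the description, in which the service control unit sends only a sentence and condition data and does not address either database directly.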

The operation of the embodiment configured as above is as follows.

First, FIG. 3 is an exemplary view showing the operation of the service control unit according to the present invention. As shown there, when a service command is input from the outside, the service scenario control unit (110) of the service control unit (100) determines whether the command is for the voice synthesis service (S100 to S101).

If this determination finds that the command is not for the voice synthesis service, the corresponding service processing is performed (S102); if it is for the voice synthesis service, a synthesis request condition is selected after the sentence is input (S103 to S104).

According to the selected synthesis request condition, condition data indicating syllable-unit synthesis or sentence-unit synthesis is generated and output to the voice synthesis engine (200) (S105 to S107).
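Steps S100 to S107 can be sketched as a single function. The names `Condition`, `SynthesisRequest`, and `handle_command`, the `"tts"` command string, and the boolean `per_syllable` selector are hypothetical; the source only says that a condition is selected and that condition data is generated and sent with the sentence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Condition(Enum):
    """Condition data: which synthesis unit is requested."""
    SENTENCE = "sentence"
    SYLLABLE = "syllable"


@dataclass
class SynthesisRequest:
    """What the service control unit sends to the engine (S107)."""
    condition: Condition
    sentence: str


def handle_command(command: str, sentence: str = "",
                   per_syllable: bool = False) -> Optional[SynthesisRequest]:
    # S101: is this a voice synthesis request? ("tts" is an assumed label.)
    if command != "tts":
        # S102: some other service handles the command; nothing to synthesize.
        return None
    # S103 to S106: take the sentence and turn the selection into condition data.
    condition = Condition.SYLLABLE if per_syllable else Condition.SENTENCE
    # S107: the sentence and condition data are output together.
    return SynthesisRequest(condition, sentence)
```

Returning `None` for non-synthesis commands is only one way to model the S102 branch; a real service would route the command to its own handler.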

Then, as shown in FIG. 4, when the condition data is input together with the sentence, the voice synthesis engine (200) reads the condition data and determines whether sentence-unit synthesis is requested (S200 to S202).

If step S202 finds that sentence-unit synthesis is not requested, the request is treated as syllable-unit synthesis and the syllable-unit synthesis unit (230) performs syllable-unit synthesis (S203); if step S202 finds that sentence-unit synthesis is requested, the sentence-unit synthesis unit (210) performs sentence-unit synthesis (S204).

During the synthesis operation, syllable-unit or sentence-unit synthesis is performed using the sentence-unit voice data and syllable-unit synthesis data stored in advance in the databases (220, 240).

After step S203 or step S204 is completed, the voice synthesis engine (200) joins the synthesized sound and then determines whether to continue synthesis (S205 to S206). Depending on the result, it either repeats the process from the beginning or ends the synthesis process and outputs the result to the service control unit (100) (S207).
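The engine flow of FIG. 4 (S200 to S207) can be sketched as a loop over incoming requests. This is one reading of the flowchart under stated assumptions: the function name `run_engine`, the `(condition, text)` tuple format, and passing the two synthesizers in as plain callables are all hypothetical, and "continue synthesis" is modeled simply as "more input remains".

```python
from typing import Callable, Iterable, Tuple

Synth = Callable[[str], bytes]


def run_engine(inputs: Iterable[Tuple[str, str]],
               sentence_synth: Synth,
               syllable_synth: Synth) -> bytes:
    """Synthesize each (condition, text) pair and join the results."""
    output = b""
    for condition, text in inputs:        # S200 to S201: sentence and condition data arrive
        if condition == "sentence":       # S202: is sentence-unit synthesis requested?
            sound = sentence_synth(text)  # S204: sentence-unit synthesis
        else:
            sound = syllable_synth(text)  # S203: syllable-unit synthesis
        output += sound                   # S205: join the synthesized sound
        # S206: the loop continues while more input remains
    return output                         # S207: hand the result to the service control unit
```

Any condition other than `"sentence"` falls through to syllable-unit synthesis, which matches the description's "if not sentence-unit, treat as syllable-unit" branch.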

Then, as shown in FIG. 3, the service control unit (100) receives the synthesized sound, outputs it to the outside through the voice output unit (120), and determines whether the service has ended; it then either repeats the process from the beginning or ends the service (S108 to S111).

As described above, with the voice synthesis service system and control method of the present invention, a specific word in a sentence to be synthesized is marked so that it is read syllable by syllable, and only the marked word is pronounced one syllable at a time so that it is easy to understand. In addition, pre-stored voice data is used to generate natural synthesized sound in both sentence units and syllable units, which increases the efficiency of the system.

FIG. 1 is an exemplary view showing the configuration of a conventional voice synthesis service system;

FIG. 2 is an exemplary view showing the configuration of a voice synthesis system to which the present invention is applied;

FIG. 3 is an exemplary view showing the operation of the service control unit according to the present invention; and

FIG. 4 is an exemplary view showing the operation of the voice synthesis engine according to the present invention.

*** Explanation of reference numerals for the main parts of the drawings ***

100: service control unit
110: service scenario control unit
120: voice output unit
200: voice synthesis engine
210: sentence-unit synthesis unit
220: sentence-unit synthesis database
230: syllable-unit synthesis unit
240: syllable-unit synthesis database

Claims

A voice synthesis service system comprising:

a service scenario control unit that receives a sentence to be synthesized together with a voice synthesis service request and, according to the selected synthesis requirement, outputs the input sentence and a condition requesting synthesis in syllable units or sentence units;

a sentence-unit synthesis unit that, when reading the synthesis request condition shows that sentence-unit synthesis is requested, reads the corresponding sentence-unit voice data stored in a sentence-unit synthesis database, performs sentence-unit synthesis, and outputs the result; and

a syllable-unit synthesis unit that, when reading the synthesis request condition shows that syllable-unit synthesis is requested, reads the corresponding syllable-unit voice data stored in a syllable-unit synthesis database, performs syllable-unit synthesis, and outputs the result.

A voice synthesis service control method comprising:

a synthesis selection step in which, when a sentence to be synthesized is input as a service command together with a voice synthesis service request, a synthesis request condition of syllable units or sentence units is selected;

a synthesis request step of outputting the input sentence and condition data corresponding to the synthesis request condition selected in the synthesis selection step;

a synthesis performing step of reading the synthesis request condition from the synthesis request step and performing the corresponding synthesis operation; and

a synthesized sound output step of outputting the sound synthesized in the synthesis performing step to the outside.

The method of claim 2, wherein the synthesis selection step comprises:

a first step of determining, when a service command is input, whether the command is for the voice synthesis service; and

a second step of performing the corresponding service processing if the first step determines that the command is not for the voice synthesis service, and of selecting a synthesis request condition after a sentence is input if the command is for the voice synthesis service.

The method of claim 2, wherein the synthesis request step

generates condition data indicating syllable-unit synthesis or sentence-unit synthesis according to the synthesis request condition.

The method of claim 2, wherein the synthesis performing step comprises:

a first step of reading the condition data, when it is input together with a sentence, and determining whether sentence-unit synthesis is requested;

a second step of treating the request as syllable-unit synthesis and performing syllable-unit synthesis if the first step determines that sentence-unit synthesis is not requested;

a third step of performing sentence-unit synthesis if the first step determines that sentence-unit synthesis is requested;

a fourth step of joining the synthesized sound after the second step or the third step is completed and then determining whether to continue synthesis; and

a fifth step of either repeating the process from the beginning or ending the synthesis process according to the determination result of the fourth step.

The method of claim 2, wherein the synthesis performing step

performs syllable-unit or sentence-unit synthesis using syllable-unit synthesis data and sentence-unit voice data stored in advance.