KR970003093B1 - Synthesis unit drawing-up method for high quality korean text to speech transformation - Google Patents

Synthesis unit drawing-up method for high quality Korean text to speech transformation

Info

Publication number
KR970003093B1
Authority
KR
South Korea
Prior art keywords
syllable
type
phoneme
unit
vowel
Prior art date
Application number
KR1019930030028A
Other languages
Korean (ko)
Other versions
KR950020391A (en)
Inventor
지민제
최운천
김상훈
Original Assignee
양승택
재단법인 한국전자통신연구소
조백제
한국전기통신공사
Priority date
Filing date
Publication date
Application filed by 양승택, 재단법인 한국전자통신연구소, 조백제, 한국전기통신공사
Priority to KR1019930030028A
Publication of KR950020391A
Application granted
Publication of KR970003093B1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A method of constructing context dependent units (CDUs) for a Korean text-to-speech converter, intended to increase the intelligibility and naturalness of synthesized speech. The method comprises the steps of: selecting the syllable types and syllable items of the context dependent units on the basis of an analysis of Korean phonological structure, phoneme connection constraints, and prosodic patterns; determining the type of detailed usage part according to the syllable's context of occurrence, the number of syllables, vowel length, and intonation type; and marking the phoneme sequence on the speech waveform of each context dependent unit so that the duration and pitch of a phoneme or syllable can be adjusted.

Description

Method of Constructing Synthesis Units (CDUs) for High-Quality Korean Text-to-Speech Conversion

FIG. 1 is an overall flowchart of the CDU construction method;

FIG. 2 is a flowchart of synthesis unit selection;

FIG. 3 is a flowchart of determining the detailed usage parts; and

FIG. 4 is a flowchart of segmenting and marking the detailed usage parts.

The present invention relates to a method of constructing synthesis units (CDUs) for Korean text-to-speech conversion.

An object of the present invention is to provide a new method of constructing synthesis units that can greatly improve the intelligibility and naturalness of synthesized speech in an unrestricted Korean text-to-speech converter.

To achieve this object, the present invention comprises: a first step of selecting the syllable types and syllable items of the synthesis units on the basis of an analysis of Korean phonological structure, phoneme connection constraints, and prosodic patterns; a second step of determining, for each synthesis unit, the type of detailed usage part according to the syllable's context of occurrence, the number of syllables, vowel length, and intonation type; and a third step of marking the detailed usage parts on the speech waveform of each synthesis unit, using the same symbol for the same type, and marking the phoneme sequence so that the duration and pitch of each syllable or phoneme can be adjusted after the detailed usage parts of the synthesis units are joined.

An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

Constructing the synthesis units requires an apparatus in which speech A/D and D/A converters are attached to a main processor with a CPU and auxiliary storage, and which analyzes and displays the speech waveform, spectrogram, pitch, and energy.

FIG. 1 is an overall flowchart of the method of constructing synthesis units for high-quality Korean text-to-speech conversion. The method proceeds in three stages: 1) selecting the syllables to be used, 2) determining the detailed usage parts, and 3) segmenting and marking the usage parts on the speech waveform of each synthesis unit. In the first stage, the phonological and prosodic environment of Korean is analyzed to select the synthesis units needed for synthesis. In the second stage, the detailed usage parts that will actually be used, depending on the environment, are determined within each selected syllable. In the third stage, the detailed usage parts are segmented and marked on the actual speech waveform of each selected syllable, which yields the context dependent units (CDUs).

FIG. 2 is a flowchart for selecting the syllables to be used in Korean text-to-speech conversion. The phonological structure analysis uses a Korean electronic pronunciation dictionary (about 60,000 words). The phonological structure of Korean is (consonant)(semivowel)vowel(consonant) for one syllable and (consonant)(semivowel)vowel(consonant)(consonant)(semivowel)vowel(consonant) for two syllables. Here, ( ) denotes an optional element. In a consonant cluster between two vowels, the first consonant governs the second: it restricts which consonants may follow. Structures of three or more syllables can be built by combining the one-syllable and two-syllable structures.
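The two templates above can be written as regular expressions over phoneme classes. The following is a minimal sketch, not the patent's procedure: the class letters `C` (consonant), `G` (semivowel) and `V` (vowel), and the function name `matches_structure`, are assumptions introduced for illustration. The constraint that the first medial consonant restricts the second is not modeled.

```python
import re
from typing import Optional

# Assumed class letters: C = consonant, G = semivowel, V = vowel.
# A trailing "?" marks a slot that the text writes in parentheses (optional).
ONE_SYLLABLE = re.compile(r"C?G?VC?")
TWO_SYLLABLE = re.compile(r"C?G?VC?C?G?VC?")


def matches_structure(classes: str) -> Optional[str]:
    """Return the template that a class string such as 'CGVC' fits, or None."""
    if ONE_SYLLABLE.fullmatch(classes):
        return "one-syllable"
    if TWO_SYLLABLE.fullmatch(classes):
        return "two-syllable"
    return None
```

A class string with two adjacent coda consonants at the end, such as `CVCC`, fits neither template and returns `None`.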

The prosodic analysis was carried out using a Korean prosody database that contains the prosodic patterns of read sentences of various types. The final intonation of an intonation phrase was broadly divided into rising, flat, and falling types. In the rising and falling types, the characteristics of the vocal fold vibration, which corresponds to the sound source, change, and the formant characteristics change as well.

It is therefore difficult to produce rising or falling speech merely by adjusting the pitch of a sound recorded at normal height. Existing synthesis units did not take this prosodic environment into account. To compensate, rising-type and falling-type synthesis units must be included.

The following syllable types were selected on the basis of the analysis of Korean phonological structure, phoneme connection constraints, and prosodic patterns. Each entry gives the range of unit numbers and the number of units.

Vowel (1-8): 8

Semivowel + vowel (9-20): 12

Consonant + vowel (21-164): 144

Consonant + semivowel + vowel (165-326): 162

Vowel + consonant (327-466): 140

Vowel + consonant (467-530): 64

Vowel + semivowel + vowel (에) (531-546): 16

Vowel + consonant + vowel (에) (547-698): 152

Vowel (에) + semivowel + vowel (699-710): 12

Vowel (에) + consonant + vowel (711-843): 133

Vowel (에) + consonant + semivowel + vowel (844-1053): 210

Vowel + nasal /ㅁ/ + consonant (ㄸ) + vowel (아) (1054-1073): 20

Vowel (아) + nasal /ㅁ/ + lenis consonant + vowel (으) (1074-1081): 8

Vowel + nasal /ㄴ/ + consonant (ㄸ) + vowel (아) (1082-1101): 20

Vowel (아) + nasal /ㄴ/ + lenis consonant + vowel (으) (1102-1109): 8

Vowel + nasal /ㅇ/ + consonant (ㄸ) + vowel (아) (1110-1129): 20

Vowel (아) + nasal /ㅇ/ + lenis consonant + vowel (으) (1130-1137): 8

Vowel + liquid /ㄹ/ + consonant (ㄸ) + vowel (아) (1138-1157): 20

Vowel (아) + liquid /ㄹ/ + consonant (ㄸ) + vowel (으) (1158-1164): 7

Vowel + liquid /ㄹ/ + liquid /ㄹ/ + vowel (에) (1165-1184): 20

Vowel (에) + liquid /ㄹ/ + liquid /ㄹ/ + vowel (1185-1204): 20

Rising intonation [vowel] (1205-1212): 8

Rising intonation [semivowel + vowel] (1213-1224): 12

Rising intonation [vowel + voiced consonant] (1225-1256): 32

Falling intonation [vowel] (1257-1264): 8

Falling intonation [semivowel + vowel] (1265-1276): 12

Falling intonation [vowel + voiced consonant] (1277-1308): 32

Pause (1309): 1

Total: 1309
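The inventory above can be checked mechanically: the number ranges should be contiguous, each range should contain as many numbers as its stated count, and the counts should sum to 1309. The sketch below does this in Python. The short labels and the function name `check_inventory` are assumptions for illustration; only the ranges and counts come from the list.

```python
# (label, first unit number, last unit number, stated count) for each entry.
INVENTORY = [
    ("V", 1, 8, 8), ("GV", 9, 20, 12), ("CV", 21, 164, 144),
    ("CGV", 165, 326, 162), ("VC (a)", 327, 466, 140), ("VC (b)", 467, 530, 64),
    ("V+G+V(e)", 531, 546, 16), ("V+C+V(e)", 547, 698, 152),
    ("V(e)+G+V", 699, 710, 12), ("V(e)+C+V", 711, 843, 133),
    ("V(e)+C+G+V", 844, 1053, 210),
    ("V+m+C+V(a)", 1054, 1073, 20), ("V(a)+m+lenis+V(eu)", 1074, 1081, 8),
    ("V+n+C+V(a)", 1082, 1101, 20), ("V(a)+n+lenis+V(eu)", 1102, 1109, 8),
    ("V+ng+C+V(a)", 1110, 1129, 20), ("V(a)+ng+lenis+V(eu)", 1130, 1137, 8),
    ("V+l+C+V(a)", 1138, 1157, 20), ("V(a)+l+C+V(eu)", 1158, 1164, 7),
    ("V+l+l+V(e)", 1165, 1184, 20), ("V(e)+l+l+V", 1185, 1204, 20),
    ("rising V", 1205, 1212, 8), ("rising GV", 1213, 1224, 12),
    ("rising V+voiced C", 1225, 1256, 32),
    ("falling V", 1257, 1264, 8), ("falling GV", 1265, 1276, 12),
    ("falling V+voiced C", 1277, 1308, 32),
    ("pause", 1309, 1309, 1),
]


def check_inventory(rows):
    """Verify contiguity and per-row counts; return the total number of units."""
    expected_first = 1
    total = 0
    for label, first, last, count in rows:
        if first != expected_first:
            raise ValueError(f"{label}: range starts at {first}, expected {expected_first}")
        if last - first + 1 != count:
            raise ValueError(f"{label}: range size does not match count {count}")
        expected_first = last + 1
        total += count
    return total
```

Calling `check_inventory(INVENTORY)` returns 1309, which agrees with the stated total.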

FIG. 3 shows the process of determining, for each synthesis unit, the syllable's context of occurrence, the number of syllables, vowel length, intonation type, and so on. The type of detailed usage part is determined by the presence or absence of a preceding and a following syllable, by whether the vowel is long or short, and by the intonation type. Detailed usage parts are classified into seven types: long isolated, short isolated, long initial, connecting, rising final, flat final, and falling final.

Existing synthesis units did not take the syllable's context of occurrence, the number of syllables, vowel length, or intonation type into account.
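One way to picture the decision in FIG. 3 is as a function from context features to one of the seven usage-part types. The text names the inputs and the seven outputs but does not give the exact rules, so the branching below is an assumption made for illustration, as are the names `UsagePart` and `usage_part`.

```python
from enum import Enum


class UsagePart(Enum):
    LONG_ISOLATED = "long isolated"
    SHORT_ISOLATED = "short isolated"
    LONG_INITIAL = "long initial"
    CONNECTING = "connecting"
    RISING_FINAL = "rising final"
    FLAT_FINAL = "flat final"
    FALLING_FINAL = "falling final"


def usage_part(has_prev: bool, has_next: bool, long_vowel: bool,
               intonation: str = "flat") -> UsagePart:
    """Pick a usage-part type from context (assumed rules, not the patent's)."""
    if not has_prev and not has_next:
        # A syllable standing alone: only vowel length matters.
        return UsagePart.LONG_ISOLATED if long_vowel else UsagePart.SHORT_ISOLATED
    if has_next:
        # Assumption: a long vowel with no preceding syllable is "long initial";
        # every other syllable that is followed by another is "connecting".
        if not has_prev and long_vowel:
            return UsagePart.LONG_INITIAL
        return UsagePart.CONNECTING
    # Has a preceding syllable but no following one: the final intonation decides.
    finals = {
        "rising": UsagePart.RISING_FINAL,
        "flat": UsagePart.FLAT_FINAL,
        "falling": UsagePart.FALLING_FINAL,
    }
    return finals[intonation]
```

For example, under these assumed rules a phrase-final syllable with rising intonation maps to `UsagePart.RISING_FINAL`.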

FIG. 4 shows the process of marking the detailed usage parts on the speech waveform of a synthesis unit with reference to the spectrogram, pitch, energy, and so on. In each synthesis unit, according to the usage-part type, each point that becomes a joining point during synthesis is marked in detail, taking into account the continuity of formants, pitch, and energy. No symbol is used twice within one synthesis unit, the same type always receives the same symbol, and joining parts are designed to be connected through matching symbols. In addition, the phoneme sequence is marked so that duration and pitch can be adjusted per syllable or per phoneme after the detailed usage parts of the synthesis units have been joined.
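The marking rules in this paragraph suggest a simple data structure: each synthesis unit carries its phoneme sequence and a set of marks, each mark pairing a symbol with a position in the waveform. The sketch below is an illustration under assumptions: the class and field names, the use of sample indices as positions, and the `can_join` check are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mark:
    symbol: str  # hypothetical label of a usage-part boundary
    sample: int  # assumed position: sample index in the unit's waveform


@dataclass
class SynthesisUnit:
    phonemes: List[str]  # phoneme sequence, kept for later duration/pitch control
    marks: List[Mark] = field(default_factory=list)

    def add_mark(self, symbol: str, sample: int) -> None:
        """Add a mark; a symbol may appear only once within one unit."""
        if any(m.symbol == symbol for m in self.marks):
            raise ValueError(f"symbol {symbol!r} already used in this unit")
        self.marks.append(Mark(symbol, sample))

    def has_symbol(self, symbol: str) -> bool:
        return any(m.symbol == symbol for m in self.marks)


def can_join(left: SynthesisUnit, right: SynthesisUnit, symbol: str) -> bool:
    """Two units can be joined at a symbol only if both carry that symbol."""
    return left.has_symbol(symbol) and right.has_symbol(symbol)
```

Keeping the phoneme sequence alongside the marks mirrors the requirement that duration and pitch remain adjustable per syllable or phoneme after the parts are joined.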

Accordingly, the present invention, carried out by the procedure described above, makes it possible to develop a high-quality unrestricted Korean text-to-speech converter whose synthesized speech is much more intelligible and natural. It can be applied to media conversion in information and communication services, to speech reading machines for people with disabilities, to speech correction devices, and the like, thereby enabling high-quality media conversion and helping people with disabilities adapt socially.

Claims (1)

A method of constructing synthesis units (CDUs) for high-quality Korean text-to-speech conversion, comprising: a first step of selecting the syllable types and syllable items of the synthesis units on the basis of an analysis of Korean phonological structure, phoneme connection constraints, and prosodic patterns; a second step of determining, for each synthesis unit, the type of detailed usage part according to the syllable's context of occurrence, the number of syllables, vowel length, and intonation type; and a third step of marking the detailed usage parts on the speech waveform of each synthesis unit, using the same symbol for the same type, and marking the phoneme sequence so that the duration and pitch of each syllable or phoneme can be adjusted after the detailed usage parts of the synthesis units are joined.

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019930030028A KR970003093B1 (en) 1993-12-27 1993-12-27 Synthesis unit drawing-up method for high quality korean text to speech transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019930030028A KR970003093B1 (en) 1993-12-27 1993-12-27 Synthesis unit drawing-up method for high quality korean text to speech transformation

Publications (2)

Publication Number Publication Date
KR950020391A (en) 1995-07-24
KR970003093B1 (en) 1997-03-14

Family

ID=19373042

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019930030028A KR970003093B1 (en) 1993-12-27 1993-12-27 Synthesis unit drawing-up method for high quality korean text to speech transformation

Country Status (1)

Country Link
KR (1) KR970003093B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7805307B2 (en) 2003-09-30 2010-09-28 Sharp Laboratories Of America, Inc. Text to speech conversion system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100564740B1 (en) * 2002-12-14 2006-03-27 한국전자통신연구원 Voice synthesizing method using speech act information and apparatus thereof

Also Published As

Publication number Publication date
KR950020391A (en) 1995-07-24

Legal Events

Date Code Title Description
A201 Request for examination
N231 Notification of change of applicant
G160 Decision to publish patent application
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20100226

Year of fee payment: 14

LAPS Lapse due to unpaid annual fee