KR920005509B1 - Natural sound synthesizer by adding noise - Google Patents

Info

Publication number
KR920005509B1
Authority
KR
South Korea
Prior art keywords
data
rom
digital
signal processor
program
Prior art date
Application number
KR1019890015829A
Other languages
Korean (ko)
Other versions
KR910008647A (en)
Inventor
이윤근
Original Assignee
주식회사 금성사
이헌조
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 금성사, 이헌조
Priority to KR1019890015829A
Publication of KR910008647A
Application granted
Publication of KR920005509B1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04: Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Analogue/Digital Conversion (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The natural sound synthesizer adds noise to the impulse excitation to produce more natural-sounding speech. The synthesizer includes an interface unit (9) through which a digital signal processor (4) receives Hangul character data entered on a keyboard, a voice data ROM (7), a program ROM (8), an address decoder (5) that selects the voice data ROM (7) or the program ROM (8) by decoding the data selection signal (DS) and the program selection signal (PS), a buffer (6) that buffers the address signals (A0-A15) sent from the DSP to address the ROMs (7, 8), and a digital/analog converter (3) that converts the digital sound signal from the DSP into an analog signal.

Description

Natural Sound Synthesizer by Noise Addition

FIG. 1 is a circuit diagram of the natural sound synthesizer of the present invention.

FIG. 2 is a signal flow diagram of the present invention.

* Explanation of reference numerals for the main parts of the drawings

1: speaker 2: amplifier

3: digital/analog converter 4: digital signal processor

5: address decoder 6: buffer

7: voice data ROM 8: program ROM

9: interface unit 10: personal computer

11: keyboard 12: monitor

The present invention relates to speech synthesis, and in particular to a natural sound synthesizer by noise addition. In the manner of a real vocal sound source, it generates more natural speech for voiced sounds by adding to the impulse a noise component analyzed from the residual signal, which consists of an impulse component and a noise component.

In conventional synthesizers that use linear predictive coding (LPC), unvoiced sounds were generated from white noise and voiced sounds from impulses. That is, the voiced sound generated by the impulse and the unvoiced sound generated by the random noise generator were selected by a voiced/unvoiced selection switch and then, in the case of voiced sound, filtered by the linear predictive coding coefficients, so that a synthesized sound resembling the human voice was output through the speaker. Because the voiced sound was generated from impulses alone, however, the output sounded mechanical.
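A minimal Python sketch of the conventional two-state excitation described above. The function name, the unit pulse amplitude, the default pitch period of 20 samples, and the Gaussian noise source are illustrative assumptions, not details taken from the patent.

```python
import random

def conventional_excitation(n_samples, voiced, pitch_period=20, seed=0):
    """Return an excitation frame chosen by a hard voiced/unvoiced switch."""
    if voiced:
        # Voiced: a pure impulse train, one unit pulse per pitch period.
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    # Unvoiced: white (Gaussian) noise from a seeded generator.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

voiced_frame = conventional_excitation(40, voiced=True)
unvoiced_frame = conventional_excitation(40, voiced=False)
```

The voiced branch contains no noise at all, which is the property the text identifies as the cause of the mechanical sound.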

To remedy this problem, the present invention surrounds a digital signal processor with a voice data ROM, a program ROM, an address decoder, an interface unit, and a digital/analog converter, synthesizes speech for input Hangul characters from pre-stored voice data, and outputs it through the digital/analog converter and an amplifier. It is described in detail below with reference to the accompanying drawings.

FIG. 1 is a circuit diagram of the natural sound synthesizer of the present invention. As shown, it comprises: an interface unit (9) that interfaces the key signal of a personal computer (10), to which Hangul characters selected on a keyboard (11) are input; a digital signal processor (4) that executes the program stored in the program ROM (8) according to the data input from the interface unit (9), reads the corresponding data from the voice data ROM (7) over the data lines (D0-D15), analyzes and synthesizes it, and outputs the result back onto the data lines (D0-D15); a digital/analog converter (3) that receives the data synthesized by the digital signal processor (4) over the data lines (D0-D15), converts it into an analog signal, and outputs it to the speaker (1) through the amplifier (2); an address decoder (5) that decodes the data selection signal (DS) and the program selection signal (PS) of the digital signal processor (4) to select the voice data ROM (7) or the program ROM (8); and a buffer (6) that buffers and amplifies the address signals (A0-A15) of the digital signal processor (4) to address the voice data ROM (7) and the program ROM (8). Reference numeral 12, not otherwise described, denotes a monitor.

The operation and effects of the present invention configured as above are as follows.

A speech signal is the sound source signal as modified on its way through the vocal tract. The modification is determined by the shape of the vocal tract, and the coefficients that represent it are the linear predictive coding coefficients. The sound source signal itself consists of an impulse component and a noise component, and it corresponds to the residual signal.

That is, passing the sound source signal through a filter built from the linear prediction coefficients that represent the vocal tract characteristics produces the speech signal; conversely, inverse-filtering the speech signal yields the residual signal.
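The filter and inverse-filter relationship can be sketched in Python as follows. This assumes the common all-pole LPC form s[n] = e[n] + sum of a[k] * s[n-1-k]; the patent does not state its filter equations, and the function names and the second-order coefficients are illustrative only.

```python
def lpc_synthesize(excitation, a):
    """All-pole synthesis filter: s[n] = e[n] + sum_k a[k] * s[n-1-k]."""
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * speech[n - 1 - k]
        speech.append(acc)
    return speech

def lpc_inverse_filter(speech, a):
    """Inverse (analysis) filter: e[n] = s[n] - sum_k a[k] * s[n-1-k]."""
    residual = []
    for n, s in enumerate(speech):
        predicted = 0.0
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                predicted += ak * speech[n - 1 - k]
        residual.append(s - predicted)
    return residual

coeffs = [1.3, -0.6]                      # illustrative stable 2nd-order predictor
source = [1.0] + [0.0] * 7                # a single impulse as the sound source
speech = lpc_synthesize(source, coeffs)   # source shaped by the "vocal tract"
residual = lpc_inverse_filter(speech, coeffs)  # recovers the source
```

Inverse-filtering with the same coefficients returns the original source, which is why the residual of real speech can be analyzed as an estimate of the sound source.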

Consequently, generating sound from impulses alone while ignoring the noise yields a mechanical sound of poor quality. This system therefore analyzes the residual signal to obtain the energy of its noise component and, during synthesis, mixes noise of the appropriate energy into the impulse train to produce a synthesized sound closer to real speech.

Building on this process, the invention is described below with reference to the circuit diagram and to FIG. 2, the signal flow diagram.

The hardware of this system is built around the digital signal processor (4). The program shown in FIG. 2 is stored in the program ROM (8), and the digital signal processor (4) executes it sequentially.

The voice data ROM (7) stores the linear predictive coding (LPC) coefficients, pitch, and energy data for each phoneme, and the program ROM (8) stores the synthesis algorithm. When Hangul character data entered on the keyboard (11) reaches the data lines (D0-D15) of the digital signal processor (4) through the personal computer (10) and the interface unit (9), the digital signal processor (4) converts the spelling into its pronounced form according to phonological conversion rules and divides the result into voiced and unvoiced sounds. It then outputs the program selection signal (PS), which the address decoder (5) decodes to select the program ROM (8); at the same time the address signals (A0-A15) designate an address (A) in the program ROM (8) through the buffer (6), and the corresponding data is read over the data lines (D0-D15). Next it outputs the data selection signal (DS), which the address decoder (5) decodes to select the voice data ROM (7); at the same time the address signals (A0-A15) designate an address (A) in the voice data ROM (7) through the buffer (6), and the voice data stored at that address is read out and used to synthesize the desired speech, which is output over the data lines (D0-D15). The synthesized speech signal is then converted into an analog signal by the digital/analog converter (3), amplified by the amplifier (2), and output to the speaker (1).
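The ROM selection performed by the address decoder (5) can be modeled with a short Python sketch. The active-high treatment of PS and DS, the dictionary-backed ROM contents, and every name below are assumptions made for illustration; the patent does not give signal polarities or a memory map.

```python
PROGRAM_ROM = "program_rom"
VOICE_DATA_ROM = "voice_data_rom"

def decode_select(ps, ds):
    """Model of the address decoder: pick a ROM from the PS/DS select signals."""
    if ps and ds:
        raise ValueError("PS and DS must not be asserted together")
    if ps:
        return PROGRAM_ROM
    if ds:
        return VOICE_DATA_ROM
    return None  # neither ROM is selected

def bus_read(roms, ps, ds, address):
    """Read one 16-bit word (D0-D15) from the ROM selected for this bus cycle."""
    selected = decode_select(ps, ds)
    if selected is None:
        return None
    # The buffer passes the 16 address bits A0-A15 on to both ROMs.
    return roms[selected][address & 0xFFFF] & 0xFFFF

# Hypothetical ROM contents, one word each.
roms = {PROGRAM_ROM: {0x0000: 0x1234}, VOICE_DATA_ROM: {0x0000: 0xBEEF}}
instruction = bus_read(roms, ps=True, ds=False, address=0x0000)
voice_word = bus_read(roms, ps=False, ds=True, address=0x0000)
```

Both ROMs see the same address lines, so the select signals alone determine which device drives the shared data lines.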

Here, the digital signal processor (4) outputs a frequency pulse signal (XF) to the digital/analog converter (3); the digital/analog converter (3) generates an interrupt signal (INT) that is applied to the digital signal processor (4); and the digital signal processor (4) generates an interrupt acknowledge signal (IACK) that is applied to the digital/analog converter (3).

Regarding the sound source of the voice data, let AV be the fraction of the sound source made up of impulse and AN the fraction made up of noise, so that AV + AN = 1. Even for voiced sounds, adding noise in the proportion AN yields a more natural sound than conventional impulse-only synthesis. AV and AN are determined as follows. Inverse-filtering the speech signal with the linear predictive coding coefficients yields the residual signal. For voiced sounds the sound source is not a pure impulse but an impulse with added noise, so the energy ratio of impulse to noise within one frame can be computed from the residual signal.

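[Equation (image kpo00001 in the original publication, not reproduced here): the impulse-to-noise energy ratio expressed in terms of EV and En]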

(Here, EV is the energy of the impulse and En is the energy of the noise between one impulse and the next.)
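One way to compute the two fractions from a residual frame is sketched below in Python. The equation image is not available in this text, so the normalization AV = EV / (EV + En), AN = En / (EV + En) is an assumption chosen to satisfy AV + AN = 1. The function name, the known pulse positions, and the `half_width` window are likewise hypothetical.

```python
def impulse_noise_fractions(residual, pulse_positions, half_width=0):
    """Split one frame's residual energy into impulse (EV) and noise (En) shares."""
    # Samples within half_width of a pitch pulse count as impulse samples.
    pulse_samples = set()
    for p in pulse_positions:
        for i in range(p - half_width, p + half_width + 1):
            if 0 <= i < len(residual):
                pulse_samples.add(i)
    e_v = sum(residual[i] ** 2 for i in pulse_samples)
    e_n = sum(x ** 2 for i, x in enumerate(residual) if i not in pulse_samples)
    total = e_v + e_n
    if total == 0.0:
        return 1.0, 0.0  # silent frame: arbitrary convention for this sketch
    return e_v / total, e_n / total  # (AV, AN), assumed normalization

# Illustrative residual frame with pitch pulses at samples 0 and 4.
frame = [2.0, 0.5, -0.5, 0.5, 2.0, -0.5, 0.5, -0.5]
av, an = impulse_noise_fractions(frame, pulse_positions=[0, 4])
```

In this example the two pulses carry 8.0 of the 9.5 units of frame energy, so AV is 8/9.5 and AN is 1.5/9.5.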

This value is stored in the voice data ROM (7); during synthesis, mixing noise into the sound source at this ratio produces a more natural synthesized sound.
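A minimal Python sketch of building such a mixed sound source for one voiced frame. It assumes the stored value is the impulse energy fraction AV (with AN = 1 - AV), that noise is placed only between pulses, and that the frame energy is split exactly in that ratio; the function name, the Gaussian noise source, and the parameters are illustrative, not taken from the patent.

```python
import math
import random

def mixed_excitation(n_samples, pitch_period, av, energy=1.0, seed=0):
    """Impulse train plus noise, with frame energy split AV : (1 - AV)."""
    if not 0.0 <= av <= 1.0:
        raise ValueError("av must lie in [0, 1]")
    an = 1.0 - av
    pulse_idx = list(range(0, n_samples, pitch_period))
    # Each pulse gets an equal share of the impulse energy AV * energy.
    pulse_amp = math.sqrt(av * energy / len(pulse_idx))
    rng = random.Random(seed)
    # Noise occupies only the samples between pulses.
    noise = [0.0 if i % pitch_period == 0 else rng.gauss(0.0, 1.0)
             for i in range(n_samples)]
    noise_energy = sum(x * x for x in noise)
    # Rescale the noise so its energy is exactly AN * energy.
    gain = math.sqrt(an * energy / noise_energy) if noise_energy > 0 else 0.0
    excitation = [gain * x for x in noise]
    for i in pulse_idx:
        excitation[i] += pulse_amp
    return excitation

exc = mixed_excitation(n_samples=80, pitch_period=20, av=0.8)
```

Setting `av=1.0` reduces this to the conventional pure impulse train, so the stored ratio acts as a per-frame control on how much noise is blended in.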

Thus, in the manner of a real vocal sound source, adding to the impulse the amount of noise determined from the residual signal gives voiced sounds a more natural synthesized quality.

Claims (1)

A natural sound synthesizer by noise addition, characterized by comprising, around a digital signal processor (4): an interface unit (9) that interfaces the Hangul character data of a personal computer (10) produced according to the key signal of a keyboard (11); a voice data ROM (7) in which voice data is stored and a program ROM (8) in which a program is stored; an address decoder (5) that decodes the data selection signal (DS) and the program selection signal (PS) of the digital signal processor (4) to select the voice data ROM (7) or the program ROM (8); a buffer (6) that buffers and amplifies the address signals (A0-A15) of the digital signal processor (4) to address the voice data ROM (7) and the program ROM (8); and a digital/analog converter (3) that converts the output data of the digital signal processor (4) into an analog signal and outputs it to a speaker (1) through an amplifier (2); wherein energy, pitch, and linear predictive coding coefficients are stored in the voice data ROM (7), and wherein, when Hangul character data is output from the interface unit (9) and input to the digital signal processor (4), the digital signal processor (4) reads the data stored in the program ROM (8), reads the corresponding data stored in the voice data ROM (7) according to that data, synthesizes speech by adding noise to the impulse in the impulse-to-noise proportion obtained by residual signal analysis during speech analysis, and outputs the synthesized data to the digital/analog converter (3).
KR1019890015829A 1989-10-31 1989-10-31 Natural sound synthesizer by adding noise KR920005509B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019890015829A KR920005509B1 (en) 1989-10-31 1989-10-31 Natural sound synthesizer by adding noise

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019890015829A KR920005509B1 (en) 1989-10-31 1989-10-31 Natural sound synthesizer by adding noise

Publications (2)

Publication Number Publication Date
KR910008647A KR910008647A (en) 1991-05-31
KR920005509B1 KR920005509B1 (en) 1992-07-06

Family

ID=19291267

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019890015829A KR920005509B1 (en) 1989-10-31 1989-10-31 Natural sound synthesizer by adding noise

Country Status (1)

Country Link
KR (1) KR920005509B1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100497354B1 (en) * 2002-05-07 2005-06-23 삼성전자주식회사 Digital audio system

Also Published As

Publication number Publication date
KR910008647A (en) 1991-05-31

Similar Documents

Publication Publication Date Title
US4912768A (en) Speech encoding process combining written and spoken message codes
US4304965A (en) Data converter for a speech synthesizer
JPH0573100A (en) Method and device for synthesising speech
US5321794A (en) Voice synthesizing apparatus and method and apparatus and method used as part of a voice synthesizing apparatus and method
KR920005509B1 (en) Natural sound synthesizer by adding noise
KR920008259B1 (en) Korean language synthesizing method
US4633500A (en) Speech synthesizer
KR920005508B1 (en) Chorus synthesizing circuit using linear predictive coding
JPH05165500A (en) Voice coding method
KR920003934B1 (en) Complex coding method of voice synthesizer
JP2703253B2 (en) Speech synthesizer
WO2023182291A1 (en) Speech synthesis device, speech synthesis method, and program
JP2806047B2 (en) Automatic transcription device
JPS6187199A (en) Voice analyzer/synthesizer
JPH1031496A (en) Musical sound generating device
Yazu et al. The speech synthesis system for an unlimited Japanese vocabulary
JPH10105200A (en) Voice coding/decoding method
JP2907828B2 (en) Voice interactive document creation device
JPH02236600A (en) Circuit for giving emotion of synthesized voice information
JPH03100700A (en) Voice synthetic singing apparatus
KR0136095B1 (en) Accent processing method for voice synthesizing device
JPS63285597A (en) Phoneme connection type parameter rule synthesization system
JPH01262598A (en) Utterance speed control circuit for voice synthesizing device
JPH037997A (en) Voice synthesizing and singing device
JPH0594199A (en) Residual driving type speech synthesizing device

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
G160 Decision to publish patent application
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 19961230

Year of fee payment: 6

LAPS Lapse due to unpaid annual fee