KR20050080671A

KR20050080671A - Emoticon processing method for text to speech system

Info

Publication number: KR20050080671A
Application number: KR1020040008771A
Authority: KR
Inventors: 강용성
Original assignee: 엘지전자 주식회사
Priority date: 2004-02-10
Filing date: 2004-02-10
Publication date: 2005-08-17
Also published as: CN1655231A; CN1292400C

Abstract

The present invention relates to a technique for outputting the corresponding pronunciation of an emoticon when one is found while a TTS system converts a character string into a speech signal. The invention is achieved by: a first step of performing linguistic processing to convert a character string input to a TTS engine into a speech signal, and outputting any emoticon contained in the string as its corresponding pronunciation; a second step of determining intonation, sound duration, and the like in order to set the prosody of the sentence converted into the speech signal, and adjusting the prosody appropriately according to the emotion the emoticon expresses; and a third step of outputting the prosody-adjusted speech signal to the outside.

Description

EMOTICON PROCESSING METHOD FOR TEXT TO SPEECH SYSTEM

The present invention relates to a technique for processing emoticons in a text-to-speech (TTS) system, and in particular to an emoticon processing method for a TTS system that allows the TTS engine, when it finds an emoticon while converting a character string into a speech signal, to output the pronunciation corresponding to that emoticon.

A TTS system converts character strings into human speech; its basic purpose is to let a person listen to text rather than read it. TTS technology is closer to commercialization than speech recognition and can be used in services that convert various kinds of text information into speech. Now that e-mail is commonplace, it is thanks to TTS technology that a user can call in by telephone from outside and have newly arrived mail read aloud. TTS technology also makes it possible to listen to sentences typed into a word processor or to HTML documents displayed by a web browser, and for visually impaired users it converts information on the Internet into speech and reads it aloud, so that they can obtain useful information no less readily than sighted users. Recently, technology has been developed that goes beyond the machine-like synthetic speech of the past and produces synthetic speech resembling the human voice, and services using TTS technology are steadily expanding to the general public.

However, the languages people use are living things that change from moment to moment, and now that people communicate with one another in text over all kinds of networks, the pace of that change is steadily accelerating.

In recent years, emoticons have been used more and more frequently in computer communication. The word "emoticon" is a blend of "emotion" and "icon"; an emoticon expresses the writer's feelings or intent and is built by combining symbols and characters available on the keyboard. For example, a smiling face can be written as :) or :-), which shows a smiling face when viewed with the head tilted to the left. S. Fahlman of Carnegie Mellon University is known to have been the first to use it, in the 1980s. Emoticons bring a softer, more playful tone to computer communication, which easily becomes stiff, and make communication that passes from machine to machine gentler and more human.

However, conventional TTS systems convert only ordinary characters into speech and treat emoticons as mere punctuation or meaningless symbols, which makes it difficult to convey the full content of a document to the user.

Accordingly, an object of the present invention is to have the TTS engine, when it finds an emoticon while converting a character string into a speech signal, output the corresponding pronunciation by using an emoticon pronunciation dictionary.

The emoticon processing method for a TTS system according to the present invention comprises: a first step of performing text sentence processing, parsing, non-Hangul character processing, morphological analysis and syntactic analysis, conversion to phonetic transcription, and similar operations in order to convert a character string input to the TTS engine into a speech signal, and outputting any emoticon contained in the string as its corresponding pronunciation; a second step of determining intonation, sound duration, and the like in order to set the prosody of the sentence converted into the speech signal, and adjusting the prosody appropriately according to the emotion the emoticon expresses; and a third step of generating the actual speech signal with reference to a speech database (DB), then applying D/A conversion and amplification to that signal. The emoticon processing of the present invention is described in detail below with reference to the accompanying Fig. 1 and Fig. 2.

When a character string is input to the text input unit (1) of the TTS engine from an external device or internal memory, the linguistic processing unit (2), in order to convert it into a speech signal, consults the data in the number/abbreviation symbol dictionary, part-of-speech dictionary, and pronunciation dictionary of the dictionary unit (6) and performs text sentence processing, parsing, non-Hangul character processing, morphological analysis and syntactic analysis, conversion to phonetic transcription, and similar operations.

At this point, the linguistic processing unit (2) uses the emoticon pronunciation dictionary (7) to recognize any emoticon contained in the character string and, instead of treating it as a mere symbol, composes the output from the pronunciation stored for it in the emoticon pronunciation dictionary (7).

For reference, Fig. 2 shows example pronunciations for the emoticons recorded in the emoticon pronunciation dictionary (7). For example, emoticons such as ^^, ^_^, :), ^o^, ^_^ are pronounced "웃습니다" ("smiling"). As another example, the emoticons -.-, - -, -.- are pronounced "무표정합니다" ("expressionless").
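The lookup in the emoticon pronunciation dictionary (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `EMOTICON_PRONUNCIATIONS` and `replace_emoticons` are hypothetical, the entries are only the examples quoted from Fig. 2, and longest-match-first scanning is an assumption made so that a longer emoticon is never split into shorter ones.

```python
import re

# Hypothetical stand-in for the emoticon pronunciation dictionary (7).
# Only the examples quoted from Fig. 2 are included, not the full table.
EMOTICON_PRONUNCIATIONS = {
    "^^": "웃습니다",
    "^_^": "웃습니다",
    ":)": "웃습니다",
    "^o^": "웃습니다",
    "-.-": "무표정합니다",
    "- -": "무표정합니다",
}

# Try longer emoticons first (an assumption) so "^_^" is matched whole.
_EMOTICON_PATTERN = re.compile(
    "|".join(
        re.escape(emoticon)
        for emoticon in sorted(EMOTICON_PRONUNCIATIONS, key=len, reverse=True)
    )
)


def replace_emoticons(text: str) -> str:
    """Replace every known emoticon in `text` with its dictionary pronunciation."""
    return _EMOTICON_PATTERN.sub(
        lambda match: EMOTICON_PRONUNCIATIONS[match.group(0)], text
    )
```

Text without any known emoticon passes through unchanged, so a step like this could run ahead of the ordinary pronunciation conversion without affecting plain sentences.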

Next, the prosody processing unit (3) determines intonation, sound duration, and the like in order to set the prosody with which the sentence is spoken, and at this point adjusts the prosody appropriately according to the emotion the emoticon expresses.
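The text does not say how the prosody is adjusted, so the following is only a sketch of one possible scheme: each emotion maps to multipliers applied to a baseline pitch and duration. The `Prosody` type, the emotion labels, and every numeric factor are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Prosody:
    pitch_hz: float        # baseline intonation (fundamental frequency)
    duration_scale: float  # multiplier applied to sound durations


# Hypothetical emotion -> (pitch multiplier, duration multiplier).
# These values are illustrative assumptions, not taken from the patent.
EMOTION_ADJUSTMENTS = {
    "smiling": (1.15, 0.95),
    "expressionless": (0.95, 1.05),
}


def adjust_prosody(base: Prosody, emotion: Optional[str]) -> Prosody:
    """Return a copy of `base` adjusted for `emotion`; unknown emotions leave it unchanged."""
    pitch_mul, duration_mul = EMOTION_ADJUSTMENTS.get(emotion, (1.0, 1.0))
    return replace(
        base,
        pitch_hz=base.pitch_hz * pitch_mul,
        duration_scale=base.duration_scale * duration_mul,
    )
```

Keeping the adjustments in a table mirrors the dictionary-driven design of the emoticon lookup: supporting a new emotion would mean adding a row rather than changing code.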

The speech signal processing unit (4) then generates the actual speech signal with reference to the speech database (8), in which actual speech data is stored, and the speech signal output unit (5) performs D/A conversion so that the signal is audible and amplifies it to a suitable level for output.
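D/A conversion itself is done in hardware, but the "amplify to a suitable level" step can be illustrated in the digital domain. The sketch below assumes 16-bit signed PCM samples, which the text does not specify, and uses a hypothetical `amplify` function that applies a gain and clips to the valid sample range.

```python
from typing import List

# Assumed sample format: 16-bit signed PCM (not stated in the source).
INT16_MIN, INT16_MAX = -32768, 32767


def amplify(samples: List[int], gain: float) -> List[int]:
    """Scale each sample by `gain`, clipping to the 16-bit signed range."""
    amplified = []
    for sample in samples:
        scaled = int(round(sample * gain))
        # Clip instead of letting the value wrap around, which would distort audibly.
        amplified.append(max(INT16_MIN, min(INT16_MAX, scaled)))
    return amplified
```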

As described in detail above, when the TTS engine finds an emoticon while converting a character string into a speech signal, the present invention outputs the corresponding pronunciation by using the emoticon pronunciation dictionary, so that text containing emoticons can be converted into speech with its content conveyed intact.

Fig. 1 is a system block diagram for the emoticon processing method according to the present invention.

Fig. 2 is a table of example pronunciations from the emoticon pronunciation dictionary of Fig. 1.

*** Description of reference numerals for the main parts of the drawings ***

1: text input unit
2: linguistic processing unit
3: prosody processing unit
4: speech signal processing unit
5: speech signal output unit
6: dictionary unit
7: emoticon pronunciation dictionary
8: speech database

Claims

An emoticon processing method for a TTS system, comprising: a first step of performing linguistic processing to convert a character string input to a TTS engine into a speech signal, and outputting any emoticon contained in the string as its corresponding pronunciation; and a second step of determining intonation, sound duration, and the like in order to set the prosody of the sentence converted into the speech signal, and adjusting the prosody appropriately according to the emotion the emoticon expresses.

The method of claim 1, wherein the text string input to the TTS engine is provided from an external device or an internal memory.

The method of claim 1, wherein the first step comprises recognizing an emoticon contained in the character string by using an emoticon pronunciation dictionary, and composing the output from the pronunciation stored in the emoticon pronunciation dictionary.

The method of claim 3, wherein the emoticon pronunciation dictionary stores a pronunciation corresponding to each emoticon.