KR20030000314A

KR20030000314A - Telephone Message Memo System Using Automatic Speech Recognition

Info

Publication number: KR20030000314A
Application number: KR1020010036043A
Authority: KR
Inventors: 박인표
Original assignee: 박인표
Priority date: 2001-06-23
Filing date: 2001-06-23
Publication date: 2003-01-06

Abstract

PURPOSE: A system for taking notes of call contents using voice recognition is provided that recognizes the speech of a call transmitted and received through a telephone set and outputs the contents on a printer or a display, so that taking a memo by hand is unnecessary. CONSTITUTION: A voice communication terminal (10) has a voice receiving part and a voice transmitting part. A voice recognition unit (20) recognizes voice signals from the voice transmitting part or the voice receiving part and converts the recognized voice recognition data into character data. An output unit (30) outputs the character data acquired by the voice recognition unit (20) and includes a printer (30A) or a monitor (30B). The voice recognition may be provided through Internet communication with a voice recognition server (50) having the voice recognition engine of the voice recognition unit (20). A recognition switch (40) selectively switches the voice recognition operation of the voice recognition unit (20) on and off.

Description

Phone Message Memo System Using Automatic Speech Recognition

(Technical Field)

The present invention relates to a telephone call memo system using voice recognition, and more particularly to a telephone call memo system that recognizes the speech of a call transmitted and received through a telephone and outputs it on a printer or a display, so that the call contents no longer need to be noted down with a pen.

(Background Art)

The convenience of exchanging information between people and machines has improved considerably with the development of input means such as switches, keyboards, mice, and touch screens, but it is still far less convenient than conversation between people.

Accordingly, the MMI (Man Machine Interface) for faster and more convenient exchange of information between people and machines has grown in importance, and as research on voice recognition has been actively pursued since the mid-1970s, MMI technology has gone beyond mere voice recognition to the stage of grasping the meaning of speech and performing the required function.

In general, voice recognition refers to a technique that extracts features from a given speech signal and applies a pattern recognition algorithm to them in order to trace back which phoneme sequence the speaker uttered to produce that signal.

Voice recognition technology can be classified in various ways depending on the criterion. A typical classification, according to the speakers to be recognized, distinguishes speaker-dependent systems, speaker-independent systems, and speaker-adaptive systems.

A speaker-dependent system recognizes the voice of a specific speaker; a typical example is the voice dialing system built into mobile phones. In a speaker-dependent system, the user's voice is generally stored and registered before the system is used, and during actual recognition the pattern of the input speech is compared with the stored speech patterns.
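The comparison of an input pattern with stored patterns can be illustrated with a minimal Python sketch. It is not taken from the patent, which names no matching algorithm: dynamic time warping (DTW) is assumed here as one classic template-matching technique, and `dtw_distance`, `SpeakerDependentRecognizer`, and the one-dimensional toy features are hypothetical.

```python
from math import inf
from typing import Dict, List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


class SpeakerDependentRecognizer:
    """Stores one template per word for a single enrolled speaker (assumption)."""

    def __init__(self) -> None:
        self.templates: Dict[str, List[float]] = {}

    def enroll(self, word: str, features: Sequence[float]) -> None:
        # Registration step performed before the system is used.
        self.templates[word] = list(features)

    def recognize(self, features: Sequence[float]) -> str:
        # Recognition step: return the word whose template is closest.
        if not self.templates:
            raise ValueError("no templates enrolled")
        return min(self.templates, key=lambda w: dtw_distance(features, self.templates[w]))
```

In this sketch a slower utterance of an enrolled word still matches its template, because DTW tolerates differences in speaking rate.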

A speaker-independent system recognizes the voices of an unspecified number of speakers. Unlike a speaker-dependent system, it spares the user the trouble of registering a voice before operation, but it involves collecting speech from many speakers, training a statistical model, and performing recognition with the trained model. As a result, the characteristics peculiar to each speaker fade and the characteristics common to all speakers are emphasized.

A speaker-adaptive system is built as a speaker-independent system but, in actual use, adapts the recognition model to the user's voice, thereby compensating for the shortcomings of a speaker-independent system.

Voice recognition technology is also divided, according to the size of the vocabulary to be recognized, into small-capacity (small-vocabulary) systems and large-capacity (large-vocabulary) systems; a small-capacity system usually has a vocabulary of a few hundred words or fewer, whereas a large-capacity system has a vocabulary of several thousand to several hundred thousand words. According to the manner of pronunciation, it is further divided into isolated-word recognition systems, which assume that each word is pronounced distinctly with a sufficiently long silent interval between words, and continuous speech recognition systems, which perform recognition sentence by sentence.
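The silent-interval assumption behind isolated-word recognition can be sketched as a simple energy-based segmenter. This is an illustrative assumption, not a method described in the patent; `split_on_silence`, the threshold, and the minimum silence length are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    energies: Sequence[float], threshold: float = 0.1, min_silence: int = 3
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, of words separated by silence."""
    segments: List[Tuple[int, int]] = []
    start = None      # first frame of the word currently being tracked
    silent_run = 0    # consecutive silent frames seen inside that word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # The pause is long enough: close the word before the pause.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        segments.append((start, len(energies) - silent_run))
    return segments
```

A pause shorter than `min_silence` frames is treated as part of the word, which is why such a system needs words to be spoken with clear gaps between them.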

Examples of voice recognition technology currently in use include voice-dialing mobile phones, voice-operated personal computers (voice interface technology for computers), voice-recognition automobiles, information services for stock quotes, telephone numbers, local information and the like, voice-recognition unattended switchboards, voice control of robots, automatic interpretation systems, and dictation systems that turn speech into text files. As voice recognition technology becomes more widespread, it is expected to bring great changes to people's cultural lives.

When using the telephone, one of the most widely used necessities of daily life, it is often necessary to note down important points during a conversation with the other party. Such call memos are even more essential when businesses of all kinds, including restaurants, take customer orders (delivery orders). Until now, however, the only way to take a call memo has been to summarize the call with a pen on paper (a memo pad) or an order form kept beside the telephone.

An object of the present invention is to eliminate the inconvenience of call memos described above by having the call contents that need to be noted recorded automatically, by printing, on-screen display, or the like, without handwriting by the caller.

FIG. 1 is an exemplary system configuration diagram of a telephone call memo system using voice recognition according to the present invention;

FIG. 2 is an exemplary block diagram of the system according to the present invention.

<Description of Reference Numerals for Main Parts of the Drawings>

1: system of the present invention; 10: voice communication terminal

11: transmitter; 12: receiver

20: voice recognition means; 30: output means

30A: printer; 30B: monitor

40: recognition switch; 41: print button

42: delete button; 43: main switch

50: voice recognition server

According to the present invention, there is provided a telephone call memo system using voice recognition that includes a voice communication terminal, voice recognition means, and output means.

Specifically, according to the present invention, there is provided a telephone call memo system using voice recognition, comprising: a conventional voice communication terminal having a transmitter and a receiver; voice recognition means for recognizing at least one of the voice signals from the transmitter and the receiver and converting the recognized voice recognition data into corresponding character data; and output means for outputting the character data obtained by the voice recognition means.

At least one of a printer and a monitor may be used as the output means, and preferably both are used. If necessary, voice synthesis means and a speaker may be added so that the recognized content is output as synthesized speech.

Preferably, the telephone call memo system of the present invention includes a recognition switch with which voice recognition by the voice recognition means can be freely turned on and off, so that, as the caller operates the recognition switch, only the speech that needs to be noted is selected and output through the output means.

Any of the various voice recognition technologies developed to date may be adapted to the purpose of the present invention and used as the voice recognition means. The voice recognition means is basically incorporated integrally into the system of the present invention, but the voice recognition engine, which is the core of the voice recognition means, may instead be provided on a voice recognition server on the Internet so that recognition into text is supported through Internet communication.

Hereinafter, the telephone call memo system using voice recognition according to the present invention is described in detail with reference to the accompanying drawings. The following embodiments merely illustrate the system and are not intended to limit the scope of the present invention.

FIG. 1 is an exemplary system configuration diagram of the telephone call memo system using voice recognition according to the present invention, and FIG. 2 is an exemplary block diagram of the system according to the present invention.

As shown, the telephone call memo system (1) using voice recognition according to the present invention basically includes a voice communication terminal (10), voice recognition means (20), and output means (30).

The voice communication terminal (10) is, as is well known, a conventional wired or wireless telephone (10) that includes a transmitter (11) having a microphone that converts the user's voice into an electrical signal and a receiver (12) having a speaker that converts the voice signal transmitted over the telephone line into sound. It is preferably a wired telephone (10), the type most commonly used in businesses and homes.

According to a feature of the present invention, at least one of the voice signals of the transmitter (11) and the receiver (12) of the telephone (10) is recognized by the voice recognition means (20), which is electrically connected to the telephone (10), and converted into outputtable character data. This conversion of the voice signal into character data by the voice recognition means (20) can be realized by adapting known voice recognition means to the present invention.

For example, the voice signal produced by the microphone of the transmitter (11) and the voice signal supplied to the speaker of the receiver (12) are provided to the voice recognition engine (21) of the voice recognition means (20), and are thereby put into a state in which they can be output as the characters corresponding to the speech. The voice recognition engine (21) may be configured to include an analog-to-digital converter (22) that converts the analog voice signal into a digital voice signal, a feature extraction unit (23) that extracts features from the digital voice signal, a voice pattern recognition unit (24) that recognizes the pattern from the extracted features, a voice pattern storage unit (25) in which reference models for voice recognition are stored, a voice recognition unit (26) that recognizes the speech by comparing the information from the voice pattern recognition unit (24) with the information in the voice pattern storage unit (25), and a character data conversion unit (27) that converts the recognized voice recognition data into character data.
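The chain of units (22) to (27) can be sketched as a small Python pipeline. The patent gives only the block structure, so everything inside each stage is an assumption made for illustration: the 8-bit quantization, the frame-energy feature, the squared-distance comparison, and the names `ad_convert`, `extract_features`, `pattern_distance`, and `RecognitionEngine` are all hypothetical.

```python
from typing import Dict, List, Sequence


def ad_convert(analog: Sequence[float], levels: int = 256) -> List[int]:
    """Unit (22): quantize samples in [-1.0, 1.0] to signed integer codes."""
    half = levels // 2
    return [max(-half, min(half - 1, round(x * half))) for x in analog]


def extract_features(digital: Sequence[int], frame: int = 4) -> List[float]:
    """Unit (23): toy feature, the mean absolute amplitude of each full frame."""
    return [
        sum(abs(s) for s in digital[i:i + frame]) / frame
        for i in range(0, len(digital) - frame + 1, frame)
    ]


def pattern_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared distance between two patterns, zero-padding the shorter one."""
    n = max(len(a), len(b))
    pa = list(a) + [0.0] * (n - len(a))
    pb = list(b) + [0.0] * (n - len(b))
    return sum((x - y) ** 2 for x, y in zip(pa, pb))


class RecognitionEngine:
    """Toy stand-in for the voice recognition engine (21)."""

    def __init__(self, pattern_memory: Dict[str, List[float]], text_table: Dict[str, str]):
        self.pattern_memory = pattern_memory  # unit (25): stored reference models
        self.text_table = text_table          # lookup used by unit (27)

    def recognize(self, analog: Sequence[float]) -> str:
        digital = ad_convert(analog)            # unit (22)
        pattern = extract_features(digital)     # units (23) and (24)
        label = min(                            # unit (26): compare with stored models
            self.pattern_memory,
            key=lambda k: pattern_distance(pattern, self.pattern_memory[k]),
        )
        return self.text_table[label]           # unit (27): label to character data
```

A real engine would use far richer features and statistical models; the sketch only shows how the signal passes from one unit to the next.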

The speech recognized by the telephone call memo system (1) of the present invention may be the voice of the user of the invention entering the microphone of the transmitter (11) (for example, the voice of the person answering at a business), the voice of the other party output from the speaker of the receiver (12) (for example, the voice of a customer ordering goods from the business), or both.

Since the content that needs to be noted is generally found in what both parties to the call say, it is most desirable to recognize and record the voices of both parties. However, reliably recognizing the voices of many different counterparts (customers), with their varied tones, dialects, and speech habits, requires large-capacity voice recognition means (20). To simplify the system (1), it may therefore also be desirable to configure it to recognize only the voice of the user of the invention (for example, the person answering at the business).

In the latter case, where only the voice of the business's call-taker is recognized, a speaker-dependent recognition method is mainly employed and the call-taker's voice is stored and learned, which is advantageous in that a high recognition rate can be maintained even with relatively small-capacity voice recognition means (20). Because the other party's (customer's) voice is not recognized in this case, the call-taker must repeat aloud any points in the other party's speech that need to be noted. In practice, a call-taker receiving an order usually confirms it by repeating it back, for example, "We have received your order for one combination pizza," so building the system (1) so that it does not recognize the other party's (customer's) voice poses no serious problem for its essential function.

The outputtable character data obtained by the voice recognition means (20) is output by the output means (30). The output means (30) used in the present invention is not particularly limited, as long as it can replace memos conventionally taken with pen and paper.

Specifically, a printer (30A) and/or a monitor (30B) that outputs the character data in readable form may be used as the output means (30), and preferably both the printer (30A) and the monitor (30B) are included. In addition, a voice synthesis unit may be added to the voice recognition means (20) and a speaker fitted so that the memo contents are output as synthesized speech.

The system (1) of the present invention may be built by installing the hardware and software of the voice recognition means (20) in a widely used personal computer and connecting the printer (30A) and the monitor (30B) to that personal computer, or by integrating the voice recognition means (20), a suitable printer (30A), and a monitor (30B) such as a liquid crystal display into the body of the telephone (10).

To show both of these configurations, FIG. 1 depicts the personal computer with the printer (30A) and monitor (30B) connected to it, together with the printer (30A) and monitor (30B) integrated into the telephone (10).

In general, whether at a business or at home, what needs to be noted during a call is usually not the entire conversation but a few important points. It is therefore preferable for the system (1) of the present invention to include a recognition switch (40) with which voice recognition by the voice recognition means (20) can be freely and selectively turned on and off, so that, according to the on/off operation of the recognition switch (40), only the speech that needs to be noted is selectively output to the output means (30).

The recognition switch (40) may be installed electrically at any point between the transmitter (11) and receiver (12) of the telephone (10) and the voice recognition means (20) so as to gate voice recognition. It is preferably a push-button switch arranged alongside the other function keys of the telephone (10), so that it can be operated with a light press of a finger while taking a call.
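The gating role of the recognition switch (40) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the call is modeled as a sequence of audio chunks, each paired with the switch state at that moment, and `memo_from_call` and the `recognize` callback are hypothetical names standing in for the voice recognition means (20).

```python
from typing import Callable, Iterable, List, Tuple


def memo_from_call(
    frames: Iterable[Tuple[bool, str]], recognize: Callable[[str], str]
) -> List[str]:
    """Recognize only the chunks that arrive while the switch (40) is on.

    Each frame is (switch_on, audio_chunk); chunks are plain strings here
    purely so the example stays self-contained.
    """
    memo: List[str] = []
    for switch_on, chunk in frames:
        if switch_on:
            memo.append(recognize(chunk))
    return memo
```

Chunks that arrive while the switch is off never reach the recognizer, so greetings and small talk stay out of the memo.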

In addition to the recognition switch (40), the telephone (10) may be given various function buttons for convenient use of the system, such as a print button (41) that causes the printer (30A) to print, and a delete button (42) that deletes content already recognized when that content has become useless and the memo must be taken again (for example, when the customer cancels the initial order during the call and places a different one). Reference numeral 43 denotes a main switch that completely stops the operation of the voice recognition means (20).
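The interplay of the print button (41), delete button (42), and main switch (43) can be sketched as a small memo buffer. The patent does not specify the buffering behavior, so this is an assumption: `MemoPanel` and its methods are hypothetical, the printer is modeled as any callable that accepts text, and the buffer is kept after printing.

```python
from typing import Callable, List


class MemoPanel:
    """Toy model of the function buttons on the telephone (10)."""

    def __init__(self, printer: Callable[[str], None]) -> None:
        self.printer = printer        # stands in for the printer (30A)
        self.main_on = True           # main switch (43)
        self.buffer: List[str] = []   # recognized text awaiting output

    def add_recognized(self, text: str) -> None:
        # Recognized text is accepted only while the main switch is on.
        if self.main_on:
            self.buffer.append(text)

    def press_delete(self) -> None:
        # Delete button (42): discard text that is no longer needed.
        self.buffer.clear()

    def press_print(self) -> None:
        # Print button (41): send the current memo to the printer.
        if self.buffer:
            self.printer("\n".join(self.buffer))

    def set_main(self, on: bool) -> None:
        self.main_on = on
```

For instance, if a customer cancels a first order mid-call, pressing delete before the corrected order is repeated leaves only the corrected order to print.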

The above describes an example in which the voice recognition engine (21), the core component of the voice recognition means (20), is physically integrated into the system (1) of the present invention. Alternatively, the voice recognition engine (21) may be left out of the system (1), and recognition may be supported remotely through Internet communication with a voice recognition server (50) equipped with the voice recognition engine (21). In this case, when the caller operates the recognition switch (40), the corresponding voice signal is transmitted over the Internet to the voice recognition server (50), where voice recognition is performed, and the resulting character data is transmitted back to the caller and output by the output means (30).
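The round trip to the voice recognition server (50) can be sketched as a pair of messages. The patent specifies no protocol, so the JSON layout, the field names `sample_rate`, `audio`, and `text`, and the functions `build_request`, `handle_request`, and `parse_response` are all hypothetical; the network transport itself is left out so that the example stays self-contained.

```python
import base64
import json
from typing import Callable


def build_request(audio: bytes, sample_rate: int = 8000) -> str:
    """Client side: wrap the gated voice signal in a JSON message."""
    return json.dumps(
        {"sample_rate": sample_rate, "audio": base64.b64encode(audio).decode("ascii")}
    )


def handle_request(body: str, engine: Callable[[bytes], str]) -> str:
    """Server side: decode the audio, run the engine (21), return the text."""
    message = json.loads(body)
    audio = base64.b64decode(message["audio"])
    return json.dumps({"text": engine(audio)})


def parse_response(body: str) -> str:
    """Client side: extract the character data for the output means (30)."""
    return json.loads(body)["text"]
```

Base64 is used here only because raw audio bytes cannot be placed directly in JSON; a binary protocol would serve equally well.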

Such outsourcing of the voice recognition engine (21) is advantageous in that many callers can receive high-quality voice recognition over the Internet from a voice recognition server (50) equipped with a large-capacity, high-precision, high-recognition-rate voice recognition engine (21), so that high-quality voice recognition can be achieved at low system construction cost.

In the telephone call memo system (1) using voice recognition described above, the specific configuration of the voice recognition means (20), the various means for realizing it, and the software for voice recognition can be configured appropriately to suit the features of the present invention using techniques known in the art. Since these configurations are not themselves features of the present invention, a detailed description is omitted.

Although the invention has been described above mainly with reference to its use at a business that takes orders for goods, it can naturally be applied as-is to call memos in ordinary homes and in offices as well, and various well-known office automation (OA) programs can of course be applied so that the recognized memo contents are output in a desired format.

With the telephone call memo system using voice recognition according to the present invention described above, call contents that need to be noted during a call are recorded automatically in print or on screen through voice recognition, so pen and memo paper are no longer needed for call memos. The invention is therefore expected to make telephone use, and the daily life and work activities that involve it, more convenient.

Claims

A telephone call contents memo system using voice recognition, comprising: a conventional voice communication terminal having a transmitter and a receiver; voice recognition means for recognizing at least one of the voice signals from the transmitter and the receiver and converting the recognized voice recognition data into corresponding text data; and output means for outputting the text data obtained by the voice recognition means in recognizable form.

The telephone call contents memo system according to claim 1, wherein the output means comprises at least one of a printer and a monitor.

The telephone call contents memo system using voice recognition according to claim 1, wherein the voice recognition by the voice recognition means is provided, through Internet communication, by a voice recognition server having the voice recognition engine of the voice recognition means.

The telephone call contents memo system using voice recognition according to any one of claims 1 to 3, further comprising a recognition switch for selectively turning voice recognition by the voice recognition means on and off, so that only the speech in the call that needs to be noted is recognized.