KR20050004755A

KR20050004755A - Wireless Communication Devices with a Built-in Speech Recognition Function

Info

Publication number: KR20050004755A
Application number: KR1020040111666A
Authority: KR
Inventors: 김재형; 홍종철; 윤종민
Original assignee: 주식회사 비즈모델라인
Priority date: 2004-12-24
Filing date: 2004-12-24
Publication date: 2005-01-12

Abstract

PURPOSE: A wireless communication device with a built-in voice recognition function for credit card transactions is provided. When an existing portable terminal is used for credit card settlement, the device secures convenience, promptness, and security in actual use at minimum expense while making the most of the terminal's mobility. CONSTITUTION: A voice data input unit (speaker, 300) receives voice information from the client holding the wireless communication device. A voice recognition interface (310) analyzes the features of the input voice data. A user voice data storing part (340) stores the voice data features analyzed by the voice recognition interface. A voice data comparing part (330) receives financial information from the client through the input unit and compares the voice data input for the financial transaction with the client's voice data features stored in the user voice data storing part. A text generating part (350) converts the input voice data into text data when the voice features of the two sets of data match.

Description

Wireless Communication Devices with a Built-in Speech Recognition Function

The present invention relates to a wireless mobile payment system that requests a credit card transaction from a wireless communication device, receives the approval or rejection of the requested transaction, and completes the credit card payment wirelessly. The device is characterized by comprising: voice information input means by which the information needed for the wireless credit card payment request (sale amount, merchant number, and so on) is entered in the voice of the client who owns the device; voice recognition means for analyzing the characteristics of the voice data entered by the client more accurately; client voice characteristic storage means for storing and managing those characteristics in the device in advance; voice data comparison means for analyzing whether the voice data entered for a transaction at payment time matches the previously stored client voice characteristics; character conversion means for converting the input voice data into the character (numeric) data required for the payment information when the comparison finds that the two match; and payment information data transmission/reception means for sending the generated payment information over a wireless communication network to the payment-authorization financial network and receiving payment approval data from that network. By using voice recognition for wireless card payment, the invention greatly improves convenience and security for the client.

In general, voice has three main advantages over other interfaces.

First, voice is the most natural means of human communication. If people could communicate with machines the way they talk to one another, no other interface could match its convenience. In particular, keyboards, handset keypads, and similar interfaces require separate training before use, whereas voice requires none, so it adapts well to whatever system it is used with.

Second is parallelism.

A user can talk while typing on a keyboard, so tasks can be handled simultaneously, and the user can move freely rather than staying at a fixed position to use or enter information.

Third, voice speeds up data entry and allows remote input.

Speaking is known to be about four times faster than writing, so if continuous speech recognition becomes possible, data can be entered at high speed. Remote input over a telephone line or a wireless link is also possible, and entering data directly removes intermediate steps such as manual key-in, which improves reliability and operating efficiency.

It is widely expected that payment will continue to shift from cash to credit cards. In a credit card payment, a credit card company (which pays on the customer's behalf) stands between the customer (buyer) and the merchant (seller): it pays the merchant in place of the customer and collects the amount from the customer later. The card company charges customers and merchants a fee for this. It enrolls each seller as a merchant and assigns a merchant number.

The customer completes a transaction using the card checker installed at the merchant. The card is read by the checker, the transaction details (price of goods, number of installment months) are entered and sent to the card company, and the card company replies through the merchant's card checker, stating whether it approves or rejects the transaction according to the customer's credit standing, or whether the amount exceeds the card's limit.

Today merchants pay the credit card operator a small fee and receive sales proceeds by automatic transfer. Under automatic transfer, the seller submits, when applying to the card company for merchant status, the bank account number into which card payment proceeds are to be deposited. The proceeds are then deposited four to five days after the transaction is approved, regardless of whether the sales slip has been submitted to the card company.

Sales slips are usually collected and submitted to the credit card operator once a month. The transaction itself consists of two steps: reading the card in the card checker, and entering the amount and the number of installment months. This is far simpler and more convenient to use than the wireless electronic payment systems currently under development. Because it uses a wired network, it also has a security advantage over a wireless network without needing any additional equipment.

Many inventions are being prepared that aim to apply the mobility and portability of handsets to credit card payment and online financial transactions. In actual service, however, it remains doubtful how much convenience they offer over existing methods and whether customers will use them. The security weakness of wireless relative to wired networks also needs to be reconsidered.

It must not be overlooked that the factors that most influence whether customers use a service are convenience, stability, and economy.

Existing inventions in this area essentially just use the handset as a substitute for a card checker. The customer's credit card number or deposit account number must be keyed in digit by digit on the handset keypad, and the merchant number (8 to 11 digits) that the seller received from the card company must also be entered by hand, which is cumbersome.

Compared with the current system, in which the card is read by the checker and only the amount and installment months are entered, such approaches fall far behind in convenience and security, apart from the advantage of using a portable handset the user already owns.

Credit card payment schemes using wireless handsets, whether under development or already completed, adopt various security methods (mainly certified-key cryptography), but it is still unknown how far they can address the security weaknesses specific to wireless.

Convenience and speed of use also remain unresolved. Customers who use speed dial because they do not want to press even the 9 to 11 digits of a phone number are unlikely to abandon a system in which reading the card in the checker completes the whole transaction in favor of pressing number keys on a handset. Only a live service would tell, but the answer is probably no. It is doubtful that anyone would transact on a handset if a single transaction meant keying in a 16-digit card number, an 8- to 11-digit merchant number, the amount (usually 4 to 6 digits for a typical card transaction), and a 6- to 8-digit password.
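The keypad burden described above can be made concrete with a short calculation. The following Python sketch simply sums the digit counts quoted in the text; the field names and the function `keypress_range` are illustrative and do not appear in the patent.

```python
# Digit counts per field as (minimum, maximum), taken from the text above.
# Field names are hypothetical labels chosen for this sketch.
FIELD_DIGITS = {
    "card_number": (16, 16),
    "merchant_number": (8, 11),
    "amount": (4, 6),
    "password": (6, 8),
}


def keypress_range(fields=FIELD_DIGITS):
    """Return the (fewest, most) key presses needed to enter every field."""
    fewest = sum(low for low, _ in fields.values())
    most = sum(high for _, high in fields.values())
    return fewest, most


print(keypress_range())  # (34, 41)
```

Under these figures a single keypad transaction takes between 34 and 41 key presses, before counting any confirmation or menu keys.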

To solve the above problems, the object of the present invention is to develop and apply a voice recognition system suited to the service so that, when an existing handset is used for credit card payment, the advantage of mobility is fully exploited while convenience, speed, and security are secured at minimum cost in actual use.

That is, the object of the invention is to develop a wireless communication device with a built-in voice recognition function, characterized by comprising: voice information input means by which the information needed for a wireless credit card payment request (sale amount, merchant number, and so on) is entered in the voice of the client who owns the device; voice recognition means for analyzing the characteristics of the voice data entered by the client more accurately; client voice characteristic storage means for storing and managing those characteristics in the device in advance; voice data comparison means for analyzing whether the voice data entered for a transaction at payment time matches the previously stored client voice characteristics; character conversion means for converting the input voice data into the character (numeric) data required for the payment information when the comparison finds that the two match; and payment information data transmission/reception means for sending the generated payment information over a wireless communication network to the payment-authorization financial network and receiving payment approval data from that network.

To this end, a further object is to provide a wireless communication device with a built-in voice recognition function, characterized by comprising: a voice recognition interface that receives the client's voice data; a client voice data storage unit that stores and manages in advance the client's voice data (spoken digits 0 to 9) entered through the voice recognition interface; a low-pass filter (LPF) unit that removes or restores noisy or distorted portions of the voice data entered through the interface; a formant extraction unit that extracts vowel and voiced-sound characteristics from the voice data output by the low-pass filter unit; a pitch extraction unit that distinguishes and extracts the voiced and unvoiced sounds in the voice data output by the formant extraction unit; a real speech section detection unit that removes unnecessary silence from the voice data output by the pitch extraction unit and detects only the needed speech; a pre-emphasis unit that extracts consonant features from the voice data output by the real speech section detection unit; a voice feature parameter detection unit that comprehensively detects the features (usually expressed as numbers) of the voice data that has passed through each of these units; a voice data comparison unit that compares the client's voice characteristics output by the voice feature parameter detection unit with the client's voice characteristics previously stored in the voice data storage unit; a character generation unit that converts the input voice data into the character (numeric) data needed for payment processing when the comparison finds that the two match; and a payment information data transmission/reception unit that sends the generated payment information over the wireless communication network to the payment-authorization financial network and receives payment approval data from that network.
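Two of the stages named above, real speech section detection and pre-emphasis, can be illustrated with a minimal Python sketch. The patent does not specify the algorithms behind these units, so the energy-threshold endpoint detector, the first-order filter y[n] = x[n] - alpha * x[n-1], and every name and default value here (`alpha`, `frame_len`, `threshold`) are assumptions chosen for illustration only.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_section(samples, frame_len=4, threshold=0.01):
    """Return (start, end) sample indices spanning all frames whose energy
    exceeds the threshold, or None if every frame is silent."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, energy in enumerate(energies) if energy > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


def pre_emphasis(samples, alpha=0.97):
    """First-order high-pass filter that boosts high-frequency content."""
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out


# Silence, a short burst, then silence again.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
start, end = speech_section(signal)
trimmed = pre_emphasis(signal[start:end])
```

The order follows the text: silence is trimmed first and the remaining speech is then pre-emphasized. Real front ends typically use overlapping windowed frames and adaptive thresholds; the fixed values here only keep the sketch short.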

FIG. 1 is a diagram of one embodiment showing the overall flow of a card payment system using a conventional wired network.

FIG. 2 is a schematic diagram of a wireless card payment system using a handset that has a voice recognition system.

FIG. 3 is a block diagram of the system added to the handset for the wireless payment service.

FIG. 4 is a block diagram of the voice data comparison unit.

FIG. 5 is a diagram of one embodiment of the process of selecting the credit card payment mode from the handset menu.

FIG. 6 is a diagram of one embodiment of the form used to fill in the transaction details for a credit card payment.

FIG. 7 is an overall flowchart of how the voice recognition system processes voice data for a transaction.

FIG. 8 is a flowchart of the process of generating and storing client voice data.

[Description of the main parts of the drawings]

300 : voice data input unit (speaker)    310 : voice recognition interface

320 : low-pass filter    330 : voice data comparison unit

340 : client voice data storage unit    350 : character generation unit

360 : correction unit    370 : display screen

400 : formant extraction unit    410 : pitch extraction unit

420 : real speech section detection unit    430 : pre-emphasis unit

440 : voice parameter detection unit    450 : comparison unit

The present invention relates to a voice recognition system, and a wireless credit card payment method using that system, intended to secure convenience, speed, and security at minimum cost in actual transactions while making full use of mobility when an existing handset is used for credit card payment. It is characterized by comprising: voice information input means by which the information needed for a wireless credit card payment request (sale amount, merchant number, and so on) is entered in the voice of the client who owns the wireless communication device; voice recognition means for analyzing the characteristics of the voice data entered by the client more accurately; client voice characteristic storage means for storing and managing those characteristics in the device in advance; voice data comparison means for analyzing whether the voice data entered for a transaction at payment time matches the previously stored client voice characteristics; character conversion means for converting the input voice data into the character (numeric) data required for the payment information when the comparison finds that the two match; and payment information data transmission/reception means for sending the generated payment information over a wireless communication network to the payment-authorization financial network and receiving payment approval data from that network.

More specifically, the invention relates to a wireless communication device with a built-in voice recognition function, characterized by comprising: a voice recognition interface that receives the client's voice data; a client voice data storage unit that stores and manages in advance the client's voice data (spoken digits 0 to 9) entered through the voice recognition interface; a low-pass filter (LPF) unit that removes or restores noisy or distorted portions of the voice data entered through the interface; a formant extraction unit that extracts vowel and voiced-sound characteristics from the voice data output by the low-pass filter unit; a pitch extraction unit that distinguishes and extracts the voiced and unvoiced sounds in the voice data output by the formant extraction unit; a real speech section detection unit that removes unnecessary silence from the voice data output by the pitch extraction unit and detects only the needed speech; a pre-emphasis unit that extracts consonant features from the voice data output by the real speech section detection unit; a voice feature parameter detection unit that comprehensively detects the features (usually expressed as numbers) of the voice data that has passed through each of these units; a voice data comparison unit that compares the client's voice characteristics output by the voice feature parameter detection unit with the client's voice characteristics previously stored in the voice data storage unit; a character generation unit that converts the input voice data into the character (numeric) data needed for payment processing when the comparison finds that the two match; and a payment information data transmission/reception unit that sends the generated payment information over the wireless communication network to the payment-authorization financial network and receives payment approval data from that network.

Furthermore, in a credit transaction according to the invention, the characteristics of the client's voice tone must first match continuously until the transaction statement is complete. A single transaction typically involves 17 to 18 checks that the speaker is the client (8 to 11 digits of merchant number, usually 5 to 6 digits of transaction amount, the installment months, the card number, and so on), so use by a third party is blocked at the source. Entering a password at the end of the transaction details provides a second layer of security.
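The two-stage check described above (every spoken digit must match the enrolled voice, then a password is required) can be sketched as follows. This is a minimal illustration under stated assumptions: the patent does not say how features are compared, so the Euclidean distance, the `max_distance` threshold, the feature vectors, and all function names are hypothetical.

```python
import hmac
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify_and_transcribe(utterances, templates, max_distance=1.0):
    """Match each spoken digit against the client's enrolled templates.

    utterances: list of feature vectors, one per spoken digit.
    templates: dict mapping a digit character to its enrolled feature vector.
    Returns the digit string, or None as soon as one utterance fails to match.
    """
    digits = []
    for features in utterances:
        digit, best = min(
            ((d, distance(features, t)) for d, t in templates.items()),
            key=lambda pair: pair[1],
        )
        if best > max_distance:
            return None  # first-stage check failed: voice does not match
        digits.append(digit)
    return "".join(digits)


def authorize_locally(utterances, templates, password, expected_password):
    """First stage: per-digit voice match. Second stage: password check."""
    text = verify_and_transcribe(utterances, templates)
    if text is None:
        return None
    if not hmac.compare_digest(password.encode(), expected_password.encode()):
        return None
    return text
```

Because the comparison runs once per digit, a speaker whose voice fails to match on any single digit stops the transaction before the password stage is reached.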

Furthermore, a credit card payment can be made with numeric input alone (card number, merchant number, transaction amount, installment months, and the approval number received from the card company). In a separate credit payment mode added to the handset, the client selects the desired card from those already registered (the card information, namely the card number and the client's personal information for security, is stored in the handset in advance), enters the merchant number, transaction amount, and installment months by voice, reviews the transaction details shown on the display, and then enters the password and sends the request. The transaction is complete when the approval number for the transaction is received from the card company.

Furthermore, because the invention uses only isolated digits for credit card payment, continuous speech recognition is unnecessary, which simplifies the system and makes input more reliable.

The voice recognition method at the core of the invention is generally classified by vocabulary size, speaker dependence, continuity of utterance, and recognition unit.

First, vocabulary size is classed as small (10 to 100 words), medium (100 to 500 words), or large (1,000 words or more). The larger the vocabulary, the more likely words are to be confused with one another, so the recognition rate falls.

Because the invention uses only ten digits, feature extraction for each spoken digit can be refined from multiple angles. When the client's spoken-digit information that will later be used for comparison is enrolled, each spoken digit is entered at least five times so that the common features of the client's voice can be extracted in more detail. In this way the voice recognition error rate of existing mobile phones (where even the genuine client is not recognized because of ambient noise or the state of the client's voice) can be reduced by 100%. (Voice recognition systems in existing handsets must store data for a large vocabulary, so their recognition rate inevitably suffers.)
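One simple way to combine repeated enrollments into a single stored template is to average the feature vectors. The patent only says that common features are extracted from at least five repetitions; averaging is an assumption made for this Python sketch, as are the function name `enroll_digit` and the representation of features as lists of numbers.

```python
def enroll_digit(repetitions, min_repetitions=5):
    """Build one template for a spoken digit from repeated recordings.

    repetitions: list of equal-length feature vectors for the same digit.
    Returns the element-wise mean as the stored template.
    """
    if len(repetitions) < min_repetitions:
        raise ValueError(
            f"need at least {min_repetitions} repetitions, got {len(repetitions)}"
        )
    length = len(repetitions[0])
    if any(len(rep) != length for rep in repetitions):
        raise ValueError("all repetitions must have the same number of features")
    return [
        sum(rep[i] for rep in repetitions) / len(repetitions)
        for i in range(length)
    ]


# Five recordings of the same digit with slightly different features.
template = enroll_digit([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0], [8.0, 8.0]])
print(template)  # [4.0, 4.0]
```

Enforcing the minimum at enrollment time means a template is never stored from too few samples; a fuller design might also record the spread of each feature to set per-digit match thresholds.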

Second, with regard to speaker dependence, a system is speaker-dependent when the speaker used for training is the same as the speaker to be recognized, and speaker-independent when the two differ. Speaker-dependent systems are known to achieve higher recognition rates than speaker-independent ones. (A voice recognition system is inherently secure if it fully extracts the characteristics of the client's voice pattern.) The invention adopts the speaker-dependent method, and this choice in itself provides a degree of security.

Third, by continuity of utterance, speech is divided into isolated utterances, connected utterances, continuous utterances, and spontaneous speech. In an isolated utterance the start and end points of the speech are clear, meaning that a distinct pause is inserted before and after each word. Connected utterances are spoken with a short pause between words, and continuous utterances with an extremely short pause between words. Spontaneous speech is the speech of everyday conversation and includes utterances with ungrammatical structure as well as grammatically correct sentences. (Examples: isolated, "school"; connected, "I", "to school", "go"; continuous, "I go to school."; spontaneous, "I go to school. With Hongseok.")

Fourth, with regard to the recognition unit, a large unit such as the word is sometimes used, but sub-word units such as phonemes or syllables are also used. To simplify implementation and to support large-vocabulary recognition, recognizers are built from units such as phonemes rather than words or syllables.

Because the invention applies voice recognition to credit card payment by voice (digits only) on a wireless handset, simple isolated utterances are sufficient, the vocabulary is about ten items, and the speaker-dependent method is used for security.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In assigning reference numerals to the components of each drawing, note that the same components carry the same numerals wherever possible, even when they appear in different drawings. The following description also sets out many specific details, such as particular components, in order to explain the embodiments. These are provided only to aid overall understanding, and it will be apparent to those of ordinary skill in the art that the invention may be practiced without them. In describing the invention, detailed descriptions of related known functions or configurations are omitted where they would unnecessarily obscure its subject matter.

도 1은 종래의 유선망을 이용한 카드 결제 시스템 전체 흐름을 나타낸 것이다.Figure 1 shows the overall flow of the card payment system using a conventional wired network.

가맹점의 카드 체크기(100)는 신용카드 사업자(120) (또는 카드 조회 사업자, VAN사 라고도 함)가 가입되어 있는 신용카드 거래 전용망(110)에 연결되어 있고 신용카드 사업자(120)의 신용카드 거래 전용망(110)은 카드사(130)와 금융공동망(140)에 연결되어 있다.The card checker 100 of the merchant is connected to a credit card transaction exclusive network 110 to which a credit card provider 120 (or a card inquiry provider, also known as a VAN company) is subscribed, and a credit card transaction of the credit card provider 120 is performed. Dedicated network 110 is connected to the card company 130 and the financial common network (140).

That is, when the merchant requests a credit card payment, the card checker (100) sends the credit card information over the dedicated credit card transaction network (110) to the credit card inquiry provider (120). The inquiry provider (120) forwards the customer's credit card information to the relevant card issuer (130) to verify the card's credit standing, and then notifies the merchant's card checker (100) whether the transaction is approved or declined. If the transaction is approved, a transaction approval number is issued.

FIG. 2 is a schematic example of a wireless card payment system using a portable terminal (200) equipped with a voice recognition system.

The system consists of: a portable terminal (200) that stores credit card information (card number, client personal information) and has a voice recognition system through which transaction details can be entered by voice; a wireless communication network (210) to which the portable terminal (200) subscribes; a wireless electronic transaction gateway (220) that checks whether the portable terminal (200) and its client match the transaction request message sent from the data transmission system, and connects to the existing credit card inquiry provider (120); a card issuer (130) that is connected to the inquiry provider (120), assesses the card's credit, and decides whether to approve the transaction; and the financial common network (140) of the account-holding institutions.

Apart from using the wireless communication network (210) instead of the wired dedicated credit card transaction network (110), and routing the transaction through the wireless electronic transaction gateway (220), this system is the same as that of FIG. 1.

When the transaction succeeds, the card company (130) sends the transaction approval number to the display screen of the portable terminal (200) that requested the transaction. Instead of a card sales slip, the seller fills in the sale amount, the number of installment months, and the transaction approval number on a separate form provided by each card company, and obtains the buyer's handwritten signature to complete the transaction.

FIG. 3 is a block diagram of the voice recognition system to be added to the portable terminal for the service.

When the credit card payment mode is selected on the portable terminal, the credit card number does not need to be entered separately. As in FIG. 5, the credit information for the customer's credit card (card number, name, resident registration number, available limit according to the credit information, card expiration date, and so on) is already stored as a package in the terminal's memory.

In a credit transaction, the voice information for the credit card payment is passed through the speaker and the voice recognition interface (310) to the low-pass filter unit (LPF) (320), which cleans up noise and distorted portions other than the needed voice information. The result is then passed to the voice data comparison unit (330).
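As a rough illustration of the low-pass filtering step, the following Python sketch applies a single-pole filter to a list of samples. The filter design, the function name `low_pass`, and the cutoff and sample-rate values are assumptions made for illustration; the description does not specify how the LPF (320) is built.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """Single-pole IIR low-pass filter (an assumed design, not taken from the source)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    out = []
    prev = 0.0
    for x in samples:
        # Move the output a fraction alpha of the way toward the new sample.
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


# Hypothetical usage: 8 kHz speech samples, 3.4 kHz cutoff.
filtered = low_pass([0.0, 0.2, 0.4, 0.1, -0.3], cutoff_hz=3400.0, sample_rate_hz=8000.0)
```

A slowly varying input passes almost unchanged, while a rapidly alternating input is attenuated, which is the behaviour the filtering step relies on.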

To extract the comparison data used to check whether the voice data delivered to the voice data comparison unit (330) matches the customer's voice previously stored in the client voice data storage unit (340), the formant extraction unit (400) within the voice data comparison unit (330) frequency-analyzes single sounds of vowels or similar voiced sounds and extracts the features of the input vowels and voiced sounds. The pitch extraction unit (410) then distinguishes voiced from unvoiced sounds using the quasi-periodic nature of vowels and voiced sounds and the maximum value of the waveform. An unvoiced sound has an aperiodic waveform and a relatively small maximum within the analysis interval.

The voice data that has passed through the pitch extraction unit (410) goes to the real-voice interval detection unit (420), which removes unnecessary silence and detects only the voice portions that are actually needed. This is an important step in extracting the client's voice features. After the real-voice interval detection unit (420), a pre-emphasis process (430) is applied to bring out the consonant features, and the voice feature parameter detection unit (440) then comprehensively detects the voice features (usually expressed as numbers).

Once voice feature parameter detection is finished, the client's voice data previously stored in the client voice data storage unit (340) is sent to the comparison unit (450). The comparison unit (450) compares the input voice data, for which feature parameter detection is complete, with the data previously stored in the client voice data storage unit (340) to determine whether the speaker is the client. If the two signals match, the data is sent to the text generation unit (350), which generates the characters corresponding to the voice data and outputs them in sequence on the display screen (370) in the terminal's credit card payment mode, as in FIG. 6. The character for each voice item is linked to a database in the existing memory.

If the two data do not match, the correction unit (360) outputs a correction message on the screen (370) through the voice recognition interface (310), as in FIG. 5, and at the same time emits a short sound.

A single transaction requires many spoken-digit inputs (usually about 17 to 18). The system keeps requesting voice data input until the transaction statement is complete.

The role of the voice data comparison unit (330), which is the core of the security, is as follows: if voice data is entered in a voice other than the client's own, only error messages are output, and the payment statement shown in FIG. 5 cannot be created at all. That is, as in FIG. 5, the error message keeps appearing, the screen remains blank, and no transaction takes place.

Speech recognition methods currently under study or in use include the hidden Markov model (HMM), which is the most widely used and enables large-scale continuous speech recognition, dynamic time warping (DTW), and neural network methods.

Because the voice data needed for credit card payment consists only of the digits 0 to 9, dynamic time warping is used. In addition, to verify the client's identity more accurately, the formant and pitch of the client's voice are extracted so that the client's voice characteristics can be compared more clearly. Naturally, if a better speech recognition method emerges in the future, it may be adopted as needed.

Dynamic time warping compares a representative pattern with a given input pattern to determine the similarity between the two. Even when the same word is spoken, its duration varies with the speaker's emotion, the surroundings, noise, and so on. Dynamic time warping nonlinearly optimizes away this mismatch in duration, and it achieves the overall optimization by building on partial optimizations.

Dynamic programming makes the computation heavy, and there is the constraint that the start and end points of the speech must be known accurately in advance. However, the voice data used here consists only of the digits 0 to 9, so the target vocabulary is small. Dynamic time warping, which is used for isolated-word recognition and makes it easy to build reference patterns, is therefore suitable for the service for the time being.
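The matching of an input pattern against stored digit patterns can be sketched in Python as below. This is a minimal illustration under stated assumptions: each pattern is a list of scalar feature values, the local cost is the absolute difference, and the names `dtw_distance`, `recognize_digit`, and `templates` are hypothetical. The description does not give the actual features, cost function, or path constraints.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize_digit(pattern, templates):
    """Return the label of the reference pattern closest to `pattern`."""
    return min(templates, key=lambda label: dtw_distance(pattern, templates[label]))


# Hypothetical reference patterns for two digits.
templates = {"0": [0.0, 0.0, 0.0], "1": [1.0, 2.0, 3.0]}
print(recognize_digit([1.0, 1.0, 2.0, 3.0, 3.0], templates))
```

Because the warping path may repeat elements, a slower utterance of the same pattern still aligns at low cost, which is how the duration mismatch is absorbed.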

The voice data used does not even reach the level of a small vocabulary (10 to 100 words), and when voice data is entered for payment, each digit is spoken in isolation, one at a time, as when reading out a telephone number (for example, 712342 is spoken as chil, il, i, sam, sa, i). The advantage is that the voice database is markedly smaller than that of speech recognition methods used in other settings.
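The one-digit-at-a-time input can be pictured as a lookup from recognized digit words to characters. The sketch below is only an illustration: the table `DIGIT_WORDS` and the function `words_to_digits` are hypothetical names, and Sino-Korean digit words are assumed because the example above reads the number the way a telephone number is read.

```python
# Sino-Korean digit words mapped to digit characters (0 has two common readings).
DIGIT_WORDS = {
    "영": "0", "공": "0",
    "일": "1", "이": "2", "삼": "3", "사": "4", "오": "5",
    "육": "6", "칠": "7", "팔": "8", "구": "9",
}


def words_to_digits(words):
    """Convert a sequence of isolated digit words into a digit string."""
    digits = []
    for word in words:
        if word not in DIGIT_WORDS:
            raise ValueError(f"not a digit word: {word!r}")
        digits.append(DIGIT_WORDS[word])
    return "".join(digits)


# chil, il, i, sam, sa, i
print(words_to_digits(["칠", "일", "이", "삼", "사", "이"]))
```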

Dynamic time warping is a recognition method that nonlinearly optimizes the mismatch in duration: when the client's basic voice patterns for 0 to 9 are created, the same word, even from the same speaker, has a different duration depending on the speaker's emotion and surroundings.

To ensure security, that is, to prevent a transaction from being completed when data is entered in another person's voice as described above, the invention takes advantage of the very small number of input items (0 to 9): when the speaker's voice data patterns are created, the formant of each item is additionally stored in the database, and the voice data comparison unit compares it as well to determine whether the speaker is the client. A formant is a band, or a group of bands, in which the spectral energy of the sound source is concentrated and which characterizes a particular timbre; it reflects the resonances of the vocal tract. When a single vowel or near-vowel voiced sound is frequency-analyzed, formants appear as peaks of energy in several specific frequency regions caused by resonance. The center frequency of each region is called a formant frequency, and the formants are numbered from the lowest as the first, second, and so on. The distinction between single speech sounds depends heavily on the formant structure.

In addition, the pitch unit distinguishes voiced from unvoiced sounds using the quasi-periodic nature of vowels and voiced sounds and the maximum value of the waveform. An unvoiced sound has an aperiodic waveform and a relatively small maximum within the analysis interval.

FIG. 4 is a block diagram of the voice data comparison unit (330).

The voice data comparison unit (330), the key part that addresses the security of the invention by applying basic speech recognition methods, consists of a formant extraction unit (400), a pitch extraction unit (410), a real-voice interval detection unit (420) (endpoint detection), a pre-emphasis unit (430), a voice feature parameter detection unit (440), and a comparison unit (450).

The formant extraction unit (400) frequency-analyzes single sounds of vowels or similar voiced sounds and extracts the features of the input vowels and voiced sounds. The pitch extraction unit (410) then distinguishes voiced from unvoiced sounds using the quasi-periodic nature of vowels and voiced sounds and the maximum value of the waveform. An unvoiced sound has an aperiodic waveform and a relatively small maximum within the analysis interval.
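The two cues named for the voiced/unvoiced decision, quasi-periodicity and the waveform maximum, can be sketched in Python as follows. This is an illustrative sketch only: the normalized autocorrelation test, the function name `is_voiced`, and all thresholds and lag limits (chosen here for an assumed 8 kHz sample rate) are assumptions and do not come from the description.

```python
import math


def is_voiced(frame, min_lag=20, max_lag=160, periodicity_min=0.5, peak_min=0.1):
    """Classify a frame as voiced when it is both loud enough and periodic enough."""
    if not frame:
        return False
    peak = max(abs(x) for x in frame)
    if peak < peak_min:
        # Small maximum within the analysis interval: treat as unvoiced.
        return False
    energy = sum(x * x for x in frame)
    best = 0.0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, corr / energy)
    return best >= periodicity_min


# Hypothetical usage: a 100 Hz tone sampled at 8 kHz stands in for a vowel.
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
print(is_voiced(tone))
```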

The real-voice interval detection unit (420) (endpoint detection) removes unnecessary silence from the recorded speech and detects only the voice portions that are actually needed.
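One common way to realize this kind of endpoint detection is a frame-energy threshold, sketched below in Python. The energy criterion, the function name `trim_silence`, and the frame length and threshold values are assumptions for illustration; the description does not say how the unit (420) decides what counts as silence.

```python
def trim_silence(samples, frame_len=80, threshold=0.01):
    """Keep the span from the first to the last frame whose mean energy exceeds threshold."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    active = [
        index
        for index, frame in enumerate(frames)
        if sum(x * x for x in frame) / len(frame) > threshold
    ]
    if not active:
        return []  # nothing but silence
    start = active[0] * frame_len
    end = min(len(samples), (active[-1] + 1) * frame_len)
    return samples[start:end]


# Hypothetical usage: silence, a burst of speech-like signal, then silence again.
recording = [0.0] * 160 + [0.5] * 240 + [0.0] * 160
speech = trim_silence(recording)
```

Note that this sketch trims only leading and trailing silence; pauses inside the utterance are kept.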

The pre-emphasis process (430) is applied to bring out the consonant features within the real-voice interval.
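Pre-emphasis is conventionally a first-order high-pass step, y[n] = x[n] - a * x[n-1], which boosts the higher frequencies where consonant energy tends to lie. The Python sketch below assumes that conventional form and a coefficient of 0.97; neither the formula nor the coefficient is stated in the description, and `pre_emphasis` is a hypothetical name.

```python
def pre_emphasis(samples, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample is passed through."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - coeff * samples[n - 1])
    return out


# A constant (low-frequency) input is strongly reduced after the first sample,
# while a rapidly alternating (high-frequency) input is amplified.
flat = pre_emphasis([1.0, 1.0, 1.0])
alternating = pre_emphasis([1.0, -1.0, 1.0])
```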

As described above, for vowels, near-vowel voiced sounds, and unvoiced sounds, the formants of each voice data item are patterned separately and compared.

The comparison unit (450) compares the data that has passed through the real-voice interval detection unit (420) (endpoint detection) and the pre-emphasis process (430) with the client's previously stored voice data to decide whether the speaker is the client. If the client is confirmed, the input data is shown on the display screen (370) through the voice recognition interface (310); if there is a mismatch, a warning message is output through the correction unit (360) and the voice recognition interface (310).

FIG. 5 illustrates the process of selecting the credit card payment mode from the portable terminal's menu and the error message output when the client does not match.

FIG. 6 illustrates the form used to fill in the transaction details for a credit card payment.

FIG. 7 is an overall flowchart of how the voice recognition system processes voice data for a transaction.

When voice data for a transaction is input (700), the LPF (320) cleans up noise and distorted portions other than the needed voice information (705). From the voice data that has passed through the LPF (320), formant features are extracted (710): single sounds of vowels or similar voiced sounds are frequency-analyzed to extract the features of the input vowels and voiced sounds. Pitch features are then extracted (715), and voiced and unvoiced sounds are distinguished using the quasi-periodic nature of vowels and voiced sounds and the maximum value of the waveform. An unvoiced sound has an aperiodic waveform and a relatively small maximum within the analysis interval.

The voice data that has passed through the pitch extraction unit (410) goes to the real-voice interval detection unit (420), which removes unnecessary silence and detects only the voice portions that are actually needed (720). This is an important step in extracting the client's voice features. After the real-voice interval detection unit (420), a pre-emphasis process is applied to bring out the consonant features (725), and the voice feature parameter detection unit (440) then comprehensively detects the voice features (usually expressed as numbers) (730). Once voice feature parameter detection is finished, the client's voice data previously stored in the client voice data storage unit is sent to the comparison unit.

The comparison unit (450) compares the input voice data, for which feature parameter detection is complete, with the data previously stored in the client voice data storage unit (340) to determine whether the speaker is the client. If the two signals match (740), the data is sent to the text generation unit (350), which generates the characters corresponding to the voice data (750) and outputs them on the screen (755). The character for each voice item is linked to a database in the existing memory.

A single transaction requires many spoken-digit inputs (usually about 17 to 18). The system checks whether the transaction statement is complete; if it is (765), processing ends, and otherwise further input is requested.
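The control flow of FIG. 7, in which each utterance is verified before its digit is accepted and input continues until the statement is complete, can be sketched in Python as below. The function `fill_statement` and its callback parameters `is_owner` and `to_digit` are hypothetical stand-ins for the comparison unit (450) and the text generation unit (350); the sketch shows only the loop, not the signal processing.

```python
def fill_statement(utterances, is_owner, to_digit, required_digits):
    """Accept digits only from verified utterances until the statement is full.

    Returns (digits entered, whether the statement is complete, error count).
    """
    entered = []
    errors = 0
    for utterance in utterances:
        if len(entered) >= required_digits:
            break  # statement complete, stop asking for input
        if is_owner(utterance):
            entered.append(to_digit(utterance))
        else:
            errors += 1  # corresponds to showing the error message
    complete = len(entered) >= required_digits
    return "".join(entered), complete, errors


# Hypothetical usage: each utterance is modeled as a (speaker, digit) pair.
attempts = [("owner", "7"), ("other", "9"), ("owner", "1"), ("owner", "2")]
result = fill_statement(
    attempts,
    is_owner=lambda u: u[0] == "owner",
    to_digit=lambda u: u[1],
    required_digits=3,
)
```

With only non-owner utterances, nothing is ever entered and the statement stays incomplete, matching the behaviour described for a mismatched voice.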

FIG. 8 is a flowchart of the process of generating and storing client voice data. It shows how the client's own voice features are entered into the portable terminal in advance and managed, so that voice data entered for a transaction can be verified as the client's when the voice recognition system is used.

A customer who wishes to use the service speaks the digits 0 to 9 into the portable terminal in order, starting from 0, repeating each digit about five times. The system averages these repetitions to extract more reliable client voice features.

When the client's voice data is input (800), noise and distorted portions are restored as far as possible by the LPF (320) (805), and N = 0 is set in the counter (803) (810); the counter is needed to handle the repeated inputs. Formant features are extracted from the input voice data (815): single sounds of vowels or similar voiced sounds are frequency-analyzed to extract the features of the input vowels and voiced sounds. Pitch features are then extracted, and voiced and unvoiced sounds are distinguished using the quasi-periodic nature of vowels and voiced sounds and the maximum value of the waveform (820).

An unvoiced sound has an aperiodic waveform and a relatively small maximum within the analysis interval. The voice data that has passed through the pitch extraction unit (410) goes to the real-voice interval detection unit (420), which removes unnecessary silence and detects only the voice portions that are actually needed (825). This is an important step in extracting the client's voice features.

After the real-voice interval detection unit (420), a pre-emphasis process is applied to bring out the consonant features (830), and the voice feature parameter detection unit (440) then comprehensively detects the voice features (usually expressed as numbers) (835). When voice feature parameter detection is finished, the value of N assigned to the input voice data is checked to see whether it is greater than or equal to 4. If it is less than 4 (850), the analyzed voice features are stored in the temporary storage memory (812) (860), and the client is asked to speak the same input again (855).

If N >= 4 is satisfied (840), the data held in the temporary storage memory and the data satisfying N >= 4 are sent to the voice feature synthesis unit, which averages the five sets of voice features for that voice item (845). When the synthesis is finished, the synthesized data is sent to the client voice data storage unit (340) (865). The data stored in the client voice data storage unit (340) is used to determine whether voice data entered during a credit card payment with the portable terminal belongs to the client.
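The counter-driven enrollment loop can be sketched in Python as follows. This is a sketch under stated assumptions: each utterance is already reduced to a list of numeric features, the synthesis is a plain element-wise mean, and N is assumed to increase by one after each stored take (the description checks N against 4 but does not say how it advances). The names `enroll_digit` and `get_features` are hypothetical.

```python
def enroll_digit(get_features, repetitions=5):
    """Collect `repetitions` feature vectors for one digit and return their average."""
    temp_memory = []  # plays the role of the temporary storage memory
    n = 0             # plays the role of the counter, starting at N = 0
    while True:
        features = get_features()
        if n >= repetitions - 1:
            # N >= 4: combine the stored takes with the current one.
            takes = temp_memory + [features]
            dims = len(features)
            return [sum(take[d] for take in takes) / len(takes) for d in range(dims)]
        # N < 4: store this take and ask for the same digit again.
        temp_memory.append(features)
        n += 1  # assumed increment


# Hypothetical usage: five takes of a two-value feature vector.
takes = iter([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
reference = enroll_digit(lambda: next(takes))
```

Running this once per digit from 0 to 9 would yield the ten reference patterns that are later compared against transaction input.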

According to the present invention, a voice recognition system, the most convenient and familiar interface for people, is used in a wireless credit card payment service based on a wireless communication device. This resolves both convenience and security, which are the problem areas of wireless card payment services, and helps realize a sounder credit payment society.

Moreover, resolving ease of use can greatly help make wireless credit card payment services, which emphasize mobility, commonplace in the future. In terms of security, the system keeps checking, over the roughly 20 inputs made before the transaction statement is complete, whether the input data belongs to the client, so it can be said to be superior to any other security method.

Claims

In a wireless communication device,

Voice information input means for receiving voice information of the client owning the wireless communication device;

Voice recognition means for analyzing voice data characteristics inputted through the voice information input means;

Voice feature storage means for storing the voice data features analyzed by the voice recognition means;

Voice data comparison means for, in a financial transaction using the wireless communication device, receiving predetermined financial information from the client by voice through the voice information input means, and then analyzing and comparing the voice data received for the financial transaction with the client voice data features pre-stored in the voice feature storage means to determine whether they match;

Text conversion means for converting the voice data input for the financial transaction into text (or numeric) data when the voice features of the two data match as a result of the comparison by the voice data comparison means; and

Information data transmitting/receiving means for transmitting the financial transaction information data converted by the text conversion means to a financial transaction server through a wireless communication network.

The wireless communication device according to claim 1,

further comprising payment means storage and management means for storing at least one item of payment means information data of the client.