KR101424962B1

KR101424962B1 - Authentication system and method based by voice

Info

Publication number: KR101424962B1
Application number: KR1020110126266A
Authority: KR
Inventors: 박용선
Original assignee: 주식회사 지티티비
Priority date: 2011-11-29
Filing date: 2011-11-29
Publication date: 2014-08-01
Also published as: KR20130059999A

Abstract

This invention relates to a system and method for authenticating whether a speaker is the legitimate user based on the voice characteristics of a human (speaker).
The voice-based authentication system according to this invention includes: a storage unit that stores user identifier account information, mobile communication terminal information, and the user's voice pattern information in association with one another; an authentication string issuing unit that issues an authentication string to the computer device that requested voice-based authentication for the user identifier account, or to the mobile communication terminal; a voice pattern extraction unit that extracts a voice pattern from the speaker's voice signal input from the mobile communication terminal; a voice pattern verification unit that compares the speaker's voice pattern extracted by the voice pattern extraction unit with the user's voice pattern; a speech recognition unit that performs speech recognition on the speaker's voice signal; and an authentication string verification unit that compares the string recognized by the speech recognition unit with the authentication string issued by the authentication string issuing unit.

Description


This invention relates to an authentication system and method, and more particularly to a system and method that authenticate whether a speaker is the legitimate user based on the voice characteristics of a human (speaker).

Thanks to the growth of the Internet, a wide variety of online services are now available. Most online service systems authenticate whether a client computer device accessing the system over the Internet is entitled to use the online service.

The most widely used user authentication method relies on a user identifier (ID) and a password. The user registers a user identifier and password when joining an online service system as a member. Later, when the user wants to access the system, the system receives the registered user identifier and password and verifies that the user is who they claim to be.

However, with this method the authentication information (user identifier and password) is easily stolen or hacked, and once it has been exposed, malicious access attempts cannot be blocked.

Online service systems manage a wide range of personal information. Real monetary transactions now take place online through Internet banking, online stock trading, and similar services, and intangible assets held in online service systems (for example, online game items and cyber money) keep growing. As a result, stronger identity authentication methods are required for users of online service systems.

To meet this need, authentication based on a one-time authentication key and the user's mobile communication terminal is in use. It typically proceeds as follows. First, the online service system checks the user identifier and password and requests user authentication from the authentication server. The authentication server sends a text message (SMS) containing a one-time authentication key to the user's pre-registered mobile communication terminal. The online service system receives the one-time authentication key through the user's computer device and forwards it to the authentication server. The authentication server then verifies that the key it sent to the user's mobile communication terminal is identical to the key entered through the online service system.

Such one-time-key authentication strengthens identity authentication to some degree, but it remains vulnerable to various hacking techniques. In addition, when several acquaintances share one user identifier and password and log in concurrently, the resources of the online service system are occupied unfairly.
Korean Laid-Open Patent Publication No. 2005-0090568 (published 2005.09.14), 'Wireless terminal and recording medium with a voice payment processing function', describes a technique that recognizes the pattern of voice data input through the voice input unit of a wireless terminal, recognizes the corresponding characters and strings, and carries out payment based on the recognized characters and strings. With this prior art, the same characters and strings are recognized and payment goes through even when a third party who is not the legitimate user speaks the password, so security is weak if the wireless terminal and the password are stolen.
Korean Laid-Open Patent Publication No. 2002-0072030 (published 2002.09.14), 'Encryption method using voice recognition in a mobile communication terminal', describes a technique in which a password is registered using a speech recognition function and the user is authenticated by comparing the pre-registered voice with the voice currently input through the microphone. Because this prior art accepts a speaker as the legitimate user only when the registered voice password and the input voice password match, even the legitimate user fails authentication when their voice has changed, for example because of a cold.

This invention was devised to solve the problems of the prior art described above, and relates to a system and method that authenticate a user's identity more strongly based on human voice characteristics.

To achieve this object, the voice-based authentication system according to this invention comprises: a storage unit that stores user identifier account information, mobile communication terminal information, and the user's voice pattern information in association with one another; an authentication string issuing unit that issues an authentication string to the computer device that requested voice-based authentication for the user identifier account, or to the mobile communication terminal; a voice pattern extraction unit that extracts a voice pattern from the speaker's voice signal input from the mobile communication terminal; a voice pattern verification unit that compares the speaker's voice pattern extracted by the voice pattern extraction unit with the user's voice pattern; a speech recognition unit that performs speech recognition on the speaker's voice signal; and an authentication string verification unit that compares the string recognized by the speech recognition unit with the authentication string issued by the authentication string issuing unit.

The voice-based authentication method according to this invention comprises: a first step in which an authentication server registers the voice pattern of the user corresponding to a user identifier; a second step in which, when a voice-based authentication request is received for the user identifier, the authentication server generates and issues an authentication string; a third step in which, when a speaker's voice signal is input through the mobile communication terminal corresponding to the user identifier, the authentication server extracts the speaker's voice pattern from that voice signal; a fourth step in which the authentication server verifies the speaker's voice pattern by comparing it with the user's voice pattern; a fifth step of performing speech recognition on the speaker's voice signal to extract the string the speaker uttered; and a sixth step of verifying the uttered string by comparing it with the authentication string issued in the second step.

As described above, this invention verifies that a voice signal with the pre-registered voice characteristics is input through the user's pre-registered mobile communication terminal, and can therefore authenticate the legitimate user more strongly. In addition, by issuing a random authentication string and comparing it with the string obtained by speech recognition of the voice signal input from the current speaker, it can authenticate more strongly that the speaker is the legitimate user.

FIG. 1 is a block diagram of the voice-based authentication system according to this invention.
FIG. 2 is a block diagram of the voice pattern extraction unit 153 according to this invention.
FIG. 3 is a flowchart of the voice-based authentication method according to this invention.
FIG. 4 is a flowchart of the process of registering the voice pattern of the user corresponding to a user identifier (S301) according to this invention.
FIG. 5 is a flowchart of the process of extracting the speaker's voice pattern (S305) according to this invention.

Hereinafter, a voice-based authentication system and method according to one embodiment of this invention are described in more detail with reference to the accompanying drawings.

In this specification, 'user' means the legitimate user of a given user identifier, and 'speaker' means a person who has not yet been authenticated as the legitimate user of that identifier. The specification assumes that, when a voice pattern is registered for voice-based authentication, it is the legitimate 'user' who registers it. After registration, however, a person who logs in with the user identifier during voice-based authentication is still called the 'speaker' until the voice pattern has been verified. A 'speaker' can be authenticated as the 'user' through voice pattern verification.

FIG. 1 is a block diagram of the voice-based authentication system according to this invention.

A user connects to the communication network 130 using a mobile communication terminal 110 or a computer device 120 and performs the authentication procedure according to this invention in order to receive an online service from the online service system 140. The computer device 120 covers various computing environments such as desktops and notebooks. The mobile communication terminal 110 covers ordinary feature phones as well as smartphones, which carry an operating system (OS) and can install and run various applications.

The online service system 140 is a web-based system that provides online services to many users over the communication network 130. It performs a login procedure in which it receives a user identifier and password through the mobile communication terminal 110 or computer device 120 that the user is using.

On a user's registration request, the authentication server 150 according to this invention receives, through the mobile communication terminal 110, the user's voice signal corresponding to a registration string, extracts a voice pattern from that signal, and registers it. Then, on a user's authentication request, it issues an authentication string, receives a voice signal through the mobile communication terminal, extracts a voice pattern, and verifies that it is the same as the registered voice pattern. It also performs speech recognition on the input voice signal and verifies that the extracted string is identical to the issued authentication string.

In other words, the authentication server 150 checks two things about the speaker currently requesting authentication. The first is whether the speaker's voice pattern is the same as the registered user's voice pattern. The second is whether the string corresponding to the speaker's voice signal is identical to the authentication string issued to that user. The first check verifies that the current speaker's voice is the user's own voice, which prevents malicious impersonation by others. However, if the server verified the user's voice pattern against the same string every time (that is, without verifying the string carried by the input voice signal), someone could record the user's utterance of that string in advance and feed the recording to the authentication server to be authenticated. The second check issues a random authentication string and verifies that the speaker utters it, which prevents malicious theft of the user identifier account and also prevents a single account from being shared.
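The role of the random authentication string can be illustrated with a minimal Python sketch. The specification does not say what the issued strings look like, so the six-digit format and the names `issue_challenge` and `spoken_text_matches` are assumptions made only for illustration.

```python
import secrets

DIGITS = "0123456789"  # assumed alphabet; the specification does not fix one


def issue_challenge(length: int = 6) -> str:
    """Return a fresh random string for the speaker to read aloud."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def spoken_text_matches(recognized: str, issued: str) -> bool:
    """Compare the recognized utterance with the issued string, ignoring spaces."""
    def normalize(text: str) -> str:
        return "".join(text.split())
    return normalize(recognized) == normalize(issued)
```

Because a new string is drawn for every request, a recording of an earlier utterance would carry the wrong digits and fail the comparison even if the voice itself matched.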

The configuration and operation of the authentication server are described in more detail below.

The authentication server 150 includes: a storage unit 151 that stores user identifier account information, mobile communication terminal information, and the user's voice pattern information in association with one another; an authentication string issuing unit 152 that issues an authentication string to the computer device of the speaker who requested voice-based authentication for the user identifier account, or to the mobile communication terminal, and stores it in the storage unit 151; a voice pattern extraction unit 153 that extracts the speaker's voice pattern from the speaker's voice signal input from the mobile communication terminal 110; a voice pattern verification unit 154 that compares the speaker's voice pattern extracted by the voice pattern extraction unit 153 with the user's voice pattern stored in the storage unit 151; a speech recognition unit 155 that performs speech recognition on the speaker's voice signal; and an authentication string verification unit 156 that compares the string recognized by the speech recognition unit 155 with the authentication string issued by the authentication string issuing unit 152.
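As a rough illustration of how the storage unit could keep account, terminal, and voice pattern information together with the issued authentication string, the following Python sketch uses an in-memory dictionary. The class and field names are hypothetical, the voice pattern is simplified to a flat list of numbers, and consuming the issued string on first read is an assumption rather than something the specification states.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserRecord:
    """One stored entry: account, terminal, and voice pattern kept together."""
    account_id: str
    terminal_number: str
    voice_pattern: List[float] = field(default_factory=list)
    pending_string: Optional[str] = None  # last issued authentication string


class Storage:
    """In-memory stand-in for the storage unit (hypothetical interface)."""

    def __init__(self) -> None:
        self._records: Dict[str, UserRecord] = {}

    def register(self, account_id: str, terminal_number: str,
                 voice_pattern: List[float]) -> UserRecord:
        record = UserRecord(account_id, terminal_number, list(voice_pattern))
        self._records[account_id] = record
        return record

    def save_issued_string(self, account_id: str, issued: str) -> None:
        self._records[account_id].pending_string = issued

    def take_issued_string(self, account_id: str) -> Optional[str]:
        # Assumption: the string is consumed on first use so it cannot be reused.
        record = self._records[account_id]
        issued, record.pending_string = record.pending_string, None
        return issued
```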

When a voice pattern is registered for a user identifier account, the voice pattern extraction unit 153 learns the voice pattern extracted from the user's voice signal and stores it in the storage unit 151. When a voice pattern is verified for a user identifier account, it compares the voice pattern extracted from the speaker's voice signal with the user's voice pattern stored in the storage unit 151 and, if the match rate is at or above a threshold, learns the user's voice pattern and updates the storage unit 151 with the learned pattern. The voice pattern verification unit 154 compares the speaker's voice pattern extracted by the voice pattern extraction unit 153 with the user's voice pattern stored in the storage unit 151 to authenticate whether the speaker is the user.

The speech recognition unit 155 applies conventional speech recognition technology to extract the string the user uttered from the user's voice signal. The authentication string verification unit 156 compares the string extracted by the speech recognition unit 155 with the authentication string stored in the storage unit 151 to verify the uttered string.

When speaker recognition by the voice pattern verification unit 154 and authentication string verification by the authentication string verification unit 156 both complete successfully, the computer device of the speaker who requested voice-based authentication for the user identifier account, or the mobile communication terminal, is allowed to use the online service. The two verifications may be performed sequentially or simultaneously.

The authentication string issuing unit 152 issues the authentication string to either the computer device 120 or the mobile communication terminal 110, but it receives the voice signal corresponding to that string through the mobile communication terminal 110. To send an authentication string to the mobile communication terminal 110, or to receive the corresponding voice signal from it, the terminal must be in an active state. If the mobile communication terminal 110 is an ordinary feature phone, or a smartphone with no device token registered for a push service, the authentication string issuing unit 152 sends an SMS for connecting to the authentication server 150 so that the authentication server 150 and the mobile communication terminal 110 can communicate.

If, on the other hand, the mobile communication terminal 110 is a smartphone with a device token registered for a push service, the authentication string issuing unit 152 uses the push server 160 to activate the terminal and receives the speaker's voice signal from it.

The push server is a service provided by the manufacturer of the mobile communication terminal 110. To receive a push service for a given application, the mobile communication terminal 110 first obtains a device token for that application from the push server 160. The push server 160 can then send a push message to the mobile communication terminal 110 to wake it up and activate the application corresponding to the device token (for example, the terminal's security authentication module).

That is, when the authentication server 150 wants to receive the voice signal for an authentication string from the mobile communication terminal 110, the authentication string issuing unit 152 passes the terminal's device token to the push server 160, and the push server 160 sends a push message to the mobile communication terminal 110 so that the terminal wakes up and communicates with the authentication server 150. As examples of push servers, iOS devices use APNs (Apple Push Notification Service), provided by Apple, and Android devices use C2DM (Cloud To Device Messaging), provided by Google.
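The choice between the two wake-up channels (SMS for feature phones and unregistered smartphones, push for smartphones with a device token) can be sketched as below. The `Terminal` class and `choose_wakeup` function are hypothetical names, and the sketch only selects a channel; it does not call any real SMS gateway or push service.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Terminal:
    phone_number: str
    device_token: Optional[str] = None  # set only if registered for push


def choose_wakeup(terminal: Terminal) -> Tuple[str, str]:
    """Return (channel, address) used to activate the terminal."""
    if terminal.device_token:
        # Smartphone with a registered token: wake it through the push server.
        return ("push", terminal.device_token)
    # Feature phone or unregistered smartphone: fall back to an SMS.
    return ("sms", terminal.phone_number)
```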

FIG. 2 is a block diagram of the voice pattern extraction unit 153 according to this invention.

The voice pattern extraction unit 153 includes: a sampling unit 21 that samples the input voice signal; an MFCC (Mel Frequency Cepstral Coefficients) unit 22 that extracts feature parameters of the speaker's voice from the sampled voice signal; a vector quantization unit 24 that maps the feature parameters into the code vector space of a codebook 23 to extract the feature vectors of the voice; and a DTW (Dynamic Time Warping) processor 25 that corrects the pattern of the feature vectors for nonlinear stretching and compression along the time axis to extract the speaker's voice pattern, and stores the extracted pattern in the storage unit 151.
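The last two stages of this pipeline can be sketched in plain Python. MFCC extraction is omitted here: the frames are assumed to be feature vectors that have already been computed, and the function names are illustrative. `quantize` maps each frame to its nearest codebook entry, and `dtw_distance` is the textbook dynamic time warping recurrence, which may differ from what the DTW processor in the specification actually does.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantize(frames: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each feature frame to the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda k: euclidean(frame, codebook[k]))
        for frame in frames
    ]


def dtw_distance(a: List[Vector], b: List[Vector]) -> float:
    """Accumulated cost of the best nonlinear time alignment of a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(a[i - 1], b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With this recurrence, an utterance that is merely spoken more slowly (some frames repeated) aligns with the original at zero extra cost, which is the time-axis stretching that the DTW stage is meant to absorb.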

The voice pattern extraction unit 153 further includes: a matching processor 26 that compares the match rate between the speaker's voice pattern and the user's voice pattern previously stored in the storage unit 151 against a threshold; a voice pattern training unit 27 that, when the match rate is at or above the threshold, trains the user's voice pattern to reflect the speaker's voice pattern; and an optimization unit 28 that optimizes the trained voice pattern and stores it in the storage unit 151.
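The specification does not define how the match rate is computed or how training works, so the following Python sketch fills both in with simple stand-ins that are assumptions only: patterns are fixed-length lists of numbers, the match rate is derived from the mean absolute difference, and "training" is a weighted blend of the stored pattern with the new one.

```python
from typing import List, Tuple


def match_rate(stored: List[float], candidate: List[float]) -> float:
    """Score in (0, 1]; 1.0 means the two patterns are identical."""
    if len(stored) != len(candidate):
        raise ValueError("patterns must have the same length")
    mean_diff = sum(abs(s - c) for s, c in zip(stored, candidate)) / len(stored)
    return 1.0 / (1.0 + mean_diff)


def update_if_matching(stored: List[float], candidate: List[float],
                       threshold: float = 0.8,
                       weight: float = 0.2) -> Tuple[List[float], bool]:
    """Blend the candidate into the stored pattern only when it matches."""
    if match_rate(stored, candidate) < threshold:
        return stored, False  # below threshold: keep the stored pattern as is
    blended = [(1 - weight) * s + weight * c for s, c in zip(stored, candidate)]
    return blended, True
```

Updating only on a successful match lets the stored pattern follow gradual changes in the user's voice while an impostor's utterance leaves it untouched.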

When registering a voice pattern, the user is asked to utter at least two (preferably five) sample strings. The user utters the sample strings using the pre-registered mobile communication terminal, and the user's voice signal is input to the voice pattern extraction unit 153. For the voice signal corresponding to the first sample string, a voice pattern is extracted through the sampling unit 21, MFCC unit 22, vector quantization unit 24, and DTW processor 25 and stored in the storage unit 151. The voice pattern extracted from the voice signal for the second sample string is compared in the matching processor 26 with the voice pattern stored in the storage unit 151. If the match rate is below a preset reference value, either the voice pattern is extracted again from that voice signal or the user is asked to utter the string again. If the match rate is not below the reference value, the voice pattern is trained and updated.

FIG. 3 is a flowchart of the voice-based authentication method according to this invention.

The authentication server registers the voice pattern of the user corresponding to a user identifier (S301). The detailed registration process is described later.

Thereafter, when a voice-based authentication request is received for the user identifier (S302), the authentication server generates an authentication string, issues it to the computer device or the mobile communication terminal, and requests that the string be uttered (S303). At this point the authentication server may also send a push message to the mobile communication terminal through the push server.

When the speaker's voice signal is input through the mobile communication terminal (S304), the authentication server extracts the speaker's voice pattern from it (S305). The user's voice pattern in step S301 and the speaker's voice pattern in step S305 are preferably extracted by the same method.

Next, the speaker's voice pattern is compared with the user's voice pattern (S306). If the match rate is greater than the threshold (S307), the user's voice pattern is learned and updated (S308), and speech recognition is performed on the speaker's voice signal to extract the string the speaker uttered (S309). The uttered string is compared with the authentication string issued in step S303 (S310). If the two strings match (S311), authentication is completed and the computer device or mobile communication terminal is allowed to use the online service (S312). If the two strings do not match (S311), use of the online service by the computer device and the mobile communication terminal is blocked (S313). Likewise, if the match rate between the speaker's voice pattern and the stored user's voice pattern is not greater than the threshold in step S307, use of the online service by the computer device and the mobile communication terminal is blocked.
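The decision flow of steps S306 to S313 can be sketched as a single Python function. The names are hypothetical, the match rate is assumed to have been computed already, and speech recognition is passed in as a callable so that it runs only after the voice pattern check has passed, as in the sequential variant of the flow.

```python
from enum import Enum
from typing import Callable


class Result(Enum):
    ALLOWED = "allowed"
    BLOCKED_VOICE = "blocked: voice pattern mismatch"
    BLOCKED_STRING = "blocked: string mismatch"


def run_authentication(match_rate: float, threshold: float,
                       recognize: Callable[[], str], issued: str) -> Result:
    """Apply the voice pattern check first, then the string check."""
    if match_rate <= threshold:          # S307 fails: rate not above threshold
        return Result.BLOCKED_VOICE
    # S308 (learning and updating the stored pattern) is omitted in this sketch.
    spoken = recognize()                 # S309: recognize the uttered string
    if spoken != issued:                 # S310-S311: compare with issued string
        return Result.BLOCKED_STRING     # S313
    return Result.ALLOWED                # S312
```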

FIG. 4 is a flowchart of the process of registering the voice pattern of the user corresponding to a user identifier (S301) according to this invention.

The authentication server outputs a sample string to the computer device or the mobile communication terminal and requests the user to utter the sample string (S401).

When the user's voice signal is input (S402), the input voice signal is sampled (S403), feature parameters of the user's voice are extracted from the sampled voice signal (S404), and the feature parameters are mapped into the code vector space of a codebook to extract feature vectors of the voice (S405). Next, the pattern of the extracted feature vectors is corrected in consideration of non-linear expansion and contraction on the time axis to generate the user's voice pattern (S406).
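The codebook mapping of step S405 can be illustrated with a minimal vector quantization sketch. The text does not specify how the mapping is performed. The nearest-neighbor rule with squared Euclidean distance used here is an assumption, and the function names and the toy codebook are hypothetical.

```python
def nearest_code_index(feature, codebook):
    """Return the index of the code vector closest to `feature`.

    Squared Euclidean distance is an assumption made for this sketch.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(feature, codebook[i]))


def quantize(features, codebook):
    """Map each per-frame feature parameter vector to a codebook index."""
    return [nearest_code_index(f, codebook) for f in features]


# Toy example: three 2-dimensional frames and a 3-entry codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
frames = [(0.1, -0.2), (0.9, 1.2), (3.0, 3.5)]
indices = quantize(frames, codebook)  # one codebook index per frame
```

In practice the frames would be the feature parameters from step S404 rather than hand-written tuples, and the codebook would be trained in advance.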

If no previously extracted voice pattern of the user exists (S407), that is, if the voice pattern was extracted from the first sample string, the extracted voice pattern of the user is stored (S408) and the process proceeds to step S409. If sample strings remain for voice pattern extraction in step S409, the process proceeds to step S402 and waits for input of the user's voice signal.

When the user's voice signal is input, steps S403 to S406 are performed to extract the voice pattern.

If a previously extracted voice pattern of the user exists in step S407, the newly extracted voice pattern is compared with the previously extracted voice pattern (S410). If the matching rate is greater than a preset reference value (S411), the user's voice pattern is trained, and the trained voice pattern is optimized and updated (S412). If the matching rate is not greater than the preset reference value (S411), the process is repeated from step S403 to extract the user's voice pattern again. Although not shown in the drawing, if the number of re-extractions reaches or exceeds a preset count, this may be recognized as a voice pattern extraction error and the user may be asked to utter the sample string again.

If sample strings remain for voice pattern extraction in step S409, the process proceeds to step S402 to extract the voice pattern for the next sample string. When voice pattern extraction has been completed for all sample strings in step S409, the user's voice pattern registration procedure is completed (S413).
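The registration loop of FIG. 4 (S402 to S413) can be sketched as follows. This is a hedged sketch under several assumptions: `capture_pattern`, `match_rate`, and `train` are hypothetical callables standing in for steps S402 to S406, S410, and S412, and raising an error after `max_retries` attempts is only one possible reading of the "voice pattern extraction error" handling mentioned in the text.

```python
def enroll(capture_pattern, sample_strings, match_rate, train,
           reference, max_retries=3):
    """Hypothetical sketch of the voice pattern registration procedure."""
    stored = None
    for text in sample_strings:                  # S409: loop over sample strings
        for _ in range(max_retries):
            pattern = capture_pattern(text)      # S402-S406: extract a pattern
            if stored is None:                   # S407: no earlier pattern yet
                stored = pattern                 # S408: store the first pattern
                break
            if match_rate(pattern, stored) > reference:  # S410-S411
                stored = train(stored, pattern)  # S412: train and update
                break
        else:
            # Assumption: give up after max_retries failed extractions.
            raise RuntimeError("voice pattern extraction error for " + repr(text))
    return stored                                # S413: registration complete
```

A caller would typically catch the error and prompt the user to utter the same sample string again.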

FIG. 5 is an operational flowchart illustrating the process (S305) of extracting the speaker's voice pattern according to the present invention.

The authentication server samples the input voice signal of the speaker (S51), extracts feature parameters of the speaker's voice from the sampled voice signal (S52), and maps the feature parameters into the code vector space of a codebook to extract feature vectors of the speaker's voice (S53). Next, the pattern of the extracted feature vectors is corrected in consideration of non-linear expansion and contraction on the time axis (S54), generating the speaker's voice pattern.
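The correction for non-linear expansion and contraction on the time axis corresponds to what the claims call DTW (Dynamic Time Warping). A minimal textbook DTW distance is sketched below. It is not taken from the patent: the function name is hypothetical, and the sequences are assumed to be lists of scalar values compared by absolute difference, purely to keep the example short.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two scalar sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allow stretching either sequence, or advancing both together.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With this alignment, the same utterance spoken at a different pace yields a small distance: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated middle value is absorbed by the warping path.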

110: mobile communication terminal 120: computer device
130: communication network 140: online service system
150: authentication server 151: storage unit
152: authentication string issuing unit 153: voice pattern extracting unit
154: voice pattern verification unit 155: speech recognition unit
156: authentication string verification unit

Claims

A voice-based authentication system comprising: a storage unit for storing user identifier account information, mobile communication terminal information, and voice pattern information of a user in association with each other; an authentication string issuing unit for issuing an authentication string to a computer device or the mobile communication terminal that has requested voice-based authentication for the user identifier account; a voice pattern extracting unit for extracting a voice pattern from a voice signal of a speaker input from the mobile communication terminal; a voice pattern verification unit for comparing the voice pattern of the speaker extracted by the voice pattern extracting unit with the voice pattern of the user; a speech recognition unit for performing speech recognition on the voice signal of the speaker; and an authentication string verification unit for comparing the character string recognized by the speech recognition unit with the authentication string issued by the authentication string issuing unit,
wherein the voice pattern extracting unit includes a sampling unit for sampling the voice signal of the speaker, a Mel Frequency Cepstral Coefficients (MFCC) processor for extracting feature parameters of the speaker's voice from the sampled voice signal, and a DTW (Dynamic Time Warping) processor for extracting the voice pattern of the speaker by correcting the pattern of the feature vector in consideration of non-linear expansion and contraction on the time axis.

The voice-based authentication system according to claim 1, wherein the authentication character string issuing unit activates the mobile communication terminal using a push server.

delete

The voice-based authentication system according to claim 1, wherein the voice pattern extracting unit comprises: a matching processor for comparing the matching rate between the voice pattern of the speaker and the voice pattern of the user with a threshold value; and a voice pattern training unit for training the voice pattern of the user.

A voice-based authentication method comprising: a first step in which an authentication server registers a voice pattern of a user corresponding to a user identifier; a second step in which the authentication server generates and issues an authentication string when a voice-based authentication request is received for the user identifier; a third step in which the authentication server extracts a voice pattern of a speaker from a voice signal of the speaker when the voice signal is input through the corresponding mobile communication terminal; a fourth step of verifying the voice pattern of the speaker by comparing the voice pattern of the speaker with the voice pattern of the user; a fifth step of performing speech recognition on the voice signal of the speaker to extract the character string uttered by the speaker; and a sixth step of verifying the character string uttered by the speaker by comparing it with the issued authentication string,
wherein the third step comprises a thirty-first sub-step in which the authentication server samples the voice signal of the speaker, a thirty-second sub-step of extracting feature parameters of the speaker's voice from the sampled voice signal, a thirty-third sub-step of extracting feature vectors of the voice by mapping the feature parameters into the code vector space of a codebook, and a thirty-fourth sub-step of generating the voice pattern of the speaker by correcting the pattern of the extracted feature vectors in consideration of non-linear expansion and contraction on the time axis.

The method according to claim 5, wherein the authentication server transmits a push message to the mobile communication terminal through a push server.

The method according to claim 5, wherein the first step comprises:
an eleventh step in which the authentication server outputs a sample string;
a twelfth step in which, when the voice signal of the user is input, the authentication server samples the voice signal of the user;
a thirteenth step of extracting feature parameters of the user's voice from the sampled voice signal;
a fourteenth step of mapping the feature parameters of the user's voice into the code vector space of a codebook to extract feature vectors of the voice; and
a fifteenth step of generating the user's voice pattern by correcting the pattern of the extracted feature vectors in consideration of non-linear expansion and contraction on the time axis.

The method according to claim 7, wherein at least two voice patterns of the user are generated and learned by performing the eleventh to fifteenth steps for at least two sample strings.

delete

The method according to claim 5, wherein, in the fourth step, if the matching rate between the voice pattern of the speaker and the voice pattern of the user is greater than a preset threshold value, the method further comprises the step of: