KR20010110964A

KR20010110964A - The method for verifying users by using voice recognition on the internet and the system thereof

Info

Publication number: KR20010110964A
Application number: KR1020000031002A
Authority: KR
Inventors: 이기용
Original assignee: 강희태; 주식회사 웹프로텍
Priority date: 2000-06-07
Filing date: 2000-06-07
Publication date: 2001-12-15

Abstract

The present invention uses voice to authenticate Internet service users on the Internet, thereby securing user information reliably at low cost.

The method of the present invention for authenticating a user by voice on the Internet includes the steps of registering a user's voice, receiving the voice of a user who requests authentication, comparing the voice received from the user with the registered voice, and storing the comparison result in a database.

Description

The method for verifying users by using voice recognition on the internet and the system thereof

The present invention relates to a method and system for registering and authenticating a user by voice, and more particularly to a method and system for registering and authenticating an Internet service user by voice over the Internet.

Internet use has already spread beyond a few specific groups to the general public, and the Internet market, unlike the already saturated real-world market, still offers room for development and expansion. At the same time, security and authentication are becoming increasingly important for electronic commerce conducted over networks.

Until now, users on networks such as the Internet have mostly been authenticated with a database-backed ID/Password scheme. In this scheme, an Internet service user accesses the initial site of the Internet service provider and enters an ID and Password; the provider stores the user's ID, Password, and related data in the database of its own system and authenticates the user against the stored ID and Password.

Another user authentication method that has recently attracted attention works as follows: an Internet service user submits a certificate issued by a certification authority to the Internet service provider, and the provider reads the submitted certificate to authenticate the user.

With the conventional ID/PASSWORD method described above, an impersonator who steals an ID and PASSWORD can easily use another person's credentials. This is a serious security problem, particularly for users of financial systems such as banks.

Biometric information such as fingerprints, voice, and faces, on the other hand, is unique to each person and cannot be lost or forgotten, so it is useful for identification and identity verification. Among these, voice recognition has a cost advantage because it can use existing, widely deployed equipment as its hardware.

In view of the above, the present invention registers and authenticates users by voice over a network such as the Internet, thereby securing user information reliably at low cost.

FIG. 1 is a schematic diagram of one embodiment of a network system on the Internet according to the present invention.

FIG. 2 is a block diagram of the voice authentication server of the present invention.

FIG. 3 is a flowchart showing the process for registering a user's voice.

FIG. 4 is a block diagram showing the initial Internet access screen on which a user registers a voice using the voice authentication system of the present invention.

FIG. 5 is a block diagram showing the process by which a user's voice is registered in the speaker DB in the voice authentication system of the present invention.

FIG. 6 is a flowchart showing the process for authenticating a user's voice.

FIG. 7 is a block diagram showing the initial Internet access screen on which a user authenticates by voice using the voice authentication system of the present invention.

FIG. 8 is a block diagram showing the process by which a user's voice is authenticated in the voice authentication system of the present invention.

FIG. 9 is a schematic diagram of another embodiment of a network system on the Internet in which the voice authentication system of the present invention is used.

The authentication system of the present invention is broadly divided into a voice authentication service provider side and a voice authentication service user side. The voice authentication service provider includes a voice authentication server, which performs the voice recognition and speaker authentication functions, and a relay module, which relays communication between the voice authentication server and the voice authentication service user side. The voice authentication service users include Internet service users and Internet service providers.

The voice authentication server performs three main functions: voice recognition, speaker registration, and speaker authentication. Voice recognition is performed to guard against an impersonator who has arbitrarily recorded the user's voice. For example, the server requests input of a predetermined number of spoken digits, and voice recognition succeeds if the spoken digits match the requested digits (voice recognition). When voice recognition succeeds, the feature parameters of the voice of the user requesting registration are stored in the speaker DB (speaker registration). When the user later sends an authentication request, voice recognition is performed as in speaker registration; if it succeeds, a similarity is computed from the features of the input voice and the voice feature parameters stored in the speaker DB. The similarity is compared with a predetermined threshold: if it is greater than or equal to the threshold, speaker authentication succeeds, and if it is less than the threshold, speaker authentication fails (speaker authentication or voice authentication).
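The two-stage decision described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the string outcomes, and the idea that the recognized digits and the similarity score arrive as ready-made inputs are all assumptions made for the example.

```python
from typing import Sequence


def digits_match(requested: Sequence[int], recognized: Sequence[int]) -> bool:
    """Voice recognition stage: the spoken digits must equal the requested digits."""
    return list(requested) == list(recognized)


def authenticate(requested: Sequence[int],
                 recognized: Sequence[int],
                 similarity: float,
                 threshold: float) -> str:
    """Return a hypothetical outcome label for one authentication attempt.

    'retry'    : the utterance was not the requested one (possible replayed recording)
    'accepted' : similarity is greater than or equal to the threshold
    'rejected' : similarity is below the threshold
    """
    if not digits_match(requested, recognized):
        return "retry"
    return "accepted" if similarity >= threshold else "rejected"
```

Because the requested digits can change on every attempt, a fixed recording of the user's voice is unlikely to pass the first stage, which is the purpose the text gives for the voice recognition step.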

One aspect of the present invention relates to a method for authenticating a user by voice on the Internet, comprising the steps of: registering a user's voice; receiving the voice of a user who requests user authentication; comparing the voice received from the user with the registered voice; and storing the comparison result in a database.

In another aspect of the present invention, in the above method for authenticating a user by voice on the Internet, the step of receiving the voice of the user who requests authentication comprises requesting a predetermined utterance from a user who wishes to log on to an Internet service of an Internet service provider, and receiving the voice of the user who requests authentication in order to log on to the Internet service; and the method further comprises controlling the user's logon to the Internet service of the Internet service provider according to the result stored in the database.

In yet another aspect of the present invention, the above method further comprises requesting a predetermined utterance from the user who requests authentication, and the step of comparing the voice received from the user with the registered voice comprises: determining whether the voice received from the user is the predetermined utterance requested of the user; if it is not determined to be the requested utterance, requesting the predetermined utterance from the user again; and if it is determined to be the requested utterance, comparing the voice received from the user with the registered voice.

In yet another aspect of the present invention, the above method further comprises registering the user's ID and PASSWORD, and receiving the ID and PASSWORD of a user who wishes to log on to the Internet service of the Internet service provider; and the step of receiving the voice of the user who wishes to log on to the Internet service receives the user's voice only if the received ID and PASSWORD match the registered ID and PASSWORD.

In yet another aspect of the present invention, in the above method, the step of controlling the user's logon to the Internet service of the Internet service provider according to the comparison result stored in the database comprises transmitting the comparison result to the Internet service provider to which the user wishes to log on.

Yet another aspect of the present invention relates to a method for controlling a user's logon to an Internet service by voice on the Internet, comprising the steps of: registering a user's voice; receiving the voice of a user who requests the Internet service; comparing the received voice with the registered voice; and controlling the user's logon to the Internet service according to the comparison result.

Yet another aspect of the present invention relates to a system for authenticating a user by voice on the Internet, comprising: a storage device including a basic information database storing the user's ID and PASSWORD, a speaker database storing the features of the user's voice, a voice information database used for voice recognition, and an authentication result database storing authentication results; a voice recognition module that requests a predetermined utterance from a user requesting speaker authentication, receives the voice from the user, and determines whether the received utterance is the requested utterance; and a speaker authentication module that, when the voice recognition module determines that the voice received from the user is the requested utterance, determines whether the similarity between the features of the voice received from the user and those of the voice stored in the speaker database exceeds a predetermined threshold, and stores the determination result in the authentication result module.

In yet another aspect of the present invention, the above system for registering and authenticating a user by voice on the Internet further comprises a relay module, through which all communication with the Internet service provider and the user takes place.

The present invention is described in detail below with reference to the drawings.

FIG. 1 shows the configuration of one embodiment of a network system according to the present invention. It includes a voice authentication service provider 10 comprising a voice authentication server 100 and a relay module 101, an Internet service user (hereinafter 'user', 102), and an Internet service provider (hereinafter 'provider', 103). The voice authentication server, the Internet service user, and the Internet service provider are connected through a communication network. Here the network means the Internet, but other networks, such as a LAN, WAN, PSTN (Public Switched Telephone Network), PSDN (Public Switched Data Network), cable TV network, or wireless communication network, are of course also possible. The voice authentication server 100 registers a user's voice at the request of a user who wishes to use the voice authentication system, and provides the authentication service when the user requests authentication. To prevent unauthorized intrusion by Internet service users and Internet service providers, neither can access the voice authentication server 100 directly; their communication with the server always passes through the relay module 101. The Internet service user 102 registers by transmitting personal information and voice over the communication network and, when requesting voice authentication, transmits the utterance requested by the voice authentication server and receives the authentication result. When authentication by the voice authentication server succeeds, the Internet service provider 103 receives the Internet service user's personal information and the authentication result from the voice authentication server.

FIG. 2 is a block diagram schematically showing the configuration of the voice authentication service provider 10 of the present invention. The voice authentication server 100 includes: a control module comprising a voice recognition control module 201, a speaker registration control module 202, and a speaker authentication control module 203; a user personal information DB 204 storing information such as the user's name, resident registration number, ID, and PASSWORD; a voice information DB 205 storing general voice pattern information for the voice recognition function; a speaker DB 206 storing feature parameters extracted from the user's voice; a function DB 207 storing the functions used for voice recognition, speaker registration, and speaker authentication (for example, functions that receive data from the user, functions that generate and transmit all messages to the user, functions for extracting parameters from the user's voice, user voice DB management functions, and functions related to the voice authentication algorithm); and an authentication result DB 208 storing the user's voice authentication results.

When a voice is received from an Internet service user, the voice recognition control module 201 determines whether the received voice is the one requested by the voice authentication server. If it is, the module passes the data to the next processing module, such as speaker registration or speaker authentication; if it is not, the module controls re-requesting the voice from the user or outputting the voice recognition result.

The speaker registration control module 202 controls extracting the user's individual feature parameters from the voice of a user whose voice recognition succeeded and storing them in the per-speaker DB 206. The speaker authentication control module 203 controls authenticating the voice of a user requesting voice authentication, using the data stored in the per-speaker DB 206.

The relay module 101 relays data output from the voice authentication server 100 and transmits it over the network to the Internet service user and the Internet service provider; it also receives data from the Internet service user over the network and forwards it to the voice authentication server 100. In relaying communication between the voice authentication server and the Internet service user, the relay module checks every packet against the message format defined by the authentication system and discards any packet that does not conform. In addition, when the system is configured with multiple voice authentication servers, the relay module can distribute the communication connections with Internet service users according to the load on the servers.
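The relay behaviour described above (drop packets that do not match the defined message format, then send the connection to the least loaded server) might look like the sketch below. The patent does not specify the message format or the load metric, so the packet fields, the set of message types, and the per-server connection counts are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical message types; the patent only says a format is "defined".
ALLOWED_TYPES = {"REGISTER", "AUTH", "VOICE"}


def is_defined_message(packet: dict) -> bool:
    """Accept only packets with a known string type and a bytes payload."""
    msg_type = packet.get("type")
    return (isinstance(msg_type, str)
            and msg_type in ALLOWED_TYPES
            and isinstance(packet.get("payload"), bytes))


def pick_server(loads: Dict[str, int]) -> str:
    """Choose the server with the fewest active connections (assumed load metric)."""
    return min(loads, key=loads.get)


def relay(packet: dict, loads: Dict[str, int]) -> Optional[str]:
    """Return the chosen server name, or None if the packet is discarded."""
    if not is_defined_message(packet):
        return None
    server = pick_server(loads)
    loads[server] += 1
    return server
```

Keeping the format check in the relay means malformed traffic never reaches the authentication server, which matches the stated goal of preventing direct access to it.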

The speaker registration process, in which a user's voice is registered, is described below with reference to FIGS. 3 to 5, and the speaker authentication process, in which a speaker is authenticated using the registered voice, is described with reference to FIGS. 6 to 8.

FIG. 3 is a flowchart of speaker registration. The voice authentication server receives a speaker registration request from the user (step 301) and asks the user to enter personal information (step 302). The server then receives the personal information entered by the user (step 303), stores it in the user personal information DB, and requests the utterance data to be learned from the user (step 304). Next, the server receives the utterance data from the user (step 305) and uses the voice recognition function to determine whether it is the utterance data the server requested (step 306). The voice recognition function is described in detail with reference to FIG. 5. If recognition fails, that is, if the received utterance data is determined not to be the requested data, the server again requests the utterance data to be learned (step 304). If recognition succeeds, that is, if the received utterance data is determined to be the requested data, the server extracts feature parameters from it (step 307) and stores them in the per-speaker DB 206 (step 308). By extracting and storing the features of the user's voice in this way, the voice authentication server completes speaker registration.
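The registration flow of steps 303 to 308 can be sketched as a single loop. The `recognize` and `extract_features` callables, the dictionary-based stores, and the `"id"` key are stand-ins introduced for illustration; the patent does not define these interfaces.

```python
from typing import Callable, Dict, Iterable, List


def register_speaker(personal_info: dict,
                     utterances: Iterable,
                     requested_digits: List[int],
                     recognize: Callable,
                     extract_features: Callable,
                     personal_db: Dict[str, dict],
                     speaker_db: Dict[str, object]) -> bool:
    """Store personal info, then keep taking utterances until one is recognized.

    Returns True once features are stored, False if the utterances run out.
    """
    user_id = personal_info["id"]            # hypothetical key
    personal_db[user_id] = personal_info     # step 303: store personal information
    for audio in utterances:                 # steps 304-305: request and receive
        if recognize(audio) == requested_digits:            # step 306
            speaker_db[user_id] = extract_features(audio)   # steps 307-308
            return True
        # recognition failed: fall through and take the next utterance (back to 304)
    return False
```

In the flowchart the server simply asks again after a failure; here the retry is modelled by moving to the next item of `utterances`.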

To explain the process of receiving the user's personal information and voice data (steps 302 to 305), FIG. 4 shows an initial screen on which the user can enter personal information and voice data. When a user who wishes to register as a speaker connects to the voice authentication server, the initial screen of FIG. 4 is displayed on the display device of the user's computer. This embodiment takes as an example the case where the user connects directly to the voice authentication server, but the initial screen of the voice authentication server shown in FIG. 4 may instead be linked and shown after the user connects to the web page of the desired Internet service provider. The user enters data in a form (FORM) in the user information input menu that accepts an ID, PASSWORD, name, resident registration number, and account number. After entering all personal information, the user clicks the record icon in the voice input menu and inputs his or her voice. Following the instructions at the top of the voice input menu, the user reads the digits 0 through 9 aloud three times and, when finished, clicks the stop icon to end the recording. The play icon can be used when the user wishes to listen to the recording. After entering both the personal information and the voice, the user clicks the exit icon at the upper right of the screen, and the personal information data and voice data are then transmitted to the voice authentication server through the relay module 101. In practice, when recording is complete, wave-format data is generated first; once raw data has been successfully generated by removing the DC component, the silent sections, and the header information of the wave-format data, the personal information data and the raw data representing the voice information are transmitted.
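The conversion from wave-format data to raw data could be approximated as below, using only the Python standard library. Several points are assumptions rather than details from the patent: 16-bit mono PCM input, DC removal by subtracting the mean, and silence handling reduced to trimming leading and trailing samples under a fixed amplitude threshold.

```python
import io
import struct
import wave
from typing import List


def read_pcm16(wav_bytes: bytes) -> List[int]:
    """Strip the wave header and return the 16-bit mono samples."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("this sketch assumes 16-bit mono PCM")
        frames = w.readframes(w.getnframes())
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))


def remove_dc(samples: List[float]) -> List[float]:
    """Remove the DC component by subtracting the mean sample value."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def trim_silence(samples: List[float], threshold: float) -> List[float]:
    """Drop leading and trailing samples whose magnitude is at or below threshold."""
    voiced = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not voiced:
        return []
    return samples[voiced[0]:voiced[-1] + 1]


def wav_to_raw(wav_bytes: bytes, threshold: float = 500.0) -> List[float]:
    """Header removal, DC removal, then silence trimming."""
    return trim_silence(remove_dc(read_pcm16(wav_bytes)), threshold)
```

A production endpoint detector would normally work on short-time frame energy rather than single samples; the per-sample threshold is used here only to keep the example short.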

At the bottom of the voice input menu is a status bar that shows the state of the current operation. For example, after the voice authentication server has received the user's personal information and voice information and performed voice recognition, the status bar may display "Voice input succeeded" if recognition succeeded or "Voice input failed" if it failed. If recognition fails, the user repeats the voice input process and transmits the voice data to the voice authentication server 100 through the relay module.

Next, the process of registering the voice received from the user in the speaker DB is described with reference to FIG. 5. The preprocessing process 503 and the similarity measurement 504 in FIG. 5 perform recognition of the user's voice, and the pattern learning process 505 performs speaker registration, registering the user's voice in the speaker DB 207. The preprocessing process 501 and the pattern learning process 502 are shown to illustrate how the voice information DB 204 used for voice recognition is built.

First, the preprocessing process 501 receives a sound signal and obtains the features of the speech in it. It detects the start and end points in the input sound signal and removes the non-speech portions to find the speech portion (Endpoint Detection); it then divides the speech signal obtained from Endpoint Detection into intervals to form speech frames and finds the feature values for each interval (LPC-Cep (linear prediction coefficient cepstrum) feature extraction). Feature extraction proceeds in the order pre-emphasis, Hamming window, linear prediction coefficients (LPC), and then LPC cepstrum. The pattern learning process 502 builds a probability model from the speech feature values obtained in preprocessing. It initializes the CHMM algorithm used to establish the probability model of speech (CHMM (Continuous Hidden Markov Model) model initialization), computes the probability model using the CHMM (Training process), checks whether the computed model has converged to a stable state, and, once convergence is confirmed, stores the CHMM model parameters λ(π, A, C, μ, Σ) of the speech information in the voice information DB 204.
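The feature extraction chain named above (pre-emphasis, Hamming window, LPC, LPC cepstrum) can be sketched in plain Python. The pre-emphasis coefficient 0.95, the autocorrelation method with the Levinson-Durbin recursion, and the function names are common textbook choices assumed here; the patent does not state which variants it uses.

```python
import math
from typing import List


def pre_emphasis(x: List[float], alpha: float = 0.95) -> List[float]:
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies."""
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(frame: List[float]) -> List[float]:
    """Apply a Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N-1))."""
    n_len = len(frame)
    if n_len < 2:
        return list(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1)))
            for n, s in enumerate(frame)]


def autocorr(frame: List[float], order: int) -> List[float]:
    """Autocorrelation values r[0..order] of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r: List[float], order: int) -> List[float]:
    """Predictor coefficients a[1..p] with x[n] ~ sum_k a[k] * x[n-k]."""
    if r[0] == 0:
        return [0.0] * order      # silent frame: nothing to predict
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]


def lpc_cepstrum(a: List[float], n_ceps: int) -> List[float]:
    """c[m] = a[m] + sum_{k=1}^{m-1} (k/m) * c[k] * a[m-k], with a[m] = 0 for m > p."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(1, m):
            if m - k <= p:
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]
```

For one frame the chain would be `lpc_cepstrum(levinson_durbin(autocorr(hamming(pre_emphasis(frame)), p), p), n_ceps)`, where the LPC order `p` and the number of cepstral coefficients `n_ceps` are tuning choices left open by the text.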

When the voice data of a user who wishes to register as a speaker is input to the preprocessing process 503, the speech portion of the input data is found (Endpoint Detection) and feature values are extracted from the speech signal (LPC-Cep feature extraction). The similarity measurement 504 uses the previously established probability model to measure how similar the input speech signal is; that is, it determines how well the feature parameters X(t) received from the preprocessing process 503 match the standard patterns of the parameters λ stored in the voice information DB 204. For example, if the λ stored in the voice information DB are for the spoken digits 0 to 9, X(t) is matched against each of the digits 0 to 9, and the digit with the highest likelihood, Max(arg Likelihood), is taken as the recognition result.

For example, if voice recognition is being performed for "3" but the input speech is "4", Max(arg Likelihood) for the input will occur at 4. Because the input is not recognized as "3", voice recognition fails.
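The maximum-likelihood decision in this example reduces to an argmax over per-digit scores. The sketch below assumes the CHMM scoring has already produced one log-likelihood per digit model; the scoring itself is not shown, and the numbers are made up for illustration.

```python
from typing import Dict


def recognize_digit(scores: Dict[int, float]) -> int:
    """Return the digit whose model gives the highest likelihood."""
    return max(scores, key=scores.get)


def recognition_succeeds(requested: int, scores: Dict[int, float]) -> bool:
    """Recognition succeeds only if the best-scoring digit is the requested one."""
    return recognize_digit(scores) == requested


# Illustrative log-likelihoods for an utterance of "4" when "3" was requested.
example_scores = {3: -120.0, 4: -95.5, 5: -130.2}
```

With these scores the best match is 4, so a request for "3" is treated as a recognition failure and the user is asked to speak again.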

If voice recognition fails in the above process, voice data is input again from the user and the preprocessing and similarity measurement are repeated. If voice recognition succeeds, the pattern learning process (505) is performed to obtain, from the user's input speech, the information to be used for speaker authentication. The pattern learning process (505) builds a probabilistic model from the feature values obtained in the preprocessing process (503). Because it builds a model that brings out the voice characteristics of the individual user, its model differs from that of the pattern learning process (502), which builds a model containing the general voice characteristics of many people. The pattern learning process (505) initializes the GMM (Gaussian Mixture Model) algorithm used to establish the probabilistic model of the speech (GMM model initialization) and computes the model with the GMM (training). The purpose of training is to estimate, given training speech from the speaker, the GMM parameters λ that best fit the distribution of the feature vectors. Next, it is checked whether the computed model has converged to a stable state, and once convergence is confirmed the GMM model parameters λ(p, μ, Σ), where p is the mixture weight, μ is the mean vector, and Σ is the covariance matrix, are stored in the speaker DB (206). Through this process the voice characteristics of the individual user are registered in the speaker DB (206).
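As a rough illustration of the training loop with a convergence check, the sketch below fits a GMM with the expectation-maximization (EM) procedure. EM is an assumption here, since the specification does not name the estimation algorithm. For brevity the sketch uses scalar features and scalar variances in place of the feature vectors and covariance matrices Σ of the description; the tolerance, iteration limit, and variance floor are likewise assumed values.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(math.log(sum(w * gaussian_pdf(x, m, v)
                            for w, m, v in zip(weights, means, variances)))
               for x in data)

def train_gmm(data, initial_means, tol=1e-6, max_iter=200, var_floor=1e-3):
    """Estimate (p, mu, var) for a 1-D GMM; stop when the likelihood converges."""
    k = len(initial_means)
    weights = [1.0 / k] * k          # model initialization
    means = list(initial_means)
    variances = [1.0] * k
    prev = log_likelihood(data, weights, means, variances)
    for _ in range(max_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for i in range(k):
            n_i = sum(r[i] for r in resp)
            weights[i] = n_i / len(data)
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / n_i
            spread = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / n_i
            variances[i] = max(var_floor, spread)
        # Convergence check: stop once the likelihood no longer improves.
        cur = log_likelihood(data, weights, means, variances)
        if cur - prev < tol:
            break
        prev = cur
    return weights, means, variances
```

The returned tuple corresponds to the parameters λ(p, μ, Σ) that the description stores in the speaker DB, under the simplifications stated above.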

FIG. 6 is a flowchart of speaker authentication. The voice authentication server receives a speaker authentication request from the user (step 601) and asks the user for a predetermined utterance (step 602). The requested utterance may consist of letters or of digits. To prevent impersonation by means of a recording of the user's voice registered in the speaker DB, the combination of letters or digits that the voice authentication server requests is determined by a random number. The voice authentication server receives the user's utterance (step 603) and extracts feature parameters from the received utterance data (step 604). Next, voice recognition is used to determine whether the utterance received from the user is the requested utterance (step 605). If recognition fails, the user is again asked for a predetermined utterance. If recognition succeeds, the extracted feature parameters are compared with the feature parameters stored in the speaker DB (206) to determine whether there is a match. If authentication fails, an "authentication failed" message is displayed on the display device of the user's computer; if it succeeds, the personal information of the user who requested speaker authentication and the authentication result are transmitted to the Internet service provider, and an "authentication succeeded" message is displayed to the Internet service user. Speaker authentication is performed through this process.
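The random selection of the requested utterance might look like the following sketch. The prompt length, the digit alphabet, and the function name are assumptions, and Python's `secrets` module is used here simply as one source of unpredictable random numbers.

```python
import secrets

def make_prompt(length=4, alphabet="0123456789"):
    """Pick a random sequence of symbols for the user to read aloud."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(make_prompt())  # a fresh digit sequence on each call
```

Because the sequence changes with every request, a recording of an earlier session is unlikely to contain the utterance that the server asks for.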

FIG. 7 shows the initial screen on which a user who has registered as a speaker can input voice data to request speaker authentication. When a user who wishes to be authenticated connects to the voice authentication server, the initial screen shown in FIG. 7 is displayed on the display device of the user's computer. As with speaker registration, the user may instead first access the web page of the desired Internet service provider and from there reach the initial screen of the voice authentication server shown in FIG. 7. The user enters his or her own ID and password in the form provided in the user input menu. Next, the user clicks the record icon in the voice input menu, reads aloud the digits (***) displayed in the voice input menu, and clicks the stop icon when finished. When the user then clicks the authentication icon, the ID and password entered by the user and the user's voice are transmitted over the network, via the relay module, to the voice authentication server (100).

FIG. 8 shows the process by which the voice authentication server, having received the voice of a user requesting speaker authentication, performs speaker authentication. Recognition of the user's speech is performed in the similarity measurement (802), and speaker authentication is performed in the test process (803).

When the user's voice S(n) is received, the user's speech is recognized in the same way as in the preprocessing process (503) and similarity measurement (504) of FIG. 5. If voice recognition fails, the user is asked to speak again; if it succeeds, the test process (803) for speaker authentication is performed. The test process (803) takes as input the feature parameters X(t) of the user's voice obtained in the preprocessing process (801) and the GMM model parameters λ(p, μ, Σ) stored in the speaker DB (206), matches them to obtain a similarity, and compares that similarity with the threshold obtained during speaker registration (the limit above which the feature parameters of the input voice can be judged to belong to the GMM model stored in the speaker DB). If the similarity is equal to or higher than the threshold, the speakers are judged to match and speaker authentication is classified as a success; if it is lower than the threshold, the speakers are judged not to match and speaker authentication is classified as a failure. The result of speaker authentication is stored in the authentication result DB (207).
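The threshold test can be sketched as follows. The sketch assumes that the similarity is the average per-frame log-likelihood of the feature vectors under the speaker's GMM and that the covariance matrices are diagonal; neither detail is stated in the specification, and the model values, threshold, and function names are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood, computed with the log-sum-exp trick."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def authenticate(frames, model, threshold):
    """Success if the similarity is equal to or higher than the threshold."""
    weights, means, variances = model
    return gmm_avg_log_likelihood(frames, weights, means, variances) >= threshold

# Hypothetical two-component model and threshold.
model = ([0.5, 0.5], [[0.0, 0.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])
print(authenticate([[0.1, -0.1], [2.9, 3.1]], model, threshold=-4.0))
print(authenticate([[10.0, 10.0]], model, threshold=-4.0))
```

Frames close to the enrolled model score above the threshold, while frames far from it score below and are rejected.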

In this way, according to the authentication result stored in the authentication result DB, an "authentication succeeded" or "authentication failed" message is transmitted to the Internet service user through the relay module and the network, and the authentication result can also be transmitted to the Internet service provider whose service the user wishes to use. One specific example of transmitting the authentication result to the Internet service user and the Internet service provider is as follows. After performing speaker authentication, the voice authentication server sends the result to the user in the form of a packet; if the result is a success, the user is connected to the web page of the Internet service provider. Meanwhile, only when the authentication result is a success does the voice authentication server store the user's ID, the time at which the result was stored, the user's IP address, and so on in the authentication result DB. When, following transmission of the authentication result by the voice authentication server, the user's Internet browser accesses the web page of the Internet service provider, the Internet service provider queries the voice authentication server with the user's IP address to search the authentication result DB. If a stored IP address matches the IP address of the user attempting to connect, the stored time is checked: if the difference between the stored time and the current time is within, for example, 30 seconds, the user's access logon is allowed; otherwise it is blocked. In addition, the voice authentication server deletes from the authentication result DB every authentication result older than 30 seconds, regardless of whether access was permitted or refused.
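The time-limited lookup described in this example can be sketched with an in-memory store, as below. The class and method names are hypothetical, a dictionary stands in for the authentication result DB, and timestamps are passed in explicitly as seconds so that the behavior is easy to follow.

```python
class AuthResultStore:
    """Keeps successful authentication results for a short time window."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._records = {}  # IP address -> (user ID, time stored)

    def record_success(self, user_id, ip, now):
        """Store a result; only successful authentications are recorded."""
        self._records[ip] = (user_id, now)

    def purge(self, now):
        """Delete every result older than the time window."""
        self._records = {ip: rec for ip, rec in self._records.items()
                         if now - rec[1] <= self.ttl}

    def allow_logon(self, ip, now):
        """Allow logon only if a fresh result exists for this IP address."""
        self.purge(now)
        return ip in self._records

store = AuthResultStore()
store.record_success("alice", "203.0.113.5", now=100.0)
print(store.allow_logon("203.0.113.5", now=120.0))  # within 30 seconds
print(store.allow_logon("203.0.113.5", now=131.0))  # window has elapsed
```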

FIG. 9 shows another embodiment of a network system on the Internet in which the voice authentication system of the present invention is used. Unlike the embodiment shown in FIG. 1, in the embodiment of FIG. 9 the Internet service provider (103) includes the voice authentication server (100) that performs voice authentication. Accordingly, when the Internet service provider (103) receives a voice authentication request from the Internet service user (102) over the network, it performs voice authentication with the voice authentication server (100) and then uses the result to allow or deny the user's access logon. In this embodiment the Internet service provider (103) itself holds the authentication result of the voice authentication server, so, unlike in the embodiment shown in FIG. 1, there is no need to transmit the authentication result to the Internet service provider (103) over the network.

The order of the steps in the above methods may be changed in carrying out the methods. In addition, various changes and modifications may be made by those skilled in the art to which the present invention pertains; the present invention is not limited to the examples described above but is defined by the scope of the claims.

According to the present invention, registering and authenticating users by voice over a network such as the Internet makes it possible to secure user information reliably without incurring high cost.

Claims

A method of authenticating a user using voice on the Internet, comprising:

registering the user's voice;

receiving the voice of a user who desires user authentication;

comparing the voice received from the user with the registered voice; and

storing the result of the comparison in a database.

The method of claim 1,

wherein receiving the voice of the user who desires user authentication comprises:

requesting a predetermined utterance from a user who wishes to log on to an Internet service of an Internet service provider; and

receiving the voice of the user who desires user authentication in order to log on to the Internet service,

the method further comprising controlling the user's logon to the Internet service of the Internet service provider according to the result stored in the database.

The method of claim 1,

further comprising requesting a predetermined utterance from the user who desires user authentication,

wherein comparing the voice received from the user with the registered voice comprises:

determining whether the voice received from the user is the predetermined utterance requested of the user; and

if the voice received from the user is determined not to be the requested utterance, requesting the predetermined utterance from the user again, and if the voice received from the user is determined to be the requested utterance, comparing the voice received from the user with the registered voice.

The method of claim 1,

further comprising registering the user's ID and password, and

receiving the ID and password of a user who wishes to log on to an Internet service of an Internet service provider,

wherein receiving the voice of the user who wishes to log on to the Internet service of the Internet service provider

comprises receiving the user's voice only when the received ID and password match the registered ID and password.

The method of claim 2,

wherein controlling the user's logon to the Internet service of the Internet service provider according to the comparison result stored in the database includes transmitting the comparison result to the Internet service provider to which the user wishes to log on.

A method of controlling a user's logon to an Internet service using voice on the Internet, comprising:

registering the user's voice;

receiving the voice of a user who desires the Internet service;

comparing the received voice with the registered voice; and

controlling the user's logon to the Internet service according to the comparison result.

A system for registering and authenticating a user using voice on the Internet, comprising:

a storage device including a basic information database storing user IDs and passwords, a speaker database storing users' voice characteristics, a voice information database used for voice recognition, and an authentication result database storing authentication results;

a voice recognition module that requests a predetermined utterance from the user, receives a voice from the user, and determines whether the received voice is the requested predetermined utterance; and

a speaker authentication module that, when the voice recognition module determines that the voice received from the user is the requested utterance, determines whether the characteristics of the voice received from the user and of the voice stored in the speaker database have a similarity equal to or exceeding a predetermined threshold, and stores the determination result in the authentication result database.

The system of claim 7,

further comprising a relay module, wherein communication with both the Internet service provider and the user is conducted via the relay module.