KR100456408B1

KR100456408B1 - Search of audio date and sample

Info

Publication number: KR100456408B1
Application number: KR10-2004-0008009A
Authority: KR
Inventors: 박민수
Original assignee: (주)뮤레카
Priority date: 2004-02-06
Filing date: 2004-02-06
Publication date: 2004-11-10

Abstract

The present invention relates to an audio gene generation method and an audio data search method in which audio played on a radio, television, or similar source is received via a wireless communication device such as a mobile phone, an audio gene for audio recognition is generated by using the received audio as a voiceprint, and the audio is recognized using that audio gene.

Description

Audio Gene Generation Method and Audio Data Search Method {SEARCH OF AUDIO DATE AND SAMPLE}

The present invention relates to an audio gene generation method and an audio data search method. More specifically, it relates to methods in which audio played on a radio, television, or similar source is received via a wireless communication device such as a mobile phone, an audio gene for audio recognition is generated by using the received audio as a voiceprint, and the audio is recognized using that audio gene.

There is a growing need for automatic recognition of music and other audio signals originating from various sources. Music copyright holders want evidence of how often their music is broadcast on radio or television so that they can determine how much in royalties they may claim. In addition, as digital formats such as MP3 have spread, listeners who download music files over the Internet want to download the music they heard on radio or television.

Many methods for automatically recognizing audio signals with a computer system have been introduced, but none is currently in use. One of them is "Search method in an audio database" (Korean Patent Laid-Open Publication No. 2003-59085), laid open in the Republic of Korea on July 7, 2003.

That search method (Korean Patent Laid-Open Publication No. 2003-59085) comprises: determining a set of fingerprints at particular locations of a sample; finding matching fingerprints in a database index; generating correspondences between locations in the sample and locations in a file that have equivalent fingerprints; and identifying media files for which a significant number of the correspondences are substantially linearly related. One way of identifying files with a large number of correspondences was to perform the equivalent of scanning for diagonal lines in a scatter plot generated from the pairs of correspondences.

FIG. 8A shows a linear correspondence between the locations in the sample and in the audio data that have the same fingerprints, indicating that the audio data contains the sample. FIG. 8B shows no linear correspondence, indicating that no audio data containing the sample was found.

Because this search method selects audio data for which the correspondences between sample locations and audio data locations with the same fingerprints are linearly related, it could identify audio signals subjected to higher levels of noise and distortion than earlier methods, and could do so in relatively close to real time.

However, the conventional search method must first select every audio data fingerprint that matches a fingerprint of the sample and then, for each item of audio data, check the linearity of the positional relationship between the sample and the audio data having the same fingerprints. As a result, the search was not as fast as originally expected. In other words, time was wasted selecting matching fingerprints unnecessarily and then checking their positional relationships.

In addition, to compare audio data with a sample, the conventional method must store a landmark, that is, a reproducibly computable location, together with each fingerprint. This enlarges the audio data database and lengthens the list built for the sample. The increased capacity and size lower the overall processing speed of the processor, which in turn lowers the search speed.

Furthermore, although the conventional method claims to provide a way to recognize all audio data, it does not specifically describe how to search for an altered audio signal that has undergone conversion or modulation, as happens with a mobile phone or PCS phone.

An object of the present invention, devised to solve the above problems, is to provide an audio gene generation method and an audio data search method that search audio data using fingerprints but do not store fingerprint location information separately, thereby improving the search speed.

Another object of the present invention is to provide an audio gene generation method and an audio data search method that can also recognize audio data of a modulated audio signal, such as one transmitted through a mobile phone or PCS phone.

Another object of the present invention is to provide an audio gene generation method and an audio data search method that minimize the amount of data to be compared or stored, thereby saving storage space and improving the processing speed of the processor.

Another object of the present invention is to provide an audio gene generation method and an audio data search method that improve the search rate, that is, the accuracy of the search. Because the frequencies of the sample to be compared are extracted from overlapping time intervals, the comparison is performed redundantly and can tolerate an error in any single time interval.

FIG. 1 is a conceptual diagram of an audio data search system according to an embodiment of the present invention.

FIG. 2 is an overall flowchart of an audio data search method according to an embodiment of the present invention.

FIG. 3 is a flowchart of the audio sample generation step performed by the mobile phone in FIG. 2.

FIG. 4A is a graph of signal magnitude over the sample time of a sample.

FIG. 4B is a graph of the signal magnitudes of the frequencies contained in a particular time interval of FIG. 4A.

FIG. 4C is a graph in which the signal magnitudes of FIG. 4B are amplified where they are at or above a specific magnitude and attenuated where they are below it.

FIG. 5 is a flowchart of the step of generating a sample audio gene from the sample in FIG. 2.

FIG. 6 is a flowchart of the step of comparing the sample audio gene with the audio data audio genes in FIG. 2.

FIG. 7 is a table of the frequencies of the piano scale notes, which serve as reference frequencies for removing noise.

FIG. 8A is an example, for the conventional search method, showing a linear correspondence between the locations in the sample and in the audio data that have the same fingerprints, indicating that the audio data contains the sample.

FIG. 8B is an example, for the conventional search method, showing no linear correspondence, indicating that no audio data containing the sample was found.

* Explanation of reference numerals for the main parts of the drawings *

10: audio data search system 12: information communication device

14: ARS system 16: sample audio gene generation server

18: audio data storage DB or audio information storage DB

20: search server 22: radio

24: sample storage server S10: audio sample storage step

S20: sample audio gene generation and storage step

S30: audio data DB generation step S40: audio gene search step

S50: search result output step

To achieve the above objects, the present invention provides an audio gene generation method for generating an audio gene from audio data, comprising: a time division step of dividing an audio signal into fixed time intervals; a frequency conversion step of calculating the signal magnitudes of the frequencies contained in each time interval or in a plurality of time intervals; a difference calculation step of dividing the frequency domain into fixed sections and calculating the difference in signal magnitude between adjacent frequency sections; a slope calculation step of obtaining the difference between those calculated values in adjacent time intervals; a quantization step of quantizing the slope to 1 if it is 0 or greater and to 0 if it is less than 0; and an audio gene generation step of generating the audio gene by storing the quantized values.

In another aspect, the present invention provides an audio data search method that compares an audio sample, which stores an audio signal of a certain duration, with a plurality of items of audio data, each of which stores an entire audio signal, to find the audio data identical to the audio sample. The method comprises: a sample audio gene generation step of generating an audio gene of the audio sample using the audio gene generation method of claim 1; an audio data audio gene generation step of generating audio genes of the audio data using the audio gene generation method of claim 1; and a search step of finding the audio data that contains the same audio gene as the sample audio gene.

The search step may comprise: a selection step of selecting a fixed number of sample audio genes from among the sample audio genes; a similar value generation step of generating bit values that are identical to the selected sample audio genes or differ from them by only one bit; an interval search step of searching the audio data audio genes for intervals whose values equal the similar values, and finding audio data audio genes in which those intervals have the same spacing as the selected intervals of the sample audio gene; and a selection step of calculating whether the audio data audio gene that contains the similar values at the same spacing is identical overall to the sample audio gene, and selecting the audio data as identical to the audio sample only if the difference is at or below a given threshold.

The audio sample may be an audio signal transmitted from a wireless communication device, from which noise has been removed by amplifying the components of the input audio signal that are at or above a given signal magnitude and attenuating those below it.

Alternatively, the audio sample may be an audio signal transmitted over a direct connection from an audio device such as a radio or television.

In the search step, the audio genes of the audio data held in the database may be distributed across a distributed system and stored there temporarily, and the sample audio gene may then be compared with the audio genes of the audio data sequentially or simultaneously, which shortens the search time.

The search step may further include, after the selection step, a broadcast count calculation step of storing the number of times each item of audio data is selected and thereby calculating how many times it was broadcast from the audio device.

In this specification, "sample" or "audio sample" means a piece of audio data of arbitrary size obtained from any kind of audio, including sound, voice, music, or a combination of these, whether output from media or audio devices such as radio, television, recorded albums, commercials, MP3 players, and CD players, or originating from indoor or outdoor performances, event music, logo songs, musical rhythms, and the like.

In this specification, "audio data" means any kind of audio data, including sound, voice, music, or a combination of these.

Embodiments of the present invention are described in detail below with reference to the drawings. Techniques and facts well known to those skilled in the art are described only briefly, without detailed explanation.

Embodiment 1: Audio data search system

Referring to FIG. 1, the audio data search system (10) according to the first embodiment of the present invention has an information communication device (12), an ARS system (14), a sample audio gene generation server (16), an audio data storage DB or audio information storage DB (18), and a search server (20) for samples and audio data. The audio data search system (10) also has a radio (22) and a sample storage server (24).

The information communication device (12) includes any terminal that can connect to the ARS server (14) over a communication network, such as a mobile phone, a telephone, a two-way radio, or a terminal equipped with a wired or wireless modem. It transmits audio data, such as music or sound sources output from a radio, a TV, a performance venue, or an event site, to the ARS server (14) over the communication network. The types and formats of audio data include not only those that exist today but also any developed in the future, and the media that output the audio data include not only radio and TV but any medium, performance, event, or recital, online or offline.

When the information communication device (12) requests a connection, the ARS server (14) lets it select a type of service. When audio data is then transmitted from the information communication device (12) according to the selected service, the ARS server (14) stores that audio data initially.

The audio gene generation server (16) is connected to the ARS server (14) through a communication network. It converts the audio data stored on the ARS server (14) into a form suitable for comparison with the audio genes already stored in the audio information storage DB (18), that is, into the same form, and transmits the result to the audio gene search server (20). The audio gene may be an audio gene (Audio DNA) produced by audio fingerprinting of the audio data. When the audio is sampled in an environment where the transmission method of the information communication device (12) introduces noise or distortion, so that the data is modulated or converted, the audio gene may instead use the voiceprint of that modulated or converted audio data. For example, when the information communication device (12) is a mobile phone, the phone distinguishes noise from call audio and filters out the signals it regards as noise, so the audio gene is generated using only the remaining filtered signals as the voiceprint of the audio data. The audio gene generation method is described in detail in Embodiment 2.

The audio information storage DB (18) is a DB in which audio genes generated in advance from music and sound sources are stored. The stored audio genes are generated using voiceprints of audio data that has been modulated or converted according to the type of information communication device (12). That is, when the audio genes to be compared come from audio data that was sampled through the information communication device (12) in an environment with noise or distortion, and was therefore modulated or converted, audio genes based on the voiceprints of correspondingly modulated or converted audio data are generated and stored in advance.

The audio genes stored in the audio information storage DB (18) are classified and stored on the basis of the key, meter, numerical values, and similar properties of the corresponding audio data. These audio genes are preferably generated with the same method used by the audio gene generation server (16), but a different generation method may be used as long as the two sets of audio genes can be compared to find identical ones.

In addition to the audio genes, the audio information storage DB (18) stores the music information corresponding to each audio gene, that is, information related to the audio data such as the song title, singer, lyrics, and advertiser.

The method of generating and storing the audio genes of the audio data held in the audio information storage DB (18) is described in detail in Embodiment 2. The audio information storage DB (18) contains an original-sound audio gene DB (labeled DNA DB in FIG. 1), which stores audio genes generated using the original sound as the voiceprint, and a modulated audio gene DB (labeled fDNA DB in FIG. 1), which stores audio genes generated in advance using, as the voiceprint, the audio signal as modulated when the original sound passes through a mobile phone or similar device. The audio information storage DB is managed by a conventional DBMS (21).

The audio gene search server (20) compares the audio genes stored in advance in the audio information storage DB (18) with the audio genes transmitted from the audio gene generation server (16) and finds the matching audio gene and its associated music information. To search efficiently, the audio gene search server (20) narrows the search range using the key, meter, numerical values, and so on before comparing audio genes directly, and it applies a "tolerance" technique to find audio gene patterns quickly. That is, when the pattern of the audio gene being searched for and the pattern of an audio gene in the audio information storage DB (18) differ by no more than the tolerance, for example 36%, they are regarded as the same audio gene.
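The tolerance comparison described above can be sketched as follows. The source does not state how the 36% error is measured, so this sketch assumes it is the fraction of differing bits between two equal-length audio genes (a normalized Hamming distance). The function names `bit_error_rate` and `same_audio` and the 32-bit word width are illustrative assumptions, not part of the patent.

```python
def bit_error_rate(a, b, bits=32):
    """Fraction of differing bits between two equal-length lists of gene words.

    Assumption: each list element is one `bits`-wide integer word.
    """
    if len(a) != len(b) or not a:
        raise ValueError("genes must be non-empty and of equal length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (bits * len(a))


def same_audio(a, b, tolerance=0.36, bits=32):
    """Treat two genes as the same audio when the error is within tolerance."""
    return bit_error_rate(a, b, bits) <= tolerance
```

A tolerance this loose trades false rejections for false acceptances, which is presumably why the description narrows the candidates by key and meter before comparing genes directly.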

To increase search speed, the audio gene search server (20) may comprise several search servers. The audio genes from the audio information storage DB (18) are divided among the search servers and stored there, and each search server searches, simultaneously or sequentially, for the audio gene identical to the one transmitted from the ARS server (14).

For example, if the audio information storage DB (18) holds 100,000 audio information records and the audio gene search server (20) consists of 10 servers, each search server (20) stores only 10,000 records, and when an audio gene arrives from the ARS server (14) the servers search for it simultaneously or sequentially. This can increase the audio gene search speed by up to a factor of ten.
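The partitioning in this example can be sketched as below. This is a minimal illustration only: threads stand in for the separate search servers, exact equality stands in for the gene comparison, and `partition`, `search_shard`, and `search_all` are hypothetical names that do not appear in the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_servers):
    """Split records into n_servers contiguous shards of near-equal size."""
    size, extra = divmod(len(records), n_servers)
    shards, start = [], 0
    for s in range(n_servers):
        end = start + size + (1 if s < extra else 0)
        shards.append(records[start:end])
        start = end
    return shards


def search_shard(shard, query):
    """Return the IDs of records in one shard whose gene equals the query."""
    return [rec_id for rec_id, gene in shard if gene == query]


def search_all(shards, query):
    """Query every shard concurrently and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        results = pool.map(lambda shard: search_shard(shard, query), shards)
    return [hit for hits in results for hit in hits]


# 100,000 synthetic (id, gene) records spread over 10 shards of 10,000 each.
records = [(i, i * 2654435761 % 2**32) for i in range(100_000)]
shards = partition(records, 10)
```

The tenfold figure is an ideal case: it assumes the shards are searched fully in parallel and ignores the cost of distributing the query and merging results.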

The method by which the audio gene search server (20) compares an audio sample with audio data using the audio genes generated from them is described in more detail in Embodiment 2.

Embodiment 2: Audio data search method

The audio data search method according to the second embodiment of the invention has an audio sample storage step (S10), a sample audio gene generation and storage step (S20), an audio data DB generation step (S30), an audio gene search step (S40), and a search result output step (S50).

Referring to FIGS. 1 to 4, the audio sample storage step (S10) begins by storing, as a sample, the audio signal input through an information communication device, for example a mobile phone (12), for the sample time (3 seconds) using the speech coder of the mobile phone (12), and measuring the magnitude (db) of the audio signal over time as shown in FIG. 4A (S12). For the sample, the magnitude of the audio signal by frequency in the range of 300 Hz to 3 kHz is measured and stored every 11 msec during the 3 seconds (S14).

As shown in FIG. 4B, among the audio signal components by frequency, those whose magnitude is at or above a specific level, for example 500 db, are amplified by a factor of 10 and those below it are attenuated by a factor of 0.1, and the audio signal magnitudes for 256 time intervals are stored (S16). As shown in FIG. 4C, the result of this modulation is an audio signal in which only signals at particular frequencies are present in each 11 msec interval (S18). The converted audio signal is stored as an audio file, for example A.WAV, and transmitted through the mobile phone (12) to the ARS system (14). The audio sample therefore contains, in 11 msec units over 256*11 msec, the audio signal in the 300 Hz to 3 kHz range after amplification or attenuation.
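The amplify-or-attenuate rule of step S16 can be sketched for a single time interval as follows. The threshold of 500 and the factors 10 and 0.1 are the example values from the description; the function name `emphasize` and the list-of-floats representation of a frame are assumptions made for illustration.

```python
def emphasize(magnitudes, threshold=500.0, gain=10.0, cut=0.1):
    """Scale magnitudes at or above threshold by gain and the rest by cut."""
    return [m * gain if m >= threshold else m * cut for m in magnitudes]


# One frame of per-frequency magnitudes (made-up values for illustration).
frame = [120.0, 640.0, 500.0, 30.0]
shaped = emphasize(frame)
```

With these factors, a component just above the threshold ends up 100 times larger than one just below it, which is what leaves only the dominant frequencies visible in each interval.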

Referring to FIGS. 1, 2, and 5, the step of generating a sample audio gene from the audio sample (S20) begins when the sample audio gene generation server (16), which is connected to the ARS system (14), stores the audio sample produced by the audio sample storage step (S10).

For each audio sample, only the frequencies from 750 Hz to 2750 Hz are selected and divided into 33 frequency sections (FI1 to FI33). The audio signal magnitude for each frequency section is stored. As a result, as in Equation 1, 256*33 audio signal magnitudes are stored, corresponding to the product of 256 time intervals (unit time: 11 msec) and 33 frequency sections (unit frequency: 66 Hz) (S21).

$\text{signal magnitude}[i, j] = A_{i,j}$ (Equation 1)

Here i (a natural number, 1 ≤ i ≤ 256) indexes the 256 time intervals, and j (a natural number, 1 ≤ j ≤ 33) indexes the 33 frequency sections.

To generate an audio gene from the audio signal, the difference in signal magnitude between adjacent frequency sections, for example FI1 and FI2, is computed for a particular time interval, for example 11 msec, as in Equation 2. The differences between FI2 and FI3, FI3 and FI4, ..., FI32 and FI33 are computed in the same way. The differences between adjacent frequency sections are likewise computed for the next time interval, 11*2 msec, and so on up to 256*11 msec. In total, 256*32 magnitude difference values are calculated (S22).

$\text{magnitude difference}\ (i = 1) = A_{1,k} - A_{1,k+1}$ (Equation 2)

where k is a natural number from 1 to 32.

As in Equation 3, the difference between the difference values of adjacent time intervals, for example 11 msec and 11*2 msec (hereinafter, the slope), is then calculated; "1" is stored if this value is 0 or greater, and "0" if it is less than 0 (S23 to S26). This process is carried out for all time intervals (S27). Once all slopes have been computed and stored, 32*256 quantized binary values are stored (S28). These quantized binary values are called the audio gene of the sample (audio DNA of sample).

$\text{slope} = (A_{1,k} - A_{1,k+1}) - (A_{2,k} - A_{2,k+1})$ (Equation 3)

where k is a natural number from 1 to 32.
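Steps S22 to S28 can be sketched as follows, assuming the magnitudes are held as a list of time frames, each a list of per-section magnitudes. The function name `audio_gene` and the packing of each time interval's bits into one integer are illustrative assumptions. Note that the description counts 32*256 bits, whereas this sketch emits one word per pair of adjacent frames (one fewer than the number of frames), because the source does not say how the last interval is handled.

```python
def audio_gene(A):
    """Quantize slopes between adjacent time frames into bit words.

    A is a list of frames; each frame is a list of band magnitudes
    (33 bands give 32 bits per word). Returns len(A) - 1 integers.
    """
    genes = []
    for cur, nxt in zip(A, A[1:]):
        word = 0
        for k in range(len(cur) - 1):
            d_cur = cur[k] - cur[k + 1]   # Equation 2 at time interval i
            d_nxt = nxt[k] - nxt[k + 1]   # Equation 2 at time interval i + 1
            slope = d_cur - d_nxt         # Equation 3
            word = (word << 1) | (1 if slope >= 0 else 0)
        genes.append(word)
    return genes
```

Because only the sign of each slope is kept, a uniform change in loudness across frames and bands leaves the gene unchanged, which is one plausible reason for differencing in both frequency and time.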

Referring again to FIGS. 1 and 2, the audio genes of the audio data stored in the audio data DB (18) are generated (S30) by the same method as the sample audio gene. The difference is that the sample time of a sample is roughly 3 seconds, whereas for audio data the audio gene is generated over the entire audio data. The audio gene of the entire audio data is stored in the audio data DB (18) and managed by the DBMS (21).

Referring to FIGS. 1, 2 and 6, the search step (S40) compares the sample audio gene, generated by the sample audio gene generation server (20) from the sample received through the mobile phone (12), with the audio genes of the audio data stored in the audio data DB, in order to find an audio gene that contains the sample audio gene. This step is performed simultaneously or sequentially on ten distributed search servers (20), each of which temporarily stores audio genes of audio data distributed from the audio data DB (18).
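The simultaneous search over several servers can be pictured with an in-process stand-in. The sketch below is only an illustration under stated assumptions: each "server" is modeled as a dictionary shard mapping a hypothetical track identifier to its audio gene, threads stand in for separate machines, and `matcher` is a placeholder for whatever comparison the search step uses.

```python
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, sample_gene, matcher):
    """Sequentially scan one shard; return the first matching track id."""
    for track_id, track_gene in shard.items():
        if matcher(sample_gene, track_gene):
            return track_id
    return None


def distributed_search(shards, sample_gene, matcher):
    """Scan all shards concurrently and return one matching track id, if any."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        results = pool.map(
            lambda shard: search_shard(shard, sample_gene, matcher), shards
        )
        hits = [r for r in results if r is not None]
    return hits[0] if hits else None
```

A sequential variant would simply loop over `shards` and call `search_shard` on each in turn.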

The audio gene search step (S40) comprises: a selection step (S42) of selecting a predetermined number of sample audio genes from the sample audio gene; a similar-value generation step (S44) of generating bit values that are identical to each selected sample audio gene or differ from it by only one bit; a section search step (S46) of searching the audio data audio genes for sections having the same value as a similar value, keeping only those audio data audio genes in which the matches are spaced at the same interval as the selected sections of the sample audio gene; and a final selection step (S48) of calculating whether the audio data audio gene that contains the similar values at the same interval is identical as a whole to the sample audio gene, and selecting the audio data as identical to the audio sample only when the difference is at or below a predetermined threshold.

In the selection and similar-value generation steps (S42, S44), the 32-bit audio gene of each of six time slots of the sample audio gene is taken, namely those at 11 msec, 11*50 msec, 11*100 msec, 11*150 msec, 11*200 msec and 11*250 msec. For each of them, the bit values that are identical to it or differ from it by only one bit are generated, and audio data audio genes having the same bit values are searched for on each of the distributed search servers (11).
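As an illustration of the similar-value generation, a 32-bit word has exactly 33 values within one bit of it: the word itself plus one value per flipped bit. The sketch below assumes each time slot's audio gene is held as a Python integer and that slots are numbered from 1 as in the text; the function names are hypothetical.

```python
def similar_values(word, bits=32):
    """Return the word itself followed by every value differing in one bit."""
    return [word] + [word ^ (1 << b) for b in range(bits)]


def select_anchor_words(sample_gene, slots=(1, 50, 100, 150, 200, 250)):
    """Pick the words at the given 1-based time slots of the sample gene.

    Returns (zero_based_offset, word) pairs for the selected slots.
    """
    return [(slot - 1, sample_gene[slot - 1]) for slot in slots]
```

For a 3-bit word, `similar_values(0b101, bits=3)` returns `[5, 4, 7, 1]`.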

In the section search step (S46), among the audio data audio genes whose bit values equal one of the values that are identical to, or one bit different from, the 32-bit values of the six time slots, a section is sought in which the matching audio genes keep the 11*50 msec spacing. If no such section exists, the audio genes of other audio data are searched in turn.
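The spacing check can be sketched as a scan over candidate start positions. This is a deliberately simple linear scan chosen for clarity; the text does not say how the lookup is organized, so an indexed lookup of the similar values would be an equally valid reading. The function names and the representation of genes as lists of integers are assumptions.

```python
def bit_difference(a, b):
    """Number of differing bits between two words."""
    return bin(a ^ b).count("1")


def find_candidate_sections(track_gene, sample_gene, offsets, max_bit_diff=1):
    """Return start positions in track_gene where every anchor word matches.

    offsets are zero-based positions inside sample_gene. A start position p
    qualifies when track_gene[p + o] is within max_bit_diff bits of
    sample_gene[o] for every offset o, so the anchors keep the same spacing
    in the track as in the sample.
    """
    span = len(sample_gene)
    starts = []
    for p in range(len(track_gene) - span + 1):
        if all(
            bit_difference(track_gene[p + o], sample_gene[o]) <= max_bit_diff
            for o in offsets
        ):
            starts.append(p)
    return starts
```

Each returned start position marks an expected section that still has to pass the full comparison of the final selection step.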

In the final selection step (S48), once a section has been found, the entire sample audio gene is compared with the audio gene of the expected section of the audio data and the BER (bit error rate) is calculated. If the BER between all 256 sample audio genes and the audio genes of the expected section is at or below a fixed value, for example 32%, the audio gene of the audio data is judged to contain the sample audio gene and the search on the distributed servers ends. If the BER exceeds 32%, the same search is carried out in turn on other audio data.
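The BER comparison can be written in a few lines. The sketch assumes both genes are lists of integers of equal length and `bits` wide per word; the 32% threshold is the example value from the text, and the function names are hypothetical.

```python
def bit_error_rate(sample_gene, section_gene, bits=32):
    """Fraction of differing bits between two equally long genes."""
    if len(sample_gene) != len(section_gene):
        raise ValueError("genes must have the same number of words")
    errors = sum(bin(a ^ b).count("1") for a, b in zip(sample_gene, section_gene))
    return errors / (len(sample_gene) * bits)


def is_match(sample_gene, track_gene, start, threshold=0.32, bits=32):
    """Compare the whole sample with the expected section starting at start."""
    section = track_gene[start:start + len(sample_gene)]
    return bit_error_rate(sample_gene, section, bits) <= threshold
```

For 256 words of 32 bits the denominator is 8192 bits, so a 32% threshold tolerates up to 2621 differing bits.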

Referring again to FIGS. 1 and 2, information on the audio data that contains the sample audio gene is output using the audio information stored in the audio data DB (18). Once the audio data identical to the sample has been found and the result output, the entire procedure is complete.

Embodiment 3: Radio Sample Collection

If the sample storage server (24) can capture and store the audio sample directly from the radio (22), the music can be stored as the original sound, so a sample audio gene is generated from that sound and sent to the search server (20). The search server (20) performs the comparison search using audio genes, having temporarily stored the audio genes of the audio data that were generated from the original sound of the music held in the audio data DB (18). The sample audio gene and the audio data audio genes are compared by the same method as described above to determine the audio data that contains the sample audio gene. The series of steps ends when the search result is output.

Embodiment 4: Noise Removal

The present invention provides a method of removing the noise contained in an audio sample after the audio sample of the modulated audio signal has been transmitted from the mobile phone (12) described in Embodiments 1 and 2 and stored in the ARS system (14), and before the sample audio gene is generated from the audio sample.

As described in Embodiments 1 and 2, the audio sample transmitted from the mobile phone (12) and stored in the ARS system (14) is an audio signal that has already been amplified or attenuated in the mobile phone (12) according to a certain signal magnitude and modulated into the frequency range of 300 Hz to 3 kHz. However, this audio sample contains not only the audio signal extracted from audio data such as music or a sound source but also unwanted noise.

One method of removing the noise contained in this audio sample comprises (1) a non-existent frequency removal step of removing signals at or below a certain magnitude in the frequency domain of the audio sample, and (2) a meaningful frequency selection step of selecting, from the remaining frequencies of the audio sample, only the meaningful ones, that is, the frequencies of the audio signal extracted from audio data such as music or a sound source.

In the non-existent frequency removal step, frequency-domain signals of the audio sample whose magnitude is at or below a certain level are regarded as noise produced by interference with the audio signal and are removed. Some noise has already been removed because the mobile phone (12) amplifies and attenuates on the basis of signal magnitude, but noise is removed once more on the basis of signal magnitude.

Because an ordinary mobile phone (12) is designed primarily for voice calls, and a certain amount of noise does not interfere with a call, the transmitted signal inevitably contains a considerable amount of noise. In the present invention, noise is therefore removed once more on the basis of signal magnitude for the purpose of audio data retrieval. However, if the mobile phone (12) already removes, or can remove, noise on the basis of signal magnitude to the degree needed to practice the invention, the non-existent frequency removal step may be omitted.

The meaningful frequency selection step relies on the fact that the audio data handled by the invention is mainly audio extracted from music or sound sources: frequencies that do not occur in music or sound sources are regarded as noise and removed even when their signal magnitude is at or above the reference value. One way of selecting the frequencies that occur in music or sound sources is to take the frequencies of the 96 notes of the piano scale as references and keep only signals that fall within a certain range of one of those note frequencies.

As can be seen from FIG. 7, the piano scale has 96 notes, each with a specific frequency. For example, the lowest note, C1, is 32.70 Hz and the highest note, B8, is 7,900.88 Hz. Audio data extracted from music or sound sources therefore necessarily consists of audio signals whose frequencies equal the piano note frequencies or lie within a certain range of them.

A person of ordinary skill in the art can readily set the allowable frequency range around each piano note frequency. For example, if the allowable range is set to 10% of the note frequency, the frequency band for the C3 note is 130.80 ± 13.08 Hz.

Because only audio signals within the allowable range of a piano note frequency are selected from the audio sample transmitted from the mobile phone (12) and stored in the ARS system (14), most noise, such as voices, traffic and wind, can be removed.
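The two noise-removal steps of this embodiment can be sketched together. The example is illustrative only and rests on several assumptions: the spectrum is given as (frequency in Hz, magnitude) pairs, the 96 note frequencies are computed from twelve-tone equal temperament with A4 = 440 Hz (so they may differ slightly from the values tabulated in FIG. 7), and the tolerance is relative to each note frequency. All function and parameter names are hypothetical.

```python
def piano_frequencies(a4=440.0):
    """Frequencies of the 96 notes C1..B8 (MIDI notes 24..119), equal temperament."""
    return [a4 * 2 ** ((midi - 69) / 12) for midi in range(24, 120)]


def remove_noise(spectrum, min_magnitude, tolerance=0.10, notes=None):
    """Keep only components that are strong enough and close to a piano note.

    spectrum: iterable of (frequency_hz, magnitude) pairs (assumed layout).
    """
    if notes is None:
        notes = piano_frequencies()
    kept = []
    for freq, magnitude in spectrum:
        # Step 1: drop components at or below the magnitude threshold.
        if magnitude <= min_magnitude:
            continue
        # Step 2: keep the component only if it lies within the allowed
        # range of at least one note frequency.
        if any(abs(freq - note) <= tolerance * note for note in notes):
            kept.append((freq, magnitude))
    return kept
```

Under equal temperament, neighboring notes are about 5.9% apart, so once the relative tolerance reaches roughly 2.9% the bands of adjacent notes touch and no frequency between C1 and B8 is rejected by the second step. A narrower tolerance is needed if this step is meant to discriminate inside that range.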

Embodiment 5: Method of Using the Selected Audio Data Information

The present invention can provide, via mobile phone or wired/wireless Internet, the audio data information selected by the systems and methods described in Embodiments 1 to 3, for example music information such as the song title, lyrics and singer, as well as additional services that use the retrieved music, such as mobile phone ringtones, call-waiting tones and background music for voice messages.

In addition, the present invention may be used to determine music copyright royalties: the number of times a piece of music is broadcast on TV, radio or the like is tallied and recorded, and this broadcast information is provided over a communication network to broadcasters, record sales companies, record producers' associations and the like.

The present invention may also be used in the advertising method using audio data recognition described in Patent Application No. 2004-1246 (title of invention: Advertising Method Using Audio Data Recognition), previously filed by the present inventor, in which audio data such as music is transmitted via a mobile phone or the like in order to take part in an event such as a prize draw, thereby maximizing the effect of advertisements that contain audio data.

The operation of a further embodiment of the present invention described above is explained in detail below.

Referring to FIGS. 1 to 7, a user who hears music on the radio (22) and wants information about it calls the ARS system (14) from the mobile phone (12). The ARS system (14) instructs the user to hold the mobile phone toward the radio playing the music for about 5 seconds. The mobile phone (12), using its built-in voice coder (vocoder or speech codec), divides the incoming music and noise into time slots and frequency bands, amplifies components whose signal magnitude is 500 dB or more, attenuates those below that level, and transmits the result to the ARS system (14).

The audio gene generation server (14) calculates the slopes of the modulated audio signal, quantizes them, and stores the sample audio gene as 256 values of 32 bits each. The stored sample audio gene is sent in the form of a wav file to the distributed search servers (20), which simultaneously search the audio genes of the audio data in the audio data DB or audio information DB (18) for one that contains the same audio gene.

For the search, six element values of the sample audio gene are selected at intervals of 11*50 msec, and each search server searches the audio genes of the audio data for element values that are identical to these or differ from them by one bit. Of the audio genes in which the six element values are found, only those in which the matches keep the 11*50 msec spacing are selected. The sections of the selected audio genes are then compared with the entire sample audio gene and the difference is stored as a BER. If the BER is 32% or less, the audio genes are judged to be identical; otherwise they are judged not to be identical.

When an audio gene identical to the sample audio gene is determined to exist, the search server (20) sends the result to the DBMS (21), and the DBMS (21) retrieves the audio information of the corresponding audio data from the audio data DB (18), such as the song title, singer, score and lyrics, which can then be sent to the mobile phone by SMS.

Similarly, when music is stored through a direct connection to the radio (22), the original sound is stored as a WAV file without modulation, and the search server (20) searches in the same way as for the mobile phone (12). The difference is that, because the sample audio gene is generated from the original sound, the audio genes of the audio data stored in the audio data DB (18) are also generated from the original sound and stored. Once the audio data identical to the sample has been found, not only its audio information but also information on the audio output by the radio, such as music or advertisements, can be passed to the copyright holder of the original music or to the advertiser, and used to calculate copyright royalties or advertising fees.

The present invention has been described with reference to the above embodiments, but it is not limited to them.

In the above embodiments, the numerical values given as examples, such as the 11 msec time slot and the 750 to 2750 Hz range, do not limit the invention and may be varied according to the user's requirements, such as the type or form of the input audio signal, the desired processing speed and the search accuracy.

In the above embodiments, the audio gene was described as being generated by extracting frequencies from a single time slot of the audio sample, but it may also be generated by extracting frequencies from several overlapping time slots. Because the frequencies are extracted from overlapping time slots, the presence of a signal at each frequency can be compared and searched repeatedly.

In the above embodiments, audio signals of all frequencies were described as being extracted from one time slot or from several overlapping time slots, with the audio gene generated from all of the extracted signals. However, the audio signals may be extracted for specific frequencies only, and the audio gene generated from those alone. Because the audio signals to be searched use only the notes of particular Eastern and Western scales, that is, audio signals at specific frequencies, extracting the audio signal at those frequencies only causes no problem for the comparison search of audio data and can greatly shorten the search time.

In the above embodiments, the components were described as being divided among several servers, but the functions of those servers may be performed by several modules or programs on a single server, and functions performed by a single server may be divided among several servers. For example, the audio gene generation server and the audio gene search server were described as separate, but a single server may perform both functions.

The audio gene generation method and audio data retrieval method according to the present invention retrieve audio data using fingerprints but do not store the position information of the fingerprints separately, which improves the speed of audio data retrieval.

The audio gene generation method and audio data retrieval method according to the present invention can also recognize audio data from a modulated audio signal, such as one from a mobile phone or PCS phone.

The audio gene generation method and audio data retrieval method according to the present invention also minimize the amount of data to be compared or stored, which saves storage space and improves the processing speed of the processor.

Finally, because the audio gene generation method and audio data retrieval method according to the present invention extract frequencies from overlapping time slots of the sample being compared, the comparison search still covers the affected portion through the overlap even if an error occurs in one time slot, which improves the retrieval rate, that is, the accuracy of the search.

Claims

An audio gene generation method for generating an audio gene from audio data,

A time division step of dividing the audio signal at predetermined time intervals;

A frequency conversion step of calculating the signal magnitude of the frequencies contained in each time interval or in a plurality of overlapping time intervals;

A difference calculation step of dividing the frequency domain into predetermined sections and calculating the difference in signal magnitude between adjacent frequency sections;

A slope calculation step of obtaining the difference of said calculated values between adjacent time intervals;

A quantization step of quantizing the slope to 1 when it is greater than or equal to 0, and to 0 when it is less than 0;

And generating an audio gene by storing the quantized values.

An audio data search method for searching for audio data identical to an audio sample by comparing an audio sample, in which an audio signal of a predetermined duration is stored, with a plurality of audio data in which entire audio signals are stored, the method comprising:

A sample audio gene generation step of generating an audio gene of an audio sample using the audio gene generation method of claim 1;

An audio data audio gene generation step of generating an audio gene of the audio data using the audio gene generation method of claim 1;

And a search step of searching for audio data including the same audio gene as the sample audio gene.

The method of claim 2,

The search step comprises:

A selection step of selecting a predetermined number of sample audio genes from the sample audio gene;

A similar-value generation step of generating bit values that are the same as each selected sample audio gene or differ from it by only one bit;

A section search step of searching the audio data audio genes for sections having the same value as a similar value, and keeping the audio data audio genes in which those sections are spaced at the same interval as the selected sections of the sample audio gene;

And a final selection step of calculating whether the audio data audio gene containing the similar values at the same interval is identical as a whole to the sample audio gene, and selecting the audio data as identical to the audio sample only when the difference is at or below a predetermined threshold.

The method of claim 3, wherein

The audio sample is an audio signal transmitted from a wireless communication device, in which noise has been removed by amplifying the components of the audio signal input to the wireless communication device that are at or above a predetermined signal magnitude and attenuating those below it.

The method of claim 4, wherein

The audio sample is one from which noise at frequencies outside the piano scale frequencies, or outside the allowable range around those frequencies, has been removed.

The method of claim 3, wherein

The audio sample is an audio signal transmitted over a direct connection from an audio device such as a radio or television.

The method according to any one of claims 1 to 6,

In the searching step, the audio genes of the audio data held in the database in which the audio data is stored are distributed to and temporarily stored in a distributed system, and the sample audio gene is compared with the audio genes of the audio data sequentially or simultaneously, thereby shortening the search time.

The method of claim 7, wherein

The searching step further includes, after the selecting step, a broadcast counting step of calculating the number of broadcasts from the audio device by storing the number of times each audio data is selected.