KR20160100224A - Method and device for constructing audio fingerprint database and searching audio fingerprint

Info

Publication number: KR20160100224A
Application number: KR1020160001876A
Authority: KR
Inventors: 레이 왕
Original assignee: 레이 왕
Priority date: 2015-02-13
Filing date: 2016-01-07
Publication date: 2016-08-23
Also published as: CN104636474A; KR20160100218A; DE102015015827A1

Abstract

A method for constructing an audio fingerprint database includes: extracting audio fingerprints and key values of the audio fingerprints; establishing a correspondence between each key value and the audio fingerprints that share that key value; assigning each key value and its corresponding audio fingerprints to a server; and constructing the audio fingerprint database by establishing a correspondence between the key value and the server. By distributing the key values and their associated audio fingerprints evenly across the servers, each search request is computed only on the servers that hold the relevant key values, which maximizes resource utilization. In addition, the access frequency of each key on each server is counted while the system is running and is used to optimize load balancing, improving the throughput of the system.

Description

The present invention relates to the field of information technology, and more particularly to a method for constructing an audio fingerprint database and to a method and apparatus for searching audio fingerprints.

Audio fingerprint recognition is both data-intensive and computation-intensive. To ensure a usable recognition rate, the audio fingerprint database must cover a sufficiently large number of audio files, typically on the order of ten million. The search algorithm is also computationally heavy, so to keep searches fast the entire audio fingerprint database is usually held in memory during computation.

Audio fingerprint recognition therefore requires a large memory capacity (on the order of terabytes) that a single server cannot currently provide. In the prior art, the music library is commonly split up, a separate audio fingerprint database is built on each of many servers, and every search is sent to all of the servers. This results in unnecessary searching and wasted resources. How to build a massive audio fingerprint database system with optimal resource utilization has therefore become an open problem in the industry.

Embodiments of the present invention aim to build a massive audio fingerprint database system with optimal resource utilization.

According to an aspect of the present invention, there is provided a method for constructing an audio fingerprint database, the method comprising: extracting audio fingerprints and key values of the audio fingerprints; establishing a correspondence between each key value and the audio fingerprints having that key value; assigning each key value and its corresponding audio fingerprints to a server; and constructing the audio fingerprint database by establishing a correspondence between the key value and the server.

The method and apparatus for constructing an audio fingerprint database and searching audio fingerprints of the present invention distribute the key values and their associated audio fingerprints evenly across the servers, so that each search request is computed only on the servers assigned to the relevant key values, which maximizes resource utilization. In addition, the access frequency of each key on each server is counted during the operating period of the system and used to optimize load balancing, improving the throughput of the system.

FIG. 1 is a flowchart illustrating the method for constructing an audio fingerprint database of the present invention.
FIG. 2 is a flowchart illustrating the audio fingerprint search method in a server according to the present invention.
FIG. 3 is a block diagram illustrating the assignment of key values and the audio fingerprints associated with the key values.
FIG. 4 is a diagram showing the structure of a hash table.
FIG. 5 is a diagram showing a structure in which key values are distributed in a server.
FIG. 6 is a block diagram illustrating an apparatus for constructing an audio fingerprint database according to the present invention.
FIG. 7 is a block diagram illustrating an audio fingerprint search apparatus in a server according to the present invention.

FIG. 1 is a flowchart illustrating the method for constructing an audio fingerprint database of the present invention. The flowchart shown in FIG. 1 is for illustrative purposes only: the steps described herein may be performed in a different order or in parallel, steps may be omitted, and other steps may be added. As shown in FIG. 1, the method for constructing an audio fingerprint database includes: extracting (102) audio fingerprints and key values of the audio fingerprints; establishing (104) a correspondence between each key value and the audio fingerprints having that key value; assigning (106) each key value and its corresponding audio fingerprints to a server; and constructing (108) the audio fingerprint database by establishing a correspondence between the key value and the server.

In step 102, audio fingerprints and key values of the audio fingerprints are extracted. That is, a large amount of audio is first collected, audio fingerprints are then extracted from that audio, and a key value is determined for each fingerprint.

In step 104, a correspondence is established between each key value and the audio fingerprints having that key value, and an index list containing each key value and its corresponding audio fingerprints is built. Audio fingerprints with the same key value are listed together under that key value, so that a definite correspondence exists between the audio fingerprints and the key value. Given an audio fingerprint, its key value can then be obtained by looking the fingerprint up.

In step 106, each key value and its corresponding audio fingerprints are assigned to a server, and in step 108 the audio fingerprint database is constructed by establishing a correspondence between the key value and the server. Because the number of key values is very large, the key values must be distributed among the servers, and the servers may each hold a different number of key values.

In the audio fingerprint storage method of the present invention, establishing the correspondence between a key value and the audio fingerprints having that key value includes classifying audio fingerprints having the same key value into the same array. Grouping the fingerprints in this way establishes the correspondence between the key value and its audio fingerprints, and allows the key value to be obtained from an audio fingerprint. Each array contains two elements: a key value and the audio fingerprints having that key value.
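As a minimal sketch of this grouping, the following Python snippet collects fingerprints that share a key value into one array. The fingerprint representation (a pair of a key value and an opaque payload string) and the function name `build_index` are assumptions made for illustration; the source does not specify how fingerprints or key values are encoded.

```python
from collections import defaultdict

def build_index(fingerprints):
    """Group (key_value, fingerprint) pairs into one array per key value."""
    index = defaultdict(list)
    for key_value, fingerprint in fingerprints:
        index[key_value].append(fingerprint)
    return dict(index)

# Hypothetical fingerprints: (key value, opaque fingerprint payload).
sample = [(7, "fp-a"), (3, "fp-b"), (7, "fp-c")]
index = build_index(sample)
print(index)  # {7: ['fp-a', 'fp-c'], 3: ['fp-b']}
```

Each dictionary entry plays the role of one array: the dictionary key is the key value and the list holds the audio fingerprints that share it.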

As shown in FIG. 1, the audio fingerprint storage method further includes: calculating (110) the number of searches of each array; and allocating (112) the arrays to the servers according to the number of searches so as to minimize the load difference between the servers.

In step 110, the number of searches of each array is calculated. This includes: calculating the search frequency of the key value of the array; determining the number of audio fingerprints in the array; and calculating the product of the search frequency and the number of audio fingerprints to obtain the number of searches of the array. Because each array contains two elements, a key value and audio fingerprints, these two quantities are available for every array, and their product gives the number of searches of the array.

In step 112, the arrays are allocated to the servers according to the number of searches so as to minimize the load difference between the servers. Arrays can be allocated to servers in many ways: randomly, manually and statically, or by a particular algorithm. For example, the load required by each array can be calculated first, and several arrays can then be allocated to the same server at specific time intervals so that a relatively even load is achieved.

According to one embodiment of the present invention, the arrays are preferably allocated to the servers by a greedy algorithm so as to minimize the load difference between the servers, that is, to balance the load among the servers. A greedy algorithm always makes the choice that looks best at the current step while solving a problem; it does not consider overall optimality, and each choice is only locally optimal.

Hereinafter, the step of allocating the arrays to the servers using the greedy algorithm to minimize the load difference between the servers is described by way of a detailed example. Suppose there are six key values: key 1, key 2, key 3, key 4, key 5, and key 6. The search frequencies of key 1 through key 6 are 10, 20, 30, 40, 50, and 60, respectively, and the numbers of audio fingerprints under key 1 through key 6 are likewise 10, 20, 30, 40, 50, and 60. Key 1 and its audio fingerprints are placed in the first array, key 2 and its audio fingerprints in the second array, key 3 and its audio fingerprints in the third array, key 4 and its audio fingerprints in the fourth array, key 5 and its audio fingerprints in the fifth array, and key 6 and its audio fingerprints in the sixth array. Calculating the product of the search frequency and the number of audio fingerprints gives the number of searches of each array: 100 for the first array, 400 for the second, 900 for the third, 1600 for the fourth, 2500 for the fifth, and 3600 for the sixth.

If there are only three servers, the greedy algorithm assigns the first array, which has the smallest number of searches, and the sixth array, which has the largest, to the first server; the second array, which has the second smallest number of searches, and the fifth array, which has the second largest, to the second server; and the third and fourth arrays, which have intermediate numbers of searches, to the third server.

If there are only two servers, the first, fifth, and sixth arrays are assigned to the first server, and the second, third, and fourth arrays are assigned to the second server. In this way, the greedy algorithm is used to minimize the load difference between the servers. The above example is given for the purpose of explanation and not of limitation, and it may be modified or partly omitted without departing from the scope of the invention. For example, in practical applications the number of searches may range from ten to one billion; the above embodiment only illustrates the overall concept of the present invention and is not limited to specific quantities.
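The arithmetic of the worked example can be checked with a few lines of Python. The snippet only recomputes the products and sums the per-server loads for the assignments stated in the text; the variable and function names are illustrative and do not come from the source.

```python
# Values taken from the worked example: key i has search frequency 10*i
# and 10*i audio fingerprints.
search_frequency = {i: 10 * i for i in range(1, 7)}
fingerprint_count = {i: 10 * i for i in range(1, 7)}

# Number of searches of array i = search frequency x number of fingerprints.
array_searches = {
    i: search_frequency[i] * fingerprint_count[i] for i in range(1, 7)
}

def server_loads(assignment):
    """Sum the number of searches of the arrays placed on each server."""
    return [sum(array_searches[i] for i in arrays) for arrays in assignment]

three_servers = [[1, 6], [2, 5], [3, 4]]  # assignment stated in the text
two_servers = [[1, 5, 6], [2, 3, 4]]      # assignment stated in the text

print(array_searches)
# {1: 100, 2: 400, 3: 900, 4: 1600, 5: 2500, 6: 3600}
print(server_loads(three_servers))  # [3700, 2900, 2500]
print(server_loads(two_servers))    # [6200, 2900]
```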

Although a load balancing method has been described above, it should be noted that the present invention is not limited to any specific algorithm. Methods that already exist or are yet to be developed, and that can achieve load balancing among the servers, also fall within the claimed scope of the invention.

Load balancing can be static or dynamic. Static balancing first determines the load of each array, balances the load across the servers, and then keeps the correspondence between servers and arrays fixed. Dynamic balancing recalculates the load of all arrays at a specified time interval and reallocates the arrays to the appropriate servers, which allows the servers to maintain high efficiency over long periods.

According to a technical solution of the audio fingerprint storage method of the present invention, the following specific embodiments are disclosed:

(1) To construct the audio fingerprint index database, as much audio data as possible is collected and audio fingerprint data is extracted from it. As shown in FIG. 3, the audio fingerprints of each key value are stored in one file, giving a total of n arrays, which are stored on the hard disk.

(2) Assume there are k servers, node_1, node_2, ..., node_k, whose initial data quantity, i.e., the number of audio fingerprints held by each server, is num_i = 0, and n keys key_i (the i-th key), where the data quantity of key_i, i.e., its number of audio fingerprints, is valueNum_i. First, key_1, key_2, ..., key_k are placed in node_1, node_2, ..., node_k, respectively, and the data quantity of each server is updated so that num_i = valueNum_i. Then, for each key_j (j = k + 1, ..., n), the node_i with the smallest num_i is selected, key_j is placed in node_i, and num_i is updated so that num_i = num_i + valueNum_j. Finally, as shown in FIG. 4, a hash table is built with key_i as the key and the server node_j in which it was placed as the value, so that the server holding an audio fingerprint can be located quickly from its key value.
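Step (2) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation of the invention: keys and nodes are identified by 0-based indices, a min-heap from the standard `heapq` module is used to find the node with the smallest num_i, ties are broken in favor of the lower node index, and the function name `assign_keys` is hypothetical.

```python
import heapq

def assign_keys(value_num, k):
    """Assign keys to k nodes as in step (2); returns (hash_table, num).

    value_num[i] is the number of audio fingerprints of key i.
    hash_table maps key index -> node index; num[i] is the load of node i.
    """
    n = len(value_num)
    hash_table = {}
    num = [0] * k
    # The first k keys are placed on the k nodes one-to-one.
    for i in range(min(k, n)):
        hash_table[i] = i
        num[i] = value_num[i]
    # Min-heap of (load, node) so the least-loaded node is found quickly.
    heap = [(num[i], i) for i in range(k)]
    heapq.heapify(heap)
    # Each remaining key goes to the node that currently holds the least data.
    for j in range(k, n):
        load, node = heapq.heappop(heap)
        hash_table[j] = node
        num[node] = load + value_num[j]
        heapq.heappush(heap, (num[node], node))
    return hash_table, num

table, loads = assign_keys([10, 20, 30, 40, 50, 60], k=3)
print(table)  # {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2}
print(loads)  # [50, 70, 90]
```

The returned dictionary corresponds to the hash table of FIG. 4: a lookup by key value yields the node that stores the audio fingerprints of that key.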

(3) The frequency of each key_i is initialized so that keyf_i = 0 (i = 1, ..., n). After a search request is received, the audio fingerprints and their corresponding key values are extracted; if there are m arrays in total, they are stored as feature_i (i = 1, ..., m). Using the hash table from step (2), the server node node_k for the key value of each feature_i (stored as key_i) is looked up, and feature_i is sent to node_k to be searched. After the intermediate results for all feature_i have been received and aggregated, a final analysis is performed using the search algorithm and the final result is returned. The frequency keyf_i of each key_i is then updated according to the number of searches for the key value key_i.

(4) The search frequency of the key_i value of each array is calculated, the number of audio fingerprints in each array is determined, and the product of the two is calculated to obtain the number of searches of each array. After the system has run for a set period of time, total_i = valueNum_i * keyf_i is computed for the array corresponding to each key_i, which gives the number of searches of each array. The arrays corresponding to the key_i values can then be assigned to the servers by the greedy algorithm according to total_i, so that the load difference between the servers is minimized.
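A possible sketch of step (4) is shown below. It computes total_i = valueNum_i * keyf_i for each key and then reassigns the arrays greedily. The order in which arrays are visited is an assumption: this sketch visits them from the largest total_i to the smallest and always picks the currently least-loaded server, which is one common greedy heuristic. The source does not spell out the visiting order, so the resulting assignment need not match the pairing given in the worked example. The function name `rebalance` and the dictionary-based inputs are hypothetical.

```python
import heapq

def rebalance(value_num, keyf, k):
    """Reassign arrays to k servers using total_i = value_num[i] * keyf[i]."""
    total = {key: value_num[key] * keyf[key] for key in value_num}
    heap = [(0, node) for node in range(k)]  # (current load, node index)
    heapq.heapify(heap)
    table = {}
    # Assumption: visit arrays from heaviest to lightest.
    for key in sorted(total, key=total.get, reverse=True):
        load, node = heapq.heappop(heap)     # least-loaded server so far
        table[key] = node
        heapq.heappush(heap, (load + total[key], node))
    loads = [0] * k
    for key, node in table.items():
        loads[node] += total[key]
    return table, loads

# Same numbers as the worked example: frequency and count are both 10*i.
value_num = {f"key{i}": 10 * i for i in range(1, 7)}
keyf = {f"key{i}": 10 * i for i in range(1, 7)}
table, loads = rebalance(value_num, keyf, k=3)
print(table)
print(loads)  # [3600, 2900, 2600]
```

Running the same function periodically with freshly counted keyf values would correspond to the dynamic balancing described earlier.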

FIG. 2 is a flowchart illustrating the method of searching for an audio fingerprint in a server according to the present invention. The flowchart shown in FIG. 2 is for illustrative purposes only: the steps described herein may be performed in a different order or in parallel, steps may be omitted, and other steps may be added. As shown in FIG. 2, the method of searching for an audio fingerprint in a server includes: extracting (202) a search target audio fingerprint and a search target key value of the search target audio fingerprint; looking up (204) the server corresponding to the search target key value; and retrieving (206) the audio fingerprint corresponding to the search target audio fingerprint in that server. The server stores audio fingerprints and the key values corresponding to the audio fingerprints, and each key value corresponds to a server.

In step 202, the search target audio fingerprint and its search target key value are extracted. To look up information about a piece of audio, the search target audio fingerprint and its key value must be obtained first. There need not be only one of each: depending on the audio, a plurality of search target audio fingerprints and their key values may be obtained.

In step 204, the server corresponding to the search target key value is looked up. Because the audio fingerprints and their key values are stored on servers and each key value corresponds to a server, the server is found from the search target key value. During the search, only the server corresponding to that key value is searched.

In step 206, the audio fingerprints corresponding to the search target audio fingerprint are retrieved in the server. The key value matching the search target key value is located in the server found in step 204, and the audio fingerprints corresponding to the search target audio fingerprint are retrieved under that key value.

As shown in FIG. 2, the method for searching for an audio fingerprint in a server further includes generating (208) audio information using the retrieved audio fingerprints. The audio information includes the name, the author, and other information about the audio. The audio being searched yields a plurality of search target audio fingerprints and their key values; the corresponding audio fingerprints are retrieved in the servers, and the retrieved audio fingerprints are aggregated and analyzed so that the audio information can be obtained.

According to a technical solution of the audio fingerprint search method of the present invention, the following specific embodiments are disclosed:

(5) When audio is to be searched, after the search request is received, the search target audio fingerprints of the search target audio and the search target key values of those fingerprints are extracted. The corresponding audio fingerprints and key values in the server are then retrieved based on the search target key values. If m arrays exist, that is, m key values and the audio fingerprints corresponding to those key values, they are stored as feature_i (i = 1, ..., m).

(6) Using the small hash table from step (2), the server node node_k for the key value of each feature_i (stored as key_i) is found, and feature_i is sent to node_k to be searched. After the intermediate results for all feature_i have been received, a final analysis is performed using the search algorithm, the final result is returned, and the audio information of the searched audio is obtained.
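The routing described in steps (5) and (6) can be illustrated with a small in-memory simulation. Everything below is an assumption made for illustration: servers are modeled as dictionaries, fingerprints are compared by exact equality, and the "final analysis" is replaced by a simple vote count, since the source does not specify the search algorithm. The per-key frequency counter mirrors keyf_i from step (3).

```python
from collections import Counter, defaultdict

def search(query_features, hash_table, servers, keyf):
    """Route each query feature only to the server holding its key value.

    query_features: list of (key_value, fingerprint) pairs from the query.
    hash_table: key_value -> server name (built at construction time).
    servers: server name -> {key_value: [(fingerprint, audio_id), ...]}.
    keyf: Counter of per-key search frequency, updated in place.
    """
    by_server = defaultdict(list)
    for key_value, fingerprint in query_features:
        keyf[key_value] += 1
        node = hash_table.get(key_value)
        if node is not None:  # unknown key values are skipped
            by_server[node].append((key_value, fingerprint))

    votes = Counter()  # aggregated intermediate results
    for node, features in by_server.items():
        for key_value, fingerprint in features:
            for stored_fp, audio_id in servers[node].get(key_value, []):
                if stored_fp == fingerprint:
                    votes[audio_id] += 1
    # Hypothetical final analysis: the audio with the most matches wins.
    best = votes.most_common(1)
    return best[0][0] if best else None

hash_table = {1: "node1", 2: "node2"}
servers = {
    "node1": {1: [("fp-a", "song-x")]},
    "node2": {2: [("fp-b", "song-x"), ("fp-c", "song-y")]},
}
keyf = Counter()
result = search([(1, "fp-a"), (2, "fp-b")], hash_table, servers, keyf)
print(result)  # song-x
```

Only the servers named in the hash table for the query's key values are touched, which is the point of the key-to-server correspondence.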

FIG. 6 is a structural diagram showing an apparatus for constructing an audio fingerprint database according to the present invention. As shown in FIG. 6, the apparatus includes an extraction unit 10, an analysis unit 11, an assignment unit 12, and a storage unit 13. The extraction unit 10 extracts audio fingerprints and key values of the audio fingerprints; the analysis unit 11 establishes a correspondence between each key value and the audio fingerprints having that key value; the assignment unit 12 assigns each key value and its corresponding audio fingerprints to a server; and the storage unit 13 constructs the audio fingerprint database by establishing a correspondence between the key value and the server.

The apparatus for constructing an audio fingerprint database according to the present invention further includes a calculation unit 14 and a reassignment unit 15. The calculation unit 14 calculates the number of searches of the arrays, and the reassignment unit 15 allocates the arrays to the servers according to the number of searches so as to minimize the load difference between the servers.

FIG. 7 is a structural diagram illustrating an audio fingerprint search apparatus in a server according to the present invention. As shown in FIG. 7, the audio fingerprint search apparatus includes an acquisition unit 20, a search unit 21, and a matching unit 22. The acquisition unit 20 acquires the search target audio fingerprints and the search target key values of the search target audio fingerprints; the search unit 21 looks up the server corresponding to the search target key value; and the matching unit 22 retrieves, in that server, the audio fingerprints corresponding to the search target audio fingerprints.

As shown in FIG. 7, the audio fingerprint search apparatus in the server further includes a generation unit 23, which generates audio information using the retrieved audio fingerprints.

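10: Extraction unit
11: Analysis unit
12: Assignment unit
13: Storage unit
14: Calculation unit
15: Reassignment unit
20: Acquisition unit
21: Search unit
22: Matching unit
23: Generation unit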

Claims

A method for constructing an audio fingerprint database, comprising:
extracting audio fingerprints and key values of the audio fingerprints;
establishing a correspondence between each key value and the audio fingerprints having that key value;
assigning each key value and its corresponding audio fingerprints to a server; and
constructing the audio fingerprint database by establishing a correspondence between the key value and the server.

The method according to claim 1,
wherein establishing the correspondence between the key value and the audio fingerprints having the same key value comprises classifying the audio fingerprints having the same key value into the same array.

The method of claim 2, further comprising:
calculating the number of searches of the arrays; and
allocating the arrays to the servers according to the number of searches to minimize the load difference between the servers.

The method of claim 3,
wherein calculating the number of searches of the array comprises:
calculating a search frequency of the key value in the array;
determining the number of audio fingerprints in the array; and
calculating the product of the search frequency and the number of audio fingerprints to obtain the number of searches of the array.

The method of claim 3,
wherein minimizing the load difference between the servers comprises
allocating the arrays to the servers using a greedy algorithm.