CN109726307A

CN109726307A - An inter-network interconnected audio big data storage and retrieval method

Info

Publication number: CN109726307A
Application number: CN201811604508.7A
Authority: CN
Inventors: 黄健; 程庚; 陈雅芹; 孙昊; 殷国龙
Original assignee: Hefei Kaijie Technology Co Ltd
Current assignee: Hefei Kaijie Technology Co Ltd
Priority date: 2018-12-26
Filing date: 2018-12-26
Publication date: 2019-05-07

Abstract

The present invention relates to an inter-network interconnected audio big data storage and retrieval method, which proceeds according to the following operating steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then carried out on the segmented audio file. During analysis, the segmented audio file is compared against the database by retrieval. If an identical segment file exists in the database, a delete operation is executed on the newly segmented audio; if a similar file exists in the database, a storage operation may be executed on the newly segmented audio, and the priority set for the latter operation is greater than that of the former. After the above process has completed, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component. The present invention segments the audio file uploaded by the user and simultaneously analyzes, rejects, and stores the segments, mitigating the redundancy problem of existing databases.

Description

An inter-network interconnected audio big data storage and retrieval method

Technical field

The present invention relates to the technical field of audio signal processing, and specifically to an inter-network interconnected audio big data storage and retrieval method.

Background technique

Existing popular music functions such as recognizing a song by listening or by humming essentially work by extracting the unique feature quantities of the audio and recognizing them; to improve recognition accuracy, as many audio feature quantities as possible must be retained. A single piece of audio typically has thousands of feature quantities, and once the amount of digital audio reaches a certain order of magnitude, the audio feature quantities require a sufficiently large database for storage, which causes database query speed to drop sharply. Meanwhile, in the field of audio data storage and retrieval, fingerprint feature extraction and matching algorithms have achieved good results, but they are generally suited to small- and medium-scale audio data processing experiments, have difficulty achieving accurate search, and need further optimization to meet commercial standards.

Summary of the invention

To remedy the above deficiencies, the present invention provides an inter-network interconnected audio big data storage and retrieval method, so as to solve the problems described in the background art above.

The technical solution of the present invention is an inter-network interconnected audio big data storage and retrieval method, which proceeds according to the following operating steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then carried out on the segmented audio file. During analysis, the segmented audio file is compared against the database by retrieval. If an identical segment file exists in the database, a delete operation is executed on the newly segmented audio; if a similar file exists in the database, a storage operation may be executed on the newly segmented audio, and the priority set for the latter operation is greater than that of the former. After the above process has completed, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.

Preferably, the database is HBase, an open-source column-oriented database. HBase introduces the concept of the column family, and data is stored in the form of tables. A table locates a storage cell by row and column, and each cell keeps multiple versions of the same piece of data, distinguished by timestamp. A row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier. In effect, a storage cell is uniquely determined by row, column family, and qualifier.
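The cell addressing described above can be illustrated with a small in-memory model. This is only a sketch of the data model, not the HBase client API; the class name `CellStore`, its methods, and the example row key, column family, and qualifier are all hypothetical.

```python
from collections import defaultdict


class CellStore:
    """Toy model of HBase-style addressing (hypothetical, not the HBase API).

    A cell is located by (row, column family, qualifier); each cell keeps
    several versions of its value, distinguished by timestamp.
    """

    def __init__(self):
        # (row, family, qualifier) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, family, qualifier, value, timestamp):
        """Write one version of a cell."""
        self._cells[(row, family, qualifier)][timestamp] = value

    def get(self, row, family, qualifier, timestamp=None):
        """Return the requested version, or the newest one by default."""
        versions = self._cells.get((row, family, qualifier))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def versions(self, row, family, qualifier):
        """List the stored timestamps of a cell, newest first."""
        return sorted(self._cells.get((row, family, qualifier), {}), reverse=True)


store = CellStore()
store.put("seg-0001", "audio", "fingerprint", "a1b2", timestamp=1)
store.put("seg-0001", "audio", "fingerprint", "c3d4", timestamp=2)
latest = store.get("seg-0001", "audio", "fingerprint")
```

Here the row key plays the role of the primary key, while the family and qualifier together name the column, so a lookup without a timestamp returns the most recent version of that cell.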

Preferably, in the audio loading process, the audio file is uploaded to the database by the audio processing program of an electronic terminal.

Preferably, the audio segmentation process performs segmentation on the uploaded audio file: the audio signal is divided into pieces (splits) of fixed size, and each split is then further decomposed into a series of key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn). The data system assigns one map function to each split; the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).
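The segmentation step can be sketched as follows. The text does not say what the keys R and values N contain, so this sketch assumes, purely for illustration, that each fixed-size split enters the map function as an (offset, bytes) pair and leaves it as a (content digest, offset) pair. The function names `to_input_pairs`, `map_fn`, and `segment` are hypothetical.

```python
import hashlib


def to_input_pairs(data: bytes, size: int):
    """Cut the audio bytes into fixed-size splits, keyed by byte offset."""
    if size <= 0:
        raise ValueError("split size must be positive")
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def map_fn(offset: int, chunk: bytes):
    """Map one input pair to an intermediate pair (R, N).

    Assumption: R is a SHA-256 digest of the split and N is its offset.
    """
    return hashlib.sha256(chunk).hexdigest(), offset


def segment(data: bytes, size: int):
    """Apply the map function to every split and collect the results."""
    return [map_fn(offset, chunk) for offset, chunk in to_input_pairs(data, size)]


pairs = segment(b"abcdabcdxy", 4)
```

With this choice of key, two splits with the same content produce the same R, which is what makes the later comparison against the database possible. A real system would more likely derive R from an audio fingerprint than from a byte digest.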

Preferably, the audio analysis process compares each (Rn, Nn) obtained from audio segmentation against the files in the database. Key-value pairs (Rn, Nn) that are identical to ones present in the database are grouped into the set (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into the set (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into the set (Ri, list(Ni)); the sets are then distributed to the HBase open-source column-oriented database for processing.
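A minimal sketch of this grouping is given below. The text does not define how similarity is measured, so the sketch assumes that keys are integer bit fingerprints and that two keys are "similar" when their Hamming distance is at most a threshold. The names `hamming`, `classify`, and `max_distance` are hypothetical.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def classify(pairs, stored_keys, max_distance=2):
    """Group (R, N) pairs into identical / similar / different sets.

    Each group maps a key R to list(N), mirroring (R, list(N)) in the text.
    The similarity test (Hamming distance) is an assumption of this sketch.
    """
    groups = {
        "identical": defaultdict(list),
        "similar": defaultdict(list),
        "different": defaultdict(list),
    }
    for key, value in pairs:
        if key in stored_keys:
            label = "identical"
        elif any(hamming(key, s) <= max_distance for s in stored_keys):
            label = "similar"
        else:
            label = "different"
        groups[label][key].append(value)
    return {label: dict(group) for label, group in groups.items()}


stored = {0b11110000}
new_pairs = [(0b11110000, 0), (0b11110001, 1), (0b00001111, 2), (0b11110000, 3)]
groups = classify(new_pairs, stored)
```

The linear scan over `stored_keys` is only suitable for illustration; at the scale the document targets, the similarity lookup would need an index.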

Preferably, the audio storage process classifies the packed identical key-value pair set (Rn, list(Nn)), different key-value pair set (Rm, list(Nm)), and similar key-value pair set (Ri, list(Ni)). The identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file; the different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
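The storage step might look like the sketch below. It assumes the three sets arrive as a dictionary with the hypothetical keys "identical", "similar", and "different", each mapping a key R to list(N), and it uses a plain Python dict in place of the HBase table. The function name `store_groups` and the shape of the returned report are assumptions for illustration.

```python
def store_groups(database, groups):
    """Drop identical sets, store similar and different sets, build a report.

    `database` is a plain dict standing in for the HBase table (assumption).
    """
    identical = groups.get("identical", {})
    # Identical pairs are not written again; the stored copies are reused.
    reused = {key: list(database[key]) for key in identical if key in database}

    stored = []
    for label in ("similar", "different"):
        for key, values in groups.get(label, {}).items():
            database.setdefault(key, []).extend(values)
            stored.append(key)

    # The report is what a display component could present to the user.
    return {"skipped": sorted(identical), "reused": reused, "stored": stored}


db = {240: [0]}
incoming = {"identical": {240: [5]}, "similar": {241: [1]}, "different": {15: [2]}}
report = store_groups(db, incoming)
```

Skipping the identical set is where the redundancy saving comes from: a segment that is already present costs a lookup rather than a second copy.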

The advantage of the invention is that the audio file uploaded by the user is segmented and the segmented file is converted into key-value pairs; each group of key-value pairs is then simultaneously analyzed against the HBase open-source column-oriented database, rejected, or stored. Identical key-value pairs are rejected while the identical key-value pairs already in the database are called up, similar and different key-value pairs are then stored, and the result is finally fed back to the user, which mitigates the redundancy problem of existing databases.

Brief description of the drawings

In order to explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention, and a person of ordinary skill in the art can obtain other drawings from them without any creative effort.

Fig. 1 is a flow chart of the audio data storage and retrieval system of the present invention.

Specific embodiment

The present invention is further described below with reference to the drawings and specific embodiments.

As shown in Fig. 1, the inter-network interconnected audio big data storage and retrieval method of the present invention proceeds according to the following operating steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then carried out on the segmented audio file. During analysis, the segmented audio file is compared against the database by retrieval. If an identical segment file exists in the database, a delete operation is executed on the newly segmented audio; if a similar file exists in the database, a storage operation may be executed on the newly segmented audio, and the priority set for the latter operation is greater than that of the former. After the above process has completed, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.

Specifically, the database is HBase, an open-source column-oriented database. HBase introduces the concept of the column family, and data is stored in the form of tables. A table locates a storage cell by row and column, and each cell keeps multiple versions of the same piece of data, distinguished by timestamp. A row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier. In effect, a storage cell is uniquely determined by row, column family, and qualifier.

Specifically, in the audio loading process, the audio file is uploaded to the database by the audio processing program of an electronic terminal.

Specifically, the audio segmentation process performs segmentation on the uploaded audio file: the audio signal is divided into pieces (splits) of fixed size, and each split is then further decomposed into a series of key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn). The data system assigns one map function to each split; the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).

Specifically, the audio analysis process compares each (Rn, Nn) obtained from audio segmentation against the files in the database. Key-value pairs (Rn, Nn) that are identical to ones present in the database are grouped into the set (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into the set (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into the set (Ri, list(Ni)); the sets are then distributed to the HBase open-source column-oriented database for processing.

Specifically, the audio storage process classifies the packed identical key-value pair set (Rn, list(Nn)), different key-value pair set (Rm, list(Nm)), and similar key-value pair set (Ri, list(Ni)). The identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file; the different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.

When processing an audio file uploaded by the user, the present invention first segments the file and converts the segmented file into key-value pairs; each group of key-value pairs is then simultaneously analyzed against the HBase open-source column-oriented database, rejected, or stored. Identical key-value pairs are rejected while the identical key-value pairs already in the database are called up, similar and different key-value pairs are then stored, and the result is finally fed back to the user, which mitigates the redundancy problem of existing databases.

The basic principle, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited to the above embodiments; the above embodiments and the description merely illustrate the principle of the invention. Various changes and improvements may be made to the invention without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed by the invention is defined by the appended claims and their equivalents.

Claims

1. An inter-network interconnected audio big data storage and retrieval method, characterized in that it proceeds according to the following operating steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then carried out on the segmented audio file; during analysis, the segmented audio file is compared against the database by retrieval; if an identical segment file exists in the database, a delete operation is executed on the newly segmented audio, or if a similar file exists in the database, a storage operation may be executed on the newly segmented audio, and the priority set for the latter operation is greater than that of the former; after the above process has completed, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.

2. The inter-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the database is HBase, an open-source column-oriented database; HBase introduces the concept of the column family, and data is stored in the form of tables; a table locates a storage cell by row and column, and each cell keeps multiple versions of the same piece of data, distinguished by timestamp; a row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier; in effect, a storage cell is uniquely determined by row, column family, and qualifier.

3. The inter-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: in the audio loading process, the audio file is uploaded to the database by the audio processing program of an electronic terminal.

4. The inter-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio segmentation process performs segmentation on the uploaded audio file; the audio signal is divided into pieces (splits) of fixed size, and each split is then further decomposed into a series of key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn); the data system assigns one map function to each split, and the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).

5. The inter-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio analysis process compares each (Rn, Nn) obtained from audio segmentation against the files in the database; key-value pairs (Rn, Nn) that are identical to ones present in the database are grouped into the set (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into the set (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into the set (Ri, list(Ni)); the sets are then distributed to the HBase open-source column-oriented database for processing.

6. The inter-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio storage process classifies the packed identical key-value pair set (Rn, list(Nn)), different key-value pair set (Rm, list(Nm)), and similar key-value pair set (Ri, list(Ni)); the identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file; the different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.