CN109726307A - An Internet audio big data storage and retrieval method - Google Patents
An Internet audio big data storage and retrieval method
- Publication number
- CN109726307A (application number CN201811604508.7A)
- Authority
- CN
- China
- Prior art keywords
- audio
- database
- segmentation
- key
- value pair
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to an Internet audio big data storage and retrieval method, which proceeds by the following steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then performed on the segmented audio file, during which the segmented audio is retrieved against and compared with the database. If an identical segment file exists in the database, a delete operation is performed on the segmented audio; if a similar file exists in the database, a storage operation may be performed on the segmented audio, and the storage operation is assigned a higher priority than the delete operation. After these steps are complete, the audio data is fed back: a display component presents the identical and similar audio found in the database. By segmenting the audio file uploaded by the user and then analyzing, rejecting, and storing the segments, the invention alleviates the redundancy problem of existing databases.
Description
Technical field
The present invention relates to the technical field of audio processing, and specifically to an Internet audio big data storage and retrieval method.
Background art
Existing popular music features, such as recognizing a song by listening or by humming, essentially work by extracting the distinctive features of the audio and recognizing them; to improve recognition accuracy, as many audio features as possible are retained. A piece of audio typically has thousands of features, so once the number of digital audio items reaches a certain order of magnitude, the audio features require a very large database to store, and database query speed drops sharply. Meanwhile, in the field of audio data storage and retrieval, fingerprint feature extraction and matching algorithms have achieved good results, but they are generally suited only to small- and medium-scale audio data processing experiments, have difficulty delivering accurate search, and need further optimization to meet commercial standards.
Summary of the invention
To remedy the above deficiencies, the present invention provides an Internet audio big data storage and retrieval method that addresses the problems described in the background art.
The technical solution of the invention is an Internet audio big data storage and retrieval method that proceeds by the following steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then performed on the segmented audio file, during which the segmented audio is retrieved against and compared with the database. If an identical segment file exists in the database, a delete operation is performed on the segmented audio; if a similar file exists in the database, a storage operation may be performed on the segmented audio, and the storage operation is assigned a higher priority than the delete operation. After these steps are complete, the audio data is fed back, that is, a display component presents the identical and similar audio found in the database.
Preferably, the database is HBase, an open-source column-oriented database. HBase introduces the concept of a column family, and data is stored in tables. A storage unit (cell) in a table is located by row and column, and each cell keeps multiple versions of the same piece of data, distinguished by timestamp. The row key is the unique identifier of a row in the table and serves as the primary key for retrieving records; a column is identified by a column family and a qualifier. In practice, a storage unit is uniquely determined by the row, the column family, and the qualifier.
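The storage model described above can be sketched in a few lines of Python. This is a toy in-memory stand-in, not the HBase client API: the class name `VersionedTable`, its `put` and `get` methods, the `max_versions` limit, and the caller-supplied integer timestamps are all assumptions made for illustration.

```python
from collections import defaultdict


class VersionedTable:
    """Toy in-memory model of an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row key, column family, qualifier) -> [(timestamp, value), ...], newest first
        self._cells = defaultdict(list)

    def put(self, row_key, family, qualifier, value, timestamp):
        """Write a new version of a cell and evict versions beyond max_versions."""
        versions = self._cells[(row_key, family, qualifier)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row_key, family, qualifier, timestamp=None):
        """Return the newest value, or the newest value at or before `timestamp`."""
        for ts, value in self._cells.get((row_key, family, qualifier), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None
```

The cell is addressed by the triple (row key, column family, qualifier), and the timestamp only selects among the versions kept for that cell, which mirrors the description in the text.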
Preferably, in the audio loading process, an audio processing program on an electronic terminal uploads the audio file to the database.
Preferably, the audio segmentation process segments the uploaded audio file: the audio signal is divided into fixed-size pieces (splits), and each split is further decomposed into key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn). The data system assigns a map function to each split; the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).
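A minimal sketch of the segmentation step follows. The patent does not say what R and N contain, so this example assumes that R is a digest of a short frame of audio bytes and N is the frame's position (split index, byte offset). The SHA-1 digest is only a placeholder for a real audio fingerprint, and the function names and the `split_size` and `frame_size` parameters are hypothetical.

```python
import hashlib


def split_audio(samples, split_size):
    """Cut raw audio bytes into fixed-size splits; the last split may be shorter."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return [samples[i:i + split_size] for i in range(0, len(samples), split_size)]


def map_split(split_index, split, frame_size):
    """Map one split to (R, N) pairs.

    Assumption: R is a truncated SHA-1 digest of the frame (a stand-in for a
    fingerprint) and N is the (split index, byte offset) of the frame.
    """
    pairs = []
    for offset in range(0, len(split), frame_size):
        frame = split[offset:offset + frame_size]
        r = hashlib.sha1(frame).hexdigest()[:16]
        pairs.append((r, (split_index, offset)))
    return pairs


def segment(samples, split_size=4096, frame_size=1024):
    """Apply the map function to every split and collect the intermediate pairs."""
    pairs = []
    for index, split in enumerate(split_audio(samples, split_size)):
        pairs.extend(map_split(index, split, frame_size))
    return pairs
```

In a distributed setting each call to `map_split` would run as a separate map task; here the calls run in a simple loop so the example stays self-contained.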
Preferably, the audio analysis process compares each (Rn, Nn) produced by audio segmentation with the files in the database. Key-value pairs (Rn, Nn) identical to ones already in the database are aggregated into sets (Rn, list(Nn)), different key-value pairs (Rm, Nm) into sets (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) into sets (Ri, list(Ni)); the sets are then distributed to the column-oriented open-source database HBase for processing.
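The grouping into identical, similar, and different sets can be sketched as below. The patent does not define "similar", so this example assumes integer keys and treats two keys as similar when their Hamming distance is at most `max_distance`; that metric, the threshold, and the function names are assumptions for illustration only.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of differing bits between two non-negative integer keys."""
    return bin(a ^ b).count("1")


def classify(pairs, stored_keys, max_distance=2):
    """Group (R, N) pairs into identical, similar, and different sets.

    Each returned dict maps a key R to list(N), matching the (R, list(N))
    sets described in the text. Similarity by Hamming distance is an
    assumption; the source does not specify a similarity measure.
    """
    identical = defaultdict(list)
    similar = defaultdict(list)
    different = defaultdict(list)
    for r, n in pairs:
        if r in stored_keys:
            identical[r].append(n)
        elif any(hamming(r, s) <= max_distance for s in stored_keys):
            similar[r].append(n)
        else:
            different[r].append(n)
    return dict(identical), dict(similar), dict(different)
```

The linear scan over `stored_keys` keeps the sketch short; at the scale the patent targets, a real system would need an index for near-match lookup.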
Preferably, the audio storage process classifies the packed sets of identical key-value pairs (Rn, list(Nn)), different key-value pairs (Rm, list(Nm)), and similar key-value pairs (Ri, list(Ni)). The set of identical key-value pairs is deleted, and the identical key-values already stored in the database are retrieved for the audio file; the set of different key-value pairs (Rm, list(Nm)) and the set of similar key-value pairs (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
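The storage step can be sketched as follows, under stated assumptions: a plain dict stands in for the HBase table, the report dict stands in for what the display component would show, and the function name `store_and_report` is hypothetical. Writes are performed before the identical set is discarded, which is one possible reading of the statement that storage has priority over deletion.

```python
def store_and_report(identical, similar, different, database):
    """Persist new sets, drop duplicates, and build a user-facing report.

    `database` is a dict mapping key R -> list of N values, used here as a
    stand-in for an HBase table (an assumption, not the patent's interface).
    """
    stored = []
    # Storage first: similar and different sets are written to the database.
    for group in (similar, different):
        for r, values in group.items():
            database.setdefault(r, []).extend(values)
            stored.append(r)
    # Identical sets are not written; the copy already in the database is reused.
    reused = {r: list(database[r]) for r in identical if r in database}
    return {
        "stored": sorted(stored),
        "discarded": sorted(identical),
        "reused": reused,
    }
```

For example, with `database = {240: [7]}`, calling `store_and_report({240: [0, 3]}, {241: [1]}, {15: [2]}, database)` leaves key 240 untouched, adds keys 241 and 15, and reports key 240 as discarded and reused.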
The advantage of the invention is that the audio file uploaded by the user is segmented and the segmented file is converted into key-value pairs; each group of key-value pairs is then analyzed against, rejected from, or stored in the column-oriented open-source database HBase. Identical key-values are rejected while the identical key-values already in the database are retrieved, similar and different key-values are stored, and the result is finally fed back to the user, which alleviates the redundancy problem of existing databases.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the audio data storage and retrieval system of the present invention.
Detailed description of the embodiments
The present invention is further illustrated in the following with reference to the drawings and specific embodiments.
As shown in Fig. 1, the Internet audio big data storage and retrieval method of the invention proceeds by the following steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then performed on the segmented audio file, during which the segmented audio is retrieved against and compared with the database. If an identical segment file exists in the database, a delete operation is performed on the segmented audio; if a similar file exists in the database, a storage operation may be performed on the segmented audio, and the storage operation is assigned a higher priority than the delete operation. After these steps are complete, the audio data is fed back, that is, a display component presents the identical and similar audio found in the database.
Specifically, the database is HBase, an open-source column-oriented database. HBase introduces the concept of a column family, and data is stored in tables. A storage unit (cell) in a table is located by row and column, and each cell keeps multiple versions of the same piece of data, distinguished by timestamp. The row key is the unique identifier of a row in the table and serves as the primary key for retrieving records; a column is identified by a column family and a qualifier. In practice, a storage unit is uniquely determined by the row, the column family, and the qualifier.
Specifically, in the audio loading process, an audio processing program on an electronic terminal uploads the audio file to the database.
Specifically, the audio segmentation process segments the uploaded audio file: the audio signal is divided into fixed-size pieces (splits), and each split is further decomposed into key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn). The data system assigns a map function to each split; the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).
Specifically, the audio analysis process compares each (Rn, Nn) produced by audio segmentation with the files in the database. Key-value pairs (Rn, Nn) identical to ones already in the database are aggregated into sets (Rn, list(Nn)), different key-value pairs (Rm, Nm) into sets (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) into sets (Ri, list(Ni)); the sets are then distributed to the column-oriented open-source database HBase for processing.
Specifically, the audio storage process classifies the packed sets of identical key-value pairs (Rn, list(Nn)), different key-value pairs (Rm, list(Nm)), and similar key-value pairs (Ri, list(Ni)). The set of identical key-value pairs is deleted, and the identical key-values already stored in the database are retrieved for the audio file; the set of different key-value pairs (Rm, list(Nm)) and the set of similar key-value pairs (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
When processing an audio file uploaded by the user, the invention first segments the file and converts the segmented file into key-value pairs; each group of key-value pairs is then analyzed against, rejected from, or stored in the column-oriented open-source database HBase. Identical key-values are rejected while the identical key-values already in the database are retrieved, similar and different key-values are stored, and the result is finally fed back to the user, which alleviates the redundancy problem of existing databases.
The above shows and describes the basic principles, main features, and advantages of the invention. Those skilled in the art should understand that the invention is not limited to the above embodiments; the embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.
Claims (6)
1. An Internet audio big data storage and retrieval method, characterized in that it proceeds by the following steps: the audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is then segmented by type; data analysis is then performed on the segmented audio file, during which the segmented audio is retrieved against and compared with the database; if an identical segment file exists in the database, a delete operation is performed on the segmented audio, or if a similar file exists in the database, a storage operation may be performed on the segmented audio, the storage operation being assigned a higher priority than the delete operation; after these steps are complete, the audio data is fed back, that is, a display component presents the identical and similar audio found in the database.
2. The Internet audio big data storage and retrieval method according to claim 1, characterized in that: the database is HBase, an open-source column-oriented database; HBase introduces the concept of a column family, and data is stored in tables; a storage unit in a table is located by row and column, and each data cell keeps multiple versions of the same piece of data, distinguished by timestamp; the row key is the unique identifier of a row in the table and serves as the primary key for retrieving records; a column is identified by a column family and a qualifier; in practice, a storage unit is uniquely determined by the row, the column family, and the qualifier.
3. The Internet audio big data storage and retrieval method according to claim 1, characterized in that: in the audio loading process, an audio processing program on an electronic terminal uploads the audio file to the database.
4. The Internet audio big data storage and retrieval method according to claim 1, characterized in that: the audio segmentation process segments the uploaded audio file by dividing the audio signal into fixed-size pieces (splits) and further decomposing each split into key-value pairs (R1, N1), (R2, N2), (R3, N3), ..., (Rn, Nn); the data system assigns a map function to each split, and the map function takes the key-value pairs (R1, N1), ... as input and produces a series of intermediate results (Rn, Nn).
5. The Internet audio big data storage and retrieval method according to claim 1, characterized in that: the audio analysis process compares each (Rn, Nn) produced by audio segmentation with the files in the database; key-value pairs (Rn, Nn) identical to ones already in the database are aggregated into sets (Rn, list(Nn)), different key-value pairs (Rm, Nm) into sets (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) into sets (Ri, list(Ni)), and the sets are distributed to the column-oriented open-source database HBase for processing.
6. The Internet audio big data storage and retrieval method according to claim 1, characterized in that: the audio storage process classifies the packed sets of identical key-value pairs (Rn, list(Nn)), different key-value pairs (Rm, list(Nm)), and similar key-value pairs (Ri, list(Ni)); the set of identical key-value pairs is deleted, and the identical key-values already stored in the database are retrieved for the audio file; the set of different key-value pairs (Rm, list(Nm)) and the set of similar key-value pairs (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811604508.7A CN109726307A (en) | 2018-12-26 | 2018-12-26 | An Internet audio big data storage and retrieval method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811604508.7A CN109726307A (en) | 2018-12-26 | 2018-12-26 | An Internet audio big data storage and retrieval method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109726307A (en) | 2019-05-07 |
Family
ID=66296484
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811604508.7A Pending CN109726307A (en) | An Internet audio big data storage and retrieval method | 2018-12-26 | 2018-12-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109726307A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915365A (en) * | 2012-10-24 | 2013-02-06 | 苏州两江科技有限公司 | Hadoop-based construction method for distributed search engine |
CN103455514A (en) * | 2012-06-01 | 2013-12-18 | 腾讯科技(深圳)有限公司 | Updating method and updating device for audio file |
CN104679892A (en) * | 2015-03-18 | 2015-06-03 | 成都影泰科技有限公司 | Medical image storing method |
CN104679847A (en) * | 2015-02-13 | 2015-06-03 | 王磊 | Method and equipment for building online real-time updating mass audio fingerprint database |
CN105117502A (en) * | 2015-10-13 | 2015-12-02 | 四川中科腾信科技有限公司 | Search method based on big data |
CN106407463A (en) * | 2016-10-11 | 2017-02-15 | 郑州云海信息技术有限公司 | Hadoop-based image processing method and system |
CN106776977A (en) * | 2016-12-06 | 2017-05-31 | 深圳前海勇艺达机器人有限公司 | Search for the method and device of music |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190507 |