CN109726307A - A cross-network interconnected audio big data storage and retrieval method - Google Patents


Info

Publication number
CN109726307A
Authority
CN
China
Prior art keywords
audio
database
segmentation
key
value pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811604508.7A
Other languages
Chinese (zh)
Inventor
黄健
程庚
陈雅芹
孙昊
殷国龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Kaijie Technology Co Ltd
Original Assignee
Hefei Kaijie Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Kaijie Technology Co Ltd
Priority to CN201811604508.7A
Publication of CN109726307A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a cross-network interconnected audio big data storage and retrieval method, carried out according to the following steps: an audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is segmented by type; and the segmented audio file is then analyzed. During analysis, the segmented audio is checked against the database. If an identical segment file exists in the database, a delete operation is performed on the corresponding segmented audio; if a similar file exists in the database, a store operation is performed on the corresponding segmented audio, and the store operation is given a higher priority than the delete operation. After these steps complete, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component. By segmenting the audio file uploaded by the user and simultaneously analyzing, rejecting, and storing the segments, the invention reduces redundancy in existing databases.

Description

A cross-network interconnected audio big data storage and retrieval method
Technical field
The present invention relates to the technical field of audio processing, and specifically to a cross-network interconnected audio big data storage and retrieval method.
Background technique
Existing popular music functions, such as recognizing a song by listening to it or by humming it, essentially work by extracting the distinctive feature values of the audio and matching them; to improve recognition accuracy, as many audio feature values as possible must be retained. A single audio clip typically has thousands of feature values, so once the number of digital audio items reaches a certain order of magnitude, a sufficiently large database is needed to store the feature values, and database query speed drops sharply. Meanwhile, in the field of audio data storage and retrieval, fingerprint feature extraction and matching algorithms have achieved good results, but they are generally suited to small and medium-scale audio data processing experiments. They struggle to deliver accurate search, and further optimization is needed to meet commercial standards.
Summary of the invention
To remedy the above deficiencies, the present invention provides a cross-network interconnected audio big data storage and retrieval method that solves the problems described in the background above.
The technical solution of the invention is a cross-network interconnected audio big data storage and retrieval method, carried out according to the following steps: an audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is segmented by type; and the segmented audio file is then analyzed. During analysis, the segmented audio is checked against the database. If an identical segment file exists in the database, a delete operation is performed on the corresponding segmented audio; if a similar file exists in the database, a store operation is performed on the corresponding segmented audio, and the store operation is given a higher priority than the delete operation. After these steps complete, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.
Preferably, the database is HBase, an open-source column-oriented database. HBase introduces the concept of the column family, and data is stored in tables. A table locates a storage cell by row and column, and each cell keeps multiple versions of the same data, distinguished by timestamp. A row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier. In effect, a storage cell is uniquely determined by row, column family, and qualifier.
Preferably, in the audio loading process, the audio file is uploaded to the database by an audio processing program on an electronic terminal.
Preferably, the audio segmentation process splits the uploaded audio file: the audio signal is divided into fixed-size pieces (splits), and each split is further decomposed into key-value pairs (R1, N1), (R2, N2), (R3, N3) ... (Rn, Nn). The data system assigns a map function to each split, which takes the key-value pairs (R1, N1) ... as input and produces a series of intermediate results (Rn, Nn).
Preferably, the audio analysis process compares each (Rn, Nn) produced by audio segmentation against the files in the database. Key-value pairs (Rn, Nn) identical to ones in the database are grouped into (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into (Ri, list(Ni)); the groups are then distributed to the column-oriented open-source database HBase for processing.
Preferably, the audio storage process classifies the packaged sets: the identical key-value pair set (Rn, list(Nn)), the different key-value pair set (Rm, list(Nm)), and the similar key-value pair set (Ri, list(Ni)). The identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file. The different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
The advantage of the invention is that the audio file uploaded by the user is segmented and the segments are converted into key-value pairs; each group of key-value pairs is then analyzed against the column-oriented open-source database HBase and rejected or stored accordingly. Identical key-value pairs are rejected and the matching key-value pairs already in the database are called up, while similar and different key-value pairs are stored. The result is finally fed back to the user, which reduces redundancy in the existing database.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the audio data storage and retrieval system of the present invention.
Specific embodiment
The present invention is further described below with reference to the drawing and specific embodiments.
As shown in Fig. 1, the cross-network interconnected audio big data storage and retrieval method of the present invention is carried out according to the following steps: an audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is segmented by type; and the segmented audio file is then analyzed. During analysis, the segmented audio is checked against the database. If an identical segment file exists in the database, a delete operation is performed on the corresponding segmented audio; if a similar file exists in the database, a store operation is performed on the corresponding segmented audio, and the store operation is given a higher priority than the delete operation. After these steps complete, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.
Specifically, the database is HBase, an open-source column-oriented database. HBase introduces the concept of the column family, and data is stored in tables. A table locates a storage cell by row and column, and each cell keeps multiple versions of the same data, distinguished by timestamp. A row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier. In effect, a storage cell is uniquely determined by row, column family, and qualifier.
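The cell addressing described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `CellStore` and its methods are hypothetical names introduced here, not the HBase client API, and integer timestamps stand in for the version markers.

```python
from collections import defaultdict


class CellStore:
    """In-memory stand-in for an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self):
        # (row, column family, qualifier) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, family, qualifier, value, timestamp):
        # A cell is addressed by row, column family, and qualifier;
        # each write adds a new timestamped version of that cell.
        self._cells[(row, family, qualifier)][timestamp] = value

    def get(self, row, family, qualifier, timestamp=None):
        versions = self._cells.get((row, family, qualifier))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # default to the newest version
        return versions.get(timestamp)

    def versions(self, row, family, qualifier):
        # All versions of one cell, newest first.
        cell = self._cells.get((row, family, qualifier), {})
        return sorted(cell.items(), reverse=True)


store = CellStore()
store.put("track-001", "seg", "0", b"\x01\x02", timestamp=1)
store.put("track-001", "seg", "0", b"\x03\x04", timestamp=2)
latest = store.get("track-001", "seg", "0")
```

Reading without a timestamp returns the newest version, while passing `timestamp=1` returns the older one, which mirrors the multi-version cells the text describes.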
Specifically, in the audio loading process, the audio file is uploaded to the database by an audio processing program on an electronic terminal.
Specifically, the audio segmentation process splits the uploaded audio file: the audio signal is divided into fixed-size pieces (splits), and each split is further decomposed into key-value pairs (R1, N1), (R2, N2), (R3, N3) ... (Rn, Nn). The data system assigns a map function to each split, which takes the key-value pairs (R1, N1) ... as input and produces a series of intermediate results (Rn, Nn).
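The segmentation step can be sketched as follows. The text does not define what R and N contain, so this example assumes, purely for illustration, that the key R is a SHA-256 digest of the split's bytes and the value N is the split's index; the function names `split_fixed`, `map_split`, and `segment` are likewise hypothetical.

```python
import hashlib


def split_fixed(data: bytes, size: int) -> list:
    """Divide raw audio bytes into fixed-size splits (the last one may be shorter)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]


def map_split(index: int, chunk: bytes) -> tuple:
    """Map one split to a (key, value) pair.

    Assumption: key = content digest of the split, value = split index.
    """
    return (hashlib.sha256(chunk).hexdigest(), index)


def segment(data: bytes, size: int) -> list:
    """Apply the map function to every split and collect the intermediate pairs."""
    return [map_split(i, chunk) for i, chunk in enumerate(split_fixed(data, size))]


pairs = segment(b"abcdabcdxy", 4)
```

With a content digest as the key, two splits holding the same bytes produce the same key, which is what lets a later step detect identical segments.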
Specifically, the audio analysis process compares each (Rn, Nn) produced by audio segmentation against the files in the database. Key-value pairs (Rn, Nn) identical to ones in the database are grouped into (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into (Ri, list(Ni)); the groups are then distributed to the column-oriented open-source database HBase for processing.
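The grouping into identical, similar, and different sets can be sketched as below. The text does not say how similarity is measured, so this example assumes integer fingerprints as keys and treats two keys as similar when their Hamming distance is at most `max_distance`; both the measure and the threshold are assumptions made only for illustration.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def classify(pairs, stored_keys, max_distance=2):
    """Group (key, value) pairs into identical / similar / different sets.

    Each group maps a key R to list(N), the values that share that key.
    """
    groups = {
        "identical": defaultdict(list),
        "similar": defaultdict(list),
        "different": defaultdict(list),
    }
    for key, value in pairs:
        if key in stored_keys:
            label = "identical"
        elif any(hamming(key, s) <= max_distance for s in stored_keys):
            label = "similar"
        else:
            label = "different"
        groups[label][key].append(value)
    return {label: dict(group) for label, group in groups.items()}


stored = {0b11110000}
incoming = [(0b11110000, 0), (0b11110001, 1), (0b00001111, 2), (0b11110000, 3)]
result = classify(incoming, stored)
```

The linear scan over `stored_keys` is fine for a sketch; a large database would need an index suited to the chosen similarity measure.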
Specifically, the audio storage process classifies the packaged sets: the identical key-value pair set (Rn, list(Nn)), the different key-value pair set (Rm, list(Nm)), and the similar key-value pair set (Ri, list(Ni)). The identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file. The different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
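One possible reading of the storage step is sketched here, with a plain dictionary standing in for the HBase table. The function `store_groups` and the shape of its report are hypothetical; running the store operations before handling the identical set is one interpretation of the stated priority of storing over deleting.

```python
def store_groups(groups, database):
    """Apply the storage step to already-classified groups.

    groups:   {"identical": {R: [N, ...]}, "similar": {...}, "different": {...}}
    database: dict standing in for the HBase table (key -> list of values)
    Returns a report that a display component could present to the user.
    """
    report = {"reused": [], "stored": []}

    # Store operations run first (assumed reading of "store has priority").
    for label in ("similar", "different"):
        for key, values in groups.get(label, {}).items():
            database.setdefault(key, []).extend(values)
            report["stored"].append(key)

    # Identical sets are dropped: nothing new is written, and the copy
    # already in the database is the one reported back for this upload.
    for key in groups.get("identical", {}):
        report["reused"].append(key)

    return report


db = {"a": [0]}
classified = {"identical": {"a": [5]}, "similar": {"b": [1]}, "different": {"c": [2, 3]}}
summary = store_groups(classified, db)
```

After this call the existing entry for `"a"` is untouched, so uploading a segment that is already present adds no redundant data.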
In processing an audio file uploaded by the user, the present invention first segments the file and converts the segments into key-value pairs; each group of key-value pairs is then analyzed against the column-oriented open-source database HBase and rejected or stored accordingly. Identical key-value pairs are rejected and the matching key-value pairs already in the database are called up, while similar and different key-value pairs are stored. The result is finally fed back to the user, which reduces redundancy in the existing database.
The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments; the embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the claimed scope of protection. The claimed scope of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. A cross-network interconnected audio big data storage and retrieval method, characterized in that it is carried out according to the following steps: an audio file to be stored or retrieved is uploaded and loaded into a database; the uploaded audio file is segmented by type; the segmented audio file is then analyzed; during analysis, the segmented audio is checked against the database; if an identical segment file exists in the database, a delete operation is performed on the corresponding segmented audio; if a similar file exists in the database, a store operation is performed on the corresponding segmented audio, and the store operation is given a higher priority than the delete operation; after these steps complete, the audio data is fed back, that is, the identical and similar audio in the database is presented by a display component.
2. The cross-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the database is HBase, an open-source column-oriented database; HBase introduces the concept of the column family, and data is stored in tables; a table locates a storage cell by row and column, and each cell keeps multiple versions of the same data, distinguished by timestamp; a row is uniquely identified within the table and serves as the primary key for retrieving records; a column is identified by column family and qualifier; in effect, a storage cell is uniquely determined by row, column family, and qualifier.
3. The cross-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: in the audio loading process, the audio file is uploaded to the database by an audio processing program on an electronic terminal.
4. The cross-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio segmentation process splits the uploaded audio file; the audio signal is divided into fixed-size pieces (splits), and each split is further decomposed into key-value pairs (R1, N1), (R2, N2), (R3, N3) ... (Rn, Nn); the data system assigns a map function to each split, which takes the key-value pairs (R1, N1) ... as input and produces a series of intermediate results (Rn, Nn).
5. The cross-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio analysis process compares each (Rn, Nn) produced by audio segmentation against the files in the database; key-value pairs (Rn, Nn) identical to ones in the database are grouped into (Rn, list(Nn)), different key-value pairs (Rm, Nm) are grouped into (Rm, list(Nm)), and similar key-value pairs (Ri, Ni) are grouped into (Ri, list(Ni)); the groups are distributed to the column-oriented open-source database HBase for processing.
6. The cross-network interconnected audio big data storage and retrieval method according to claim 1, characterized in that: the audio storage process classifies the packaged identical key-value pair set (Rn, list(Nn)), different key-value pair set (Rm, list(Nm)), and similar key-value pair set (Ri, list(Ni)); the identical key-value pair set is deleted, and the identical key-value pairs already stored in the database are called up for the audio file; the different key-value pair set (Rm, list(Nm)) and the similar key-value pair set (Ri, list(Ni)) are then stored in the HBase database, and the processing result is fed back to the user through the display component.
CN201811604508.7A 2018-12-26 2018-12-26 A cross-network interconnected audio big data storage and retrieval method Pending CN109726307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811604508.7A CN109726307A (en) 2018-12-26 2018-12-26 A cross-network interconnected audio big data storage and retrieval method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811604508.7A CN109726307A (en) 2018-12-26 2018-12-26 A cross-network interconnected audio big data storage and retrieval method

Publications (1)

Publication Number Publication Date
CN109726307A (en) 2019-05-07

Family

ID=66296484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811604508.7A Pending CN109726307A (en) 2018-12-26 2018-12-26 A cross-network interconnected audio big data storage and retrieval method

Country Status (1)

Country Link
CN (1) CN109726307A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915365A (en) * 2012-10-24 2013-02-06 苏州两江科技有限公司 Hadoop-based construction method for distributed search engine
CN103455514A (en) * 2012-06-01 2013-12-18 腾讯科技(深圳)有限公司 Updating method and updating device for audio file
CN104679892A (en) * 2015-03-18 2015-06-03 成都影泰科技有限公司 Medical image storing method
CN104679847A (en) * 2015-02-13 2015-06-03 王磊 Method and equipment for building online real-time updating mass audio fingerprint database
CN105117502A (en) * 2015-10-13 2015-12-02 四川中科腾信科技有限公司 Search method based on big data
CN106407463A (en) * 2016-10-11 2017-02-15 郑州云海信息技术有限公司 Hadoop-based image processing method and system
CN106776977A (en) * 2016-12-06 2017-05-31 深圳前海勇艺达机器人有限公司 Search for the method and device of music

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190507