CN107368609B - Method, apparatus, and computer-readable storage medium for obtaining a multimedia file - Google Patents
Method, apparatus, and computer-readable storage medium for obtaining a multimedia file
- Publication number
- CN107368609B CN201710679015.9A
- Authority
- CN
- China
- Prior art keywords
- note
- subsequence
- multimedia file
- notes
- multiplicity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
- G06F16/634—Query by example, e.g. query by humming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
Abstract
The invention discloses a method, an apparatus, and a computer-readable storage medium for obtaining a multimedia file, and belongs to the field of network communication technology. The method includes: extracting a reference note sequence from a collected voice signal, the reference note sequence including multiple notes; for any multimedia file in a multimedia file library, when the note sequence of the multimedia file has a repetitive structure, obtaining a benchmark note subsequence of the multimedia file, where the benchmark note subsequence includes at least one note and contains fewer notes than the multimedia file; determining the matching degree between the voice signal and the multimedia file according to the reference note sequence and the benchmark note subsequence of the multimedia file; and obtaining, from the multimedia file library, a target multimedia file whose matching degree meets a preset condition. The invention improves the efficiency of obtaining multimedia files.
Description
Technical field
The present invention relates to the field of network communication technology, and in particular to a method, an apparatus, and a computer-readable storage medium for obtaining a multimedia file.
Background
At present, most terminals support music software, and most music software provides a song recognition function. When a user does not know the title of a song, the user can hum the melody of the song to the terminal, and the terminal uses the song recognition function to find the song corresponding to the melody on a multimedia server.
To search for the song corresponding to the melody, the terminal collects the voice signal input by the user and sends it to the multimedia server. The multimedia server receives the voice signal, extracts its pitch sequence, calculates the matching degree between this pitch sequence and the pitch sequence of each song in the song library, selects the song with the highest matching degree from the library, and sends the selected song to the terminal.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
A song usually lasts about 4 minutes, so its pitch sequence contains more than 100 pitches. Calculating the matching degree between the pitch sequence of the voice signal and the pitch sequence of every song in the library is therefore time-consuming for the multimedia server, which makes obtaining a song through the terminal inefficient.
Summary of the invention
To solve the problems in the prior art, the present invention provides a method, an apparatus, and a computer-readable storage medium for obtaining a multimedia file. The technical solutions are as follows:
In one aspect, the present invention provides a method for obtaining a multimedia file, the method including:
extracting a reference note sequence from a collected voice signal, the reference note sequence including multiple notes;
for any multimedia file in a multimedia file library, when the note sequence of the multimedia file has a repetitive structure, obtaining a benchmark note subsequence of the multimedia file, where the benchmark note subsequence includes at least one note and contains fewer notes than the multimedia file;
determining the matching degree between the voice signal and the multimedia file according to the reference note sequence and the benchmark note subsequence of the multimedia file;
obtaining, from the multimedia file library, a target multimedia file whose matching degree meets a preset condition, according to the matching degree between the voice signal and the multimedia file.
In one possible implementation, before the benchmark note subsequence of the multimedia file is obtained, the method further includes:
dividing the note sequence of the multimedia file into multiple note subsequences, each note subsequence including at least one note;
determining the multiplicity between the note subsequences based on a preset multiplicity algorithm;
if the multiplicity between the note subsequences is greater than a preset multiplicity, determining that the note sequence of the multimedia file has a repetitive structure.
In one possible implementation, determining the multiplicity between the note subsequences based on the preset multiplicity algorithm includes:
based on a similarity matrix algorithm, determining at least one similarity matrix between the note subsequences, determining the characteristic value of each similarity matrix, and determining the multiplicity between the note subsequences according to the characteristic value of each similarity matrix; or,
based on a cross-correlation algorithm, determining at least one cross-correlation degree between the note subsequences, and determining the multiplicity between the note subsequences according to each cross-correlation degree; or,
based on an edit distance algorithm, determining at least one edit distance between the note subsequences, and determining the multiplicity between the note subsequences according to each edit distance; or,
based on an EMD distance algorithm, determining at least one EMD distance between the note subsequences, and determining the multiplicity between the note subsequences according to each EMD distance.
In one possible implementation, obtaining the benchmark note subsequence of the multimedia file includes:
randomly selecting one note subsequence from the multiple note subsequences as the benchmark note subsequence of the multimedia file; or,
selecting, from the multiple note subsequences, the note subsequence containing the most notes as the benchmark note subsequence of the multimedia file; or,
selecting, from the multiple note subsequences, the note subsequence containing the fewest notes as the benchmark note subsequence of the multimedia file.
In one possible implementation, the intersection between two adjacent note subsequences contains a preset number of notes, where the preset number is an integer greater than or equal to 0 and less than a specified value, and the specified value is the quotient of the number of notes in the multimedia file and the number of note subsequences into which it is divided.
In one possible implementation, a note includes a pitch and/or a duration, and the pitch is the absolute pitch of the note or the relative pitch between two adjacent notes.
In another aspect, the present invention provides an apparatus for obtaining a multimedia file, the apparatus including:
an extraction module, configured to extract a reference note sequence from a collected voice signal, the reference note sequence including multiple notes;
a first acquisition module, configured to, for any multimedia file in a multimedia file library, obtain a benchmark note subsequence of the multimedia file when the note sequence of the multimedia file has a repetitive structure, where the benchmark note subsequence includes at least one note and contains fewer notes than the multimedia file;
a determining module, configured to determine the matching degree between the voice signal and the multimedia file according to the reference note sequence and the benchmark note subsequence of the multimedia file;
a second acquisition module, configured to obtain, from the multimedia file library, a target multimedia file whose matching degree meets a preset condition, according to the matching degree between the voice signal and the multimedia file.
In one possible implementation, the apparatus further includes:
a division module, configured to divide the note sequence of the multimedia file into multiple note subsequences, each note subsequence including at least one note;
the determining module is further configured to determine the multiplicity between the note subsequences based on a preset multiplicity algorithm;
the determining module is further configured to determine that the note sequence of the multimedia file has a repetitive structure if the multiplicity between the note subsequences is greater than a preset multiplicity.
In one possible implementation, the determining module is further configured to, based on a similarity matrix algorithm, determine at least one similarity matrix between the note subsequences, determine the characteristic value of each similarity matrix, and determine the multiplicity between the note subsequences according to the characteristic value of each similarity matrix; or,
the determining module is further configured to, based on a cross-correlation algorithm, determine at least one cross-correlation degree between the note subsequences, and determine the multiplicity between the note subsequences according to each cross-correlation degree; or,
the determining module is further configured to, based on an edit distance algorithm, determine at least one edit distance between the note subsequences, and determine the multiplicity between the note subsequences according to each edit distance; or,
the determining module is further configured to, based on an EMD distance algorithm, determine at least one EMD distance between the note subsequences, and determine the multiplicity between the note subsequences according to each EMD distance.
In one possible implementation, the first acquisition module is further configured to randomly select one note subsequence from the multiple note subsequences as the benchmark note subsequence of the multimedia file; or,
the first acquisition module is further configured to select, from the multiple note subsequences, the note subsequence containing the most notes as the benchmark note subsequence of the multimedia file; or,
the first acquisition module is further configured to select, from the multiple note subsequences, the note subsequence containing the fewest notes as the benchmark note subsequence of the multimedia file.
In one possible implementation, the intersection between two adjacent note subsequences contains a preset number of notes, where the preset number is an integer greater than or equal to 0 and less than a specified value, and the specified value is the quotient of the number of notes in the multimedia file and the number of note subsequences into which it is divided.
In one possible implementation, a note includes a pitch and/or a duration, and the pitch is the absolute pitch of the note or the relative pitch between two adjacent notes.
In another aspect, the present invention provides an apparatus for obtaining a multimedia file, the apparatus including a processor and a memory. At least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the method according to any implementation of the first aspect.
In another aspect, the present invention provides a computer-readable storage medium. At least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor to implement the method according to any implementation of the first aspect.
The technical solutions provided by the embodiments of the present invention bring the following beneficial effects. For a multimedia file whose note sequence has a repetitive structure, the benchmark note subsequence of the multimedia file is obtained; the matching degree between the voice signal and the multimedia file is determined according to the reference note sequence of the voice signal and the benchmark note subsequence of the multimedia file; and, based on the matching degree, a target multimedia file whose matching degree meets a preset condition is obtained from the multimedia file library. Because the benchmark note subsequence contains fewer notes than the multimedia file, determining the matching degree from the reference note sequence and the benchmark note subsequence reduces the calculation time and improves the efficiency of obtaining multimedia files.
Brief description of the drawings
Fig. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;
Fig. 4 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an apparatus for obtaining a multimedia file according to an embodiment of the present invention;
Fig. 6 is a block diagram of a multimedia server according to an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In the prior art, when a multimedia server recommends a multimedia file to a user based on a collected voice signal, the multimedia server calculates the matching degree between the pitch sequence of the collected voice signal and the pitch sequence of each multimedia file in the multimedia file library, and recommends a multimedia file to the user based on these matching degrees. However, because the pitch sequence of a multimedia file contains many pitches, it is time-consuming for the multimedia server to calculate the matching degree between the pitch sequence of the collected voice signal and the pitch sequence of each multimedia file, so obtaining a multimedia file is inefficient.
To improve the efficiency of obtaining multimedia files, in the embodiments of the present invention, for a multimedia file whose note sequence has a repetitive structure, a partial note sequence is extracted from the multimedia file. For ease of description, this partial note sequence is called the benchmark note subsequence. The matching degree between the benchmark note subsequence and the reference note sequence of the collected voice signal is calculated directly. Because the benchmark note subsequence contains fewer notes than the multimedia file, the calculation time is reduced and the efficiency of obtaining multimedia files is improved.
Fig. 1 shows an implementation environment according to an embodiment of the present invention. Referring to Fig. 1, the implementation environment includes a terminal 101 and a multimedia server 102, which are connected through a communication network.
An application associated with the multimedia server 102 runs on the terminal 101, and the terminal 101 can interact with the multimedia server 102 through the application. For example, the terminal 101 logs in to the application based on a user identifier, or logs in to the application directly, in order to interact with the multimedia server 102. The application may be any of various applications, such as an audio application or a video application. The user identifier may be a user account, a telephone number, or the like, which is not limited in the embodiments of the present invention.
The terminal 101 may be a mobile phone, a PAD (portable android device, that is, a tablet computer), a computer, or the like. The multimedia server 102 may be a single server, a server cluster composed of several servers, or a cloud computing service center, which is not limited in the embodiments of the present invention. The multimedia server 102 may be a video server or an audio server.
An embodiment of the present invention provides a method for obtaining a multimedia file. The method is applied in a multimedia server. Referring to Fig. 2, the method includes:
Step 201: Extract a reference note sequence from a collected voice signal, the reference note sequence including multiple notes.
Step 202: For any multimedia file in a multimedia file library, when the note sequence of the multimedia file has a repetitive structure, obtain a benchmark note subsequence of the multimedia file, where the benchmark note subsequence includes at least one note and contains fewer notes than the multimedia file.
Step 203: Determine the matching degree between the voice signal and the multimedia file according to the reference note sequence and the benchmark note subsequence of the multimedia file.
Step 204: According to the matching degree between the voice signal and the multimedia file, obtain from the multimedia file library a target multimedia file whose matching degree meets a preset condition.
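Steps 201 to 204 can be sketched as a single selection loop. This is only an illustration under stated assumptions: the names `toy_match` and `pick_target`, the position-wise scoring rule, the threshold, and the fallback to the full note sequence for files without a benchmark subsequence are all hypothetical and are not specified by the text.

```python
from typing import Callable, Dict, Optional, Sequence

Notes = Sequence[int]


def toy_match(reference: Notes, candidate: Notes) -> float:
    """Hypothetical matching degree: share of reference notes that equal the
    candidate note at the same position. Not the measure used in the text."""
    if not reference:
        return 0.0
    hits = sum(1 for r, c in zip(reference, candidate) if r == c)
    return hits / len(reference)


def pick_target(
    reference: Notes,
    library: Dict[str, Notes],
    benchmarks: Dict[str, Notes],
    match: Callable[[Notes, Notes], float] = toy_match,
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the best-matching file name, or None if no score reaches threshold.

    Assumed preset condition: highest matching degree that is >= threshold.
    """
    best_name, best_score = None, threshold
    for name, full_sequence in library.items():
        # Files with a repetitive structure are compared through their shorter
        # benchmark subsequence; the fallback to the full sequence is an assumption.
        candidate = benchmarks.get(name, full_sequence)
        score = match(reference, candidate)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


library = {
    "song_a": [60, 62, 64, 65, 60, 62, 64, 65],
    "song_b": [70, 71, 72, 73, 74, 75],
}
benchmarks = {"song_a": [60, 62, 64, 65]}  # only song_a repeats
print(pick_target([60, 62, 64, 65], library, benchmarks))  # song_a
```

The point of the sketch is the saving in Step 203: for `song_a` the scorer sees four notes instead of eight.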
In one possible implementation, before the benchmark note subsequence of the multimedia file is obtained, the method further includes:
dividing the note sequence of the multimedia file into multiple note subsequences, each note subsequence including at least one note;
determining the multiplicity between the note subsequences based on a preset multiplicity algorithm;
if the multiplicity between the note subsequences is greater than a preset multiplicity, determining that the note sequence of the multimedia file has a repetitive structure.
In one possible implementation, determining the multiplicity between the note subsequences based on the preset multiplicity algorithm includes:
based on a similarity matrix algorithm, determining at least one similarity matrix between the note subsequences, determining the characteristic value of each similarity matrix, and determining the multiplicity between the note subsequences according to the characteristic value of each similarity matrix; or,
based on a cross-correlation algorithm, determining at least one cross-correlation degree between the note subsequences, and determining the multiplicity between the note subsequences according to each cross-correlation degree; or,
based on an edit distance algorithm, determining at least one edit distance between the note subsequences, and determining the multiplicity between the note subsequences according to each edit distance; or,
based on an EMD distance algorithm, determining at least one EMD distance between the note subsequences, and determining the multiplicity between the note subsequences according to each EMD distance.
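As an illustration of the edit distance alternative listed above, the following sketch computes the standard Levenshtein distance between two note subsequences. The text does not say how an edit distance is converted into a multiplicity, so the normalization in `multiplicity_from_edit_distance` is an assumption made only for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two note sequences (insert, delete,
    substitute, each with cost 1), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def multiplicity_from_edit_distance(a, b):
    """Assumed mapping: 1.0 for identical subsequences, 0.0 when every
    position differs. Not defined by the source text."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


print(edit_distance([60, 62, 64, 65], [60, 62, 63, 65]))                    # 1
print(multiplicity_from_edit_distance([60, 62, 64, 65], [60, 62, 63, 65]))  # 0.75
```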
In one possible implementation, obtaining the benchmark note subsequence of the multimedia file includes:
randomly selecting one note subsequence from the multiple note subsequences as the benchmark note subsequence of the multimedia file; or,
selecting, from the multiple note subsequences, the note subsequence containing the most notes as the benchmark note subsequence of the multimedia file; or,
selecting, from the multiple note subsequences, the note subsequence containing the fewest notes as the benchmark note subsequence of the multimedia file.
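The three selection options above map directly onto a small helper. The function name `select_benchmark` and the strategy labels are hypothetical; ties between subsequences of equal length are resolved in favor of the first one, which the text does not specify.

```python
import random


def select_benchmark(subsequences, strategy="random", rng=None):
    """Pick the benchmark note subsequence by one of the three listed options."""
    if not subsequences:
        raise ValueError("at least one note subsequence is required")
    if strategy == "random":
        return (rng or random).choice(subsequences)
    if strategy == "most":
        return max(subsequences, key=len)   # subsequence with the most notes
    if strategy == "fewest":
        return min(subsequences, key=len)   # subsequence with the fewest notes
    raise ValueError(f"unknown strategy: {strategy}")


subs = [[60, 62, 64], [65, 67], [60, 62, 64, 65]]
print(select_benchmark(subs, "most"))    # [60, 62, 64, 65]
print(select_benchmark(subs, "fewest"))  # [65, 67]
```

Choosing the shortest subsequence minimizes matching cost, while the longest keeps more melodic context; which trade-off is preferable is left open by the text.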
In one possible implementation, the intersection between two adjacent note subsequences contains a preset number of notes, where the preset number is an integer greater than or equal to 0 and less than a specified value, and the specified value is the quotient of the number of notes in the multimedia file and the number of note subsequences into which it is divided.
In one possible implementation, a note includes a pitch and/or a duration, and the pitch is the absolute pitch of the note or the relative pitch between two adjacent notes.
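The difference between the two pitch representations can be shown with a short conversion. The sketch assumes absolute pitches are stored as integer semitone numbers (for example MIDI note numbers); the text does not fix an encoding, and the helper name `to_relative_pitch` is hypothetical.

```python
def to_relative_pitch(absolute):
    """Convert absolute pitches to the intervals between adjacent notes."""
    return [b - a for a, b in zip(absolute, absolute[1:])]


melody = [60, 62, 64, 62]
transposed = [65, 67, 69, 67]          # same melody hummed in a higher key
print(to_relative_pitch(melody))       # [2, 2, -2]
print(to_relative_pitch(transposed))   # [2, 2, -2]
```

Relative pitch is unchanged by transposition, which is useful when the hummed key differs from the recorded key.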
In the embodiments of the present invention, for a multimedia file whose note sequence has a repetitive structure, the benchmark note subsequence of the multimedia file is obtained; the matching degree between the voice signal and the multimedia file is determined according to the reference note sequence of the voice signal and the benchmark note subsequence of the multimedia file; and, based on the matching degree, a target multimedia file whose matching degree meets a preset condition is obtained from the multimedia file library. Because the benchmark note subsequence contains fewer notes than the multimedia file, determining the matching degree from the reference note sequence and the benchmark note subsequence reduces the calculation time and improves the efficiency of obtaining multimedia files.
Before obtaining a multimedia file, the multimedia server needs to determine whether the note sequence of each multimedia file in the multimedia file library has a repetitive structure. Only if the note sequence of a multimedia file has a repetitive structure is the multimedia file obtained according to the method provided by the embodiments of the present invention. Referring to Fig. 3, the method includes:
Step 301: For any multimedia file in the multimedia file library, the multimedia server divides the note sequence of the multimedia file into multiple note subsequences, each note subsequence including at least one note.
A note sequence includes pitches and/or durations. A pitch may be an absolute pitch or a relative pitch between two adjacent pitches. Correspondingly, each note subsequence includes at least one pitch and/or the duration of each pitch. The number of note subsequences into which the multimedia server divides the note sequence of a multimedia file may be any value not less than 2, and the note subsequences may contain the same or different numbers of notes. In addition, the intersection between two adjacent note subsequences may contain a preset number of notes, where the preset number is an integer greater than or equal to 0 and less than a specified value. The specified value is the quotient of the number of notes in the multimedia file and the number of note subsequences into which it is divided. The union of the note subsequences equals the note sequence of the multimedia file.
It should be noted that, to improve accuracy, the intersection between two adjacent note subsequences is not empty; that is, two adjacent note subsequences share several identical notes. For example, if the note sequence of a multimedia file includes N notes, the note sequence is $M = [m_1\ m_2\ m_3\ \dots\ m_{N-1}\ m_N]$. The multimedia server divides the note sequence into two note subsequences, $X_1$ and $X_2$, where $X_1 = [m_1\ m_2\ m_3\ \dots\ m_{N/2+K-1}\ m_{N/2+K}]$ and $X_2 = [m_{N/2-K}\ m_{N/2-K+1}\ m_{N/2-K+2}\ \dots\ m_{N-1}\ m_N]$. The two note subsequences share the notes around the midpoint of the sequence; K determines the size of this overlap, and its value range is [0, N/2).
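A division with a fixed overlap can be sketched as follows. This is one possible scheme, not the exact indexing of the example above: here only the later subsequence of each adjacent pair is extended backwards, so that adjacent subsequences share exactly `overlap` notes. The function and parameter names are hypothetical.

```python
def split_with_overlap(notes, parts, overlap):
    """Split notes into `parts` subsequences whose union is the whole sequence
    and whose adjacent pairs share exactly `overlap` notes."""
    n = len(notes)
    if parts < 2:
        raise ValueError("parts must be at least 2")
    # The preset number must be >= 0 and smaller than (notes / subsequences).
    if not 0 <= overlap < n / parts:
        raise ValueError("overlap must satisfy 0 <= overlap < len(notes) / parts")
    bounds = [i * n // parts for i in range(parts + 1)]
    subsequences = []
    for i in range(parts):
        start = bounds[i] - overlap if i > 0 else 0
        subsequences.append(list(notes[start:bounds[i + 1]]))
    return subsequences


print(split_with_overlap(list(range(10)), parts=2, overlap=2))
# [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7, 8, 9]]
```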
Step 302: The multimedia server determines the multiplicity (the degree of repetition) between the note subsequences based on a preset multiplicity algorithm.
The preset multiplicity algorithm may be a similarity matrix algorithm, a cross-correlation algorithm, an edit distance algorithm, an EMD (earth mover's distance) algorithm, or the like. When the preset multiplicity algorithm is the similarity matrix algorithm, this step can be implemented in the first way below. When it is the cross-correlation algorithm, this step can be implemented in the second way below. When it is the edit distance algorithm, this step can be implemented in the third way below. When it is the EMD algorithm, this step can be implemented in the fourth way below.
In the first implementation, this step can be carried out through the following steps (1) to (3):
(1): The multimedia server determines at least one similarity matrix between the note subsequences based on the similarity matrix algorithm.
The multimedia server determines at least one group of note subsequences, each group including two note subsequences, and calculates the similarity matrix between the two note subsequences of each group using the similarity matrix algorithm, obtaining at least one similarity matrix.
For one group of note subsequences, the multimedia server calculates the similarity matrix between the two note subsequences of the group using the following Formula 1.
Formula 1:
Here, $X_m$ and $X_n$ are the two note subsequences in the group, $x_{mi}$ is the i-th note of note subsequence $X_m$, $x_{nj}$ is the j-th note of note subsequence $X_n$, and $c_{mn}[i][j]$ is the similarity matrix between note subsequences $X_m$ and $X_n$.
When determining the at least one group of note subsequences, the multimedia server may take two adjacent note subsequences as one group, or take any two note subsequences as one group.
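The two grouping options can be sketched with the standard library; the helper names `adjacent_pairs` and `all_pairs` are hypothetical.

```python
from itertools import combinations


def adjacent_pairs(subsequences):
    """Group each note subsequence with its immediate successor."""
    return list(zip(subsequences, subsequences[1:]))


def all_pairs(subsequences):
    """Group every unordered pair of note subsequences."""
    return list(combinations(subsequences, 2))


subs = ["X1", "X2", "X3"]
print(adjacent_pairs(subs))  # [('X1', 'X2'), ('X2', 'X3')]
print(all_pairs(subs))       # [('X1', 'X2'), ('X1', 'X3'), ('X2', 'X3')]
```

With S subsequences, adjacent grouping yields S - 1 similarity matrices, while grouping any two yields S(S - 1)/2, so the second option costs more but can detect repetitions between distant sections.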
For example, the multimedia server divides a multimedia file into two note subsequences, X1 and X2, each of which contains only pitches, where note subsequence X1 = [52 53 54 55 56 57 58] and note subsequence X2 = [50 51 52 53 54 55 50 57 58].
Correspondingly, the similarity matrix algorithm is
Based on this similarity matrix algorithm, the similarity matrix between note subsequences X1 and X2 is obtained as:
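The formula and the resulting matrix are not reproduced in this text, so the following is a sketch under an explicit assumption rather than the document's own algorithm: an entry is 0 when two notes differ, and otherwise one more than the entry diagonally above and to the left. Under that assumed rule, a run of consecutive non-zero values along a diagonal counts the length of a repeated fragment. The rule yields the runs [1 2 3 4] and [1 2] for X1 and X2, but it is not guaranteed to reproduce every entry of the original matrix.

```python
def similarity_matrix(xm, xn):
    """Assumed rule: c[i][j] = c[i-1][j-1] + 1 if xm[i] == xn[j], else 0."""
    rows, cols = len(xm), len(xn)
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if xm[i] == xn[j]:
                previous = c[i - 1][j - 1] if i > 0 and j > 0 else 0
                c[i][j] = previous + 1
    return c


x1 = [52, 53, 54, 55, 56, 57, 58]
x2 = [50, 51, 52, 53, 54, 55, 50, 57, 58]
c = similarity_matrix(x1, x2)

# Diagonal on which x1[i] is aligned with x2[i + 2].
print([c[i][i + 2] for i in range(len(x1))])  # [1, 2, 3, 4, 0, 1, 2]
```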
(2): The multimedia server determines the characteristic value of each similarity matrix.
In this step, for each similarity matrix, the multimedia server may use the length of the longest repeated note fragment as the characteristic value of the similarity matrix. The multimedia server may instead use the sum of the lengths of all repeated note fragments as the characteristic value. The multimedia server may also compute, for each diagonal, the sum of the lengths of the repeated note fragments on that diagonal, and use the maximum of these sums as the characteristic value.
It should be noted that the length of a repeated note fragment is the number of consecutive non-zero values on a diagonal of the similarity matrix.
For example, when the multimedia server takes the length of the longest repeated note fragment as the
characteristic value of the similarity matrix, the repeated note fragments in the above similarity
matrix are [1 2 3 4], [1 2] and [1 1], with lengths 4, 2 and 2 respectively, so the multimedia server
determines that the characteristic value of the similarity matrix is 4.
For another example, when the multimedia server takes the sum of the lengths of the repeated note
fragments as the characteristic value, the sum for the above similarity matrix is 4 + 2 + 2 = 8, so
the multimedia server determines that the characteristic value of the similarity matrix is 8.
For another example, when the multimedia server takes the largest per-diagonal sum of repeated
fragment lengths as the characteristic value, the sums on the two diagonals above are 4 + 2 = 6 and 2
respectively, so the multimedia server determines that the characteristic value is 6.
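The three characteristic values of step (2) can be sketched in Python as follows. Because Formula one is not reproduced in this text, the recurrence used here (a cell extends the run on its diagonal when the two notes are equal, and is 0 otherwise) is an assumption inferred from the fragment [1 2 3 4]; all function names are hypothetical.

```python
def similarity_matrix(x, y):
    """Assumed recurrence: c[i][j] = c[i-1][j-1] + 1 if x[i] == y[j], else 0."""
    c = [[0] * len(y) for _ in x]
    for i, a in enumerate(x):
        for j, b in enumerate(y):
            if a == b:
                c[i][j] = (c[i - 1][j - 1] if i and j else 0) + 1
    return c


def diagonal_runs(c):
    """Map each diagonal offset (j - i) to the lengths of its non-zero runs."""
    if not c or not c[0]:
        return {}
    rows, cols = len(c), len(c[0])
    runs = {}
    for offset in range(-(rows - 1), cols):
        i = max(0, -offset)
        j = i + offset
        run, found = 0, []
        while i < rows and j < cols:
            if c[i][j]:
                run += 1
            elif run:
                found.append(run)
                run = 0
            i += 1
            j += 1
        if run:
            found.append(run)
        if found:
            runs[offset] = found
    return runs


def characteristic_values(c):
    """Return the three candidate characteristic values described in step (2)."""
    runs = diagonal_runs(c)
    lengths = [n for found in runs.values() for n in found]
    return {
        "longest": max(lengths, default=0),
        "total": sum(lengths),
        "best_diagonal": max((sum(f) for f in runs.values()), default=0),
    }


X1 = [52, 53, 54, 55, 56, 57, 58]
X2 = [50, 51, 52, 53, 54, 55, 50, 57, 58]
values = characteristic_values(similarity_matrix(X1, X2))
```

On the two sequences as printed above, this sketch finds runs of length 4 and 2 on a single diagonal; it does not attempt to reproduce the matrix from the original figure.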
(3): The multimedia server determines the multiplicity between the note subsequences according to the
characteristic value of each similarity matrix.
The multimedia server selects the minimum characteristic value from the characteristic values of the
similarity matrices and takes that minimum characteristic value as the multiplicity between the note
subsequences.
It should be noted that the multimedia server may instead select the maximum characteristic value
from the characteristic values of the similarity matrices and take it as the multiplicity between the
note subsequences. Alternatively, the multimedia server performs a weighting operation on the
characteristic values of the similarity matrices to obtain the multiplicity between the note
subsequences.
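The three ways of reducing several per-group values to one multiplicity (minimum, maximum, weighted combination) can be sketched as below. The function name, the mode strings and the default of equal weights are assumptions made for illustration; the text does not specify the weights.

```python
def aggregate(values, mode="min", weights=None):
    """Reduce per-group values to a single multiplicity.

    mode is "min", "max" or "weighted"; equal weights are assumed
    when mode is "weighted" and no weights are supplied.
    """
    if not values:
        raise ValueError("at least one value is required")
    if mode == "min":
        return min(values)
    if mode == "max":
        return max(values)
    if mode == "weighted":
        if weights is None:
            weights = [1 / len(values)] * len(values)
        if len(weights) != len(values):
            raise ValueError("one weight per value is required")
        return sum(w * v for w, v in zip(weights, values))
    raise ValueError(f"unknown mode: {mode!r}")
```

The same reduction applies unchanged when the per-group values are cross-correlation degrees, edit distances or EMD distances.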
For the second implementation, this step may be:
Based on a cross-correlation algorithm, the multimedia server determines at least one cross-correlation
degree between the note subsequences, and determines the multiplicity between the note subsequences
according to each cross-correlation degree.
The multimedia server determines at least one group of note subsequences, each group including two
note subsequences, and calculates the cross-correlation degree of each group by the cross-correlation
algorithm to obtain at least one cross-correlation degree. It then selects the minimum
cross-correlation degree and determines it as the multiplicity between the note subsequences.
It should be noted that the multimedia server may instead select the maximum cross-correlation degree
and determine it as the multiplicity between the note subsequences. Alternatively, the multimedia
server performs a weighting operation on the cross-correlation degrees to obtain the multiplicity
between the note subsequences.
For one group of note subsequences, the multimedia server calculates the cross-correlation degree
between the two note subsequences of the group by the following Formula two.
Formula two: (the formula image is not reproduced in this text)
Here, X_m and X_n are the two note subsequences included in the group, x_m(j) is the j-th note in note
subsequence X_m, y_n(j-i) is the (j-i)-th note in note subsequence X_n, and c_mn(i, j) is the
cross-correlation degree between note subsequences X_m and X_n.
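Since Formula two is not reproduced in this text, the following is only a sketch that assumes the standard discrete cross-correlation, i.e. the sum over j of x_m(j) * y_n(j - i) for a lag i, which is consistent with the symbols defined above. The function names are hypothetical, and how the per-lag values are reduced to a single degree is not stated here.

```python
def cross_correlation(x, y, lag):
    """Sum of x[j] * y[j - lag] over all j for which both indices are valid."""
    total = 0
    for j in range(len(x)):
        k = j - lag
        if 0 <= k < len(y):
            total += x[j] * y[k]
    return total


def correlation_by_lag(x, y):
    """Cross-correlation for every lag at which the sequences overlap."""
    return {
        lag: cross_correlation(x, y, lag)
        for lag in range(-(len(y) - 1), len(x))
    }
```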
Likewise, when determining the at least one group of note subsequences, the multimedia server may take
two adjacent note subsequences as one group, or may take any two note subsequences as one group.
For the third implementation, this step may be:
Based on an edit distance algorithm, the multimedia server determines at least one edit distance
between the note subsequences, and determines the multiplicity between the note subsequences
according to each edit distance.
The multimedia server determines at least one group of note subsequences, each group including two
note subsequences, and calculates the edit distance of each group by the edit distance algorithm to
obtain at least one edit distance. It then selects the minimum edit distance and determines it as the
multiplicity between the note subsequences.
It should be noted that the multimedia server may instead select the maximum edit distance and
determine it as the multiplicity between the note subsequences. Alternatively, the multimedia server
performs a weighting operation on the edit distances to obtain the multiplicity between the note
subsequences.
For one group of note subsequences, the multimedia server calculates the edit distance between the two
note subsequences of the group by the following Formula three.
Formula three: (the formula image is not reproduced in this text)
Here, X_m and X_n are the two note subsequences included in the group, c_mn[i][j] is the edit distance
between note subsequences X_m and X_n, i is the number of notes in note subsequence X_m, and j is the
number of notes in note subsequence X_n. a, b and c are weighting coefficients. They may be set and
changed as needed and are not specifically limited in the embodiments of the present invention, and
the magnitude relationship among them may also be set arbitrarily. To improve accuracy, a > b and
c > b are generally chosen; the magnitude relationship between a and c is not limited.
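A weighted edit distance of this kind can be sketched with the usual dynamic-programming table. Formula three is not reproduced in this text, so the roles given to the coefficients here are assumptions: a is taken as the deletion weight, b as the substitution weight and c as the insertion weight, with defaults chosen so that a > b and c > b.

```python
def weighted_edit_distance(x, y, a=2, b=1, c=2):
    """Edit distance between note lists x and y.

    Assumed roles: a = cost of deleting a note from x,
    b = cost of substituting a note, c = cost of inserting a note into x.
    """
    rows, cols = len(x) + 1, len(y) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + a          # delete every note of x[:i]
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + c          # insert every note of y[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = x[i - 1] == y[j - 1]
            d[i][j] = min(
                d[i - 1][j] + a,                      # deletion
                d[i][j - 1] + c,                      # insertion
                d[i - 1][j - 1] + (0 if same else b)  # match or substitution
            )
    return d[-1][-1]
```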
Likewise, when determining the at least one group of note subsequences, the multimedia server may take
two adjacent note subsequences as one group, or may take any two note subsequences as one group.
For the fourth implementation, this step may be:
Based on an EMD distance algorithm, the multimedia server determines at least one EMD distance between
the note subsequences, and determines the multiplicity between the note subsequences according to
each EMD distance.
The multimedia server determines at least one group of note subsequences, each group including two
note subsequences, and calculates the EMD distance of each group by the EMD distance algorithm to
obtain at least one EMD distance. It then selects the minimum EMD distance and determines it as the
multiplicity between the note subsequences.
It should be noted that the multimedia server may instead select the maximum EMD distance and
determine it as the multiplicity between the note subsequences. Alternatively, the multimedia server
performs a weighting operation on the EMD distances to obtain the multiplicity between the note
subsequences.
Likewise, when determining the at least one group of note subsequences, the multimedia server may take
two adjacent note subsequences as one group, or may take any two note subsequences as one group.
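The text gives no formula for the EMD distance. As a rough sketch only, the example below assumes that each note subsequence is treated as a histogram of pitches with total mass 1 and that the ground distance is the absolute pitch difference; under those assumptions the one-dimensional earth mover's distance equals the accumulated difference between the two cumulative distributions. The function name is hypothetical.

```python
from collections import Counter


def emd_1d(x, y):
    """1-D earth mover's distance between the pitch histograms of x and y."""
    if not x or not y:
        raise ValueError("both subsequences must contain at least one note")
    px, py = Counter(x), Counter(y)
    support = sorted(set(px) | set(py))
    carried = 0.0   # mass that still has to be moved to the right
    total = 0.0
    for left, right in zip(support, support[1:]):
        carried += px[left] / len(x) - py[left] / len(y)
        total += abs(carried) * (right - left)
    return total
```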
It should be noted that the default multiplicity algorithm may also be a longest common subsequence
algorithm, Dynamic Time Scaling, Earth Mover's Distance, or the like. Moreover, when determining the
multiplicity between the note subsequences, the multimedia server may combine one or more of the
above four implementations. When several of the four implementations are combined, the multiplicities
obtained from the individual implementations are weighted to obtain the multiplicity between the note
subsequences.
For example, when the multimedia server combines the first and second implementations, it determines
the similarity matrix between the note subsequences based on the similarity matrix algorithm and
determines the characteristic value of that similarity matrix; it also determines the
cross-correlation degree between the note subsequences based on the cross-correlation algorithm; it
then weights the characteristic value of the similarity matrix and the cross-correlation degree to
obtain the multiplicity between the note subsequences.
Step 303: The multimedia server determines whether the multiplicity between the note subsequences is
greater than a default multiplicity. If it is, the multimedia server determines that the note
sequence of the multimedia file has a repetitive structure.
If the multiplicity is not greater than the default multiplicity, the multimedia server determines
that the note sequence of the multimedia file does not have a repetitive structure. The default
multiplicity may be set and changed as needed and is not specifically limited in the embodiments of
the present invention; for example, the default multiplicity may be 8 or 5.
Step 304: The multimedia server selects one note subsequence from the multiple note subsequences of
the multimedia file as the benchmark note subsequence of the multimedia file.
In this step, to make the determination efficient, the multimedia server may randomly select one note
subsequence from the multiple note subsequences of the multimedia file as its benchmark note
subsequence.
To improve the accuracy of subsequently obtaining a multimedia file, the multimedia server may select
the note subsequence that includes the most notes as the benchmark note subsequence of the multimedia
file.
To improve the efficiency of subsequently obtaining a multimedia file, the multimedia server may
select the note subsequence that includes the fewest notes as the benchmark note subsequence of the
multimedia file.
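The three selection strategies of step 304 can be sketched as follows. The function name and the strategy strings are hypothetical; ties between subsequences of equal length are resolved by position, which the text does not specify.

```python
import random


def select_benchmark(subsequences, strategy="random", rng=random):
    """Pick the benchmark note subsequence of one multimedia file."""
    if not subsequences:
        raise ValueError("at least one note subsequence is required")
    if strategy == "random":        # quickest to determine
        return rng.choice(subsequences)
    if strategy == "most_notes":    # favours retrieval accuracy
        return max(subsequences, key=len)
    if strategy == "fewest_notes":  # favours retrieval speed
        return min(subsequences, key=len)
    raise ValueError(f"unknown strategy: {strategy!r}")
```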
Step 305: The multimedia server binds the correspondence between the identifier of the multimedia
file and the benchmark note subsequence of the multimedia file.
The multimedia server binds this correspondence so that, when it later searches for a multimedia
file, it can obtain the benchmark note subsequence of each multimedia file from the correspondence
between multimedia file identifiers and benchmark note subsequences and perform retrieval based on
the benchmark note subsequence.
It should be noted that the multimedia server processes each multimedia file in the multimedia file
library through steps 301-304 above to determine the benchmark note subsequence of each multimedia
file. For a multimedia file whose note sequence does not have a repetitive structure, the multimedia
server binds the correspondence between the identifier of the multimedia file and the note sequence
of the multimedia file.
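The binding performed in step 305 amounts to building a lookup table. A minimal sketch is given below; the library layout (identifier mapped to a note sequence and its subsequences) and the two callables passed in are assumptions introduced for illustration.

```python
def build_index(library, has_repetitive_structure, choose_benchmark):
    """Bind each file identifier to the sequence that will be matched later.

    library maps identifier -> (note_sequence, note_subsequences).
    Files with a repetitive structure are bound to a benchmark note
    subsequence; all other files are bound to their full note sequence.
    """
    index = {}
    for file_id, (sequence, subsequences) in library.items():
        if has_repetitive_structure(subsequences):
            index[file_id] = choose_benchmark(subsequences)
        else:
            index[file_id] = sequence
    return index
```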
In the embodiments of the present invention, before obtaining a multimedia file, the multimedia server
determines whether the note sequence of each multimedia file has a repetitive structure. If it does,
the multimedia server binds the correspondence between the identifier of the multimedia file and its
benchmark note subsequence, so that when it later searches for a multimedia file it can obtain the
benchmark note subsequence from that correspondence and perform retrieval based on it, thereby
improving the efficiency of subsequently obtaining a multimedia file.
An embodiment of the present invention provides a method for obtaining a multimedia file. The method is
applied between a terminal and a multimedia server. Referring to Fig. 4, the method includes:
Step 401: The terminal obtains a collected voice signal and sends to the multimedia server an
acquisition request carrying the voice signal.
The current interface of the terminal includes a recognition button for identifying a song by
listening to it. When the user wants to search for a multimedia file, the user may click the
recognition button. When the terminal detects that the recognition button is triggered, the terminal
collects a voice signal input by the user or played by another device, and sends to the multimedia
server an acquisition request carrying the voice signal.
Step 402: The multimedia server receives the acquisition request sent by the terminal and extracts
the reference note sequence of the voice signal.
The reference note sequence includes multiple notes. A note may include only a pitch, only a
duration, or both a pitch and a duration. The pitch may be the absolute pitch of the note, or the
relative pitch between two adjacent notes.
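The relation between the two pitch representations can be illustrated with a short sketch. It assumes that pitches are integers (as in the X1 and X2 example earlier) and that the relative pitch is the difference between each note and the one before it; the function name is hypothetical.

```python
def to_relative_pitch(pitches):
    """Convert absolute pitches to the intervals between adjacent notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]
```

With this representation, the same melody sung in a different key produces the same relative sequence, since a constant shift cancels in every difference.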
Step 403: For a multimedia file with a repetitive structure in the multimedia file library, the
multimedia server obtains the benchmark note subsequence of the multimedia file.
A multimedia identifier library is stored in the multimedia server; it includes the identifiers of
the multimedia files whose note sequences have a repetitive structure. The multimedia server obtains
from the multimedia identifier library the identifier of a multimedia file whose note sequence has a
repetitive structure and, according to that identifier, obtains the benchmark note subsequence of the
multimedia file from the correspondence between multimedia file identifiers and benchmark note
subsequences.
For a multimedia file in the multimedia file library whose note sequence does not have a repetitive
structure, the multimedia server obtains the note sequence of the multimedia file, according to its
identifier, from the correspondence between multimedia file identifiers and note sequences. It should
be noted that a benchmark note subsequence includes at least one note, and that the number of notes in
the benchmark note subsequence of a multimedia file is less than the number of notes in the note
sequence of that multimedia file. For example, if the note sequence of a multimedia file includes 8
notes, the benchmark note subsequence of that multimedia file may include only 4 or 5 notes.
Step 404: The multimedia server determines the matching degree between the voice signal and the
multimedia file according to the reference note sequence and the benchmark note subsequence of the
multimedia file.
The multimedia server may calculate the matching degree between the reference note sequence and the
benchmark note subsequence of the multimedia file by any existing algorithm for calculating the
matching degree between note sequences. For example, the number of identical notes included in two
note sequences may be used as the matching degree between them. In that case this step may be:
The multimedia server determines the number of identical notes included in the reference note
sequence and the benchmark note subsequence of the multimedia file, and determines that number as the
matching degree between the reference note sequence and the multimedia file.
For a multimedia file in the multimedia file library whose note sequence does not have a repetitive
structure, the multimedia server determines the matching degree between the voice signal and the
multimedia file according to the reference note sequence and the note sequence of the multimedia
file.
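The example matching degree in step 404, the number of identical notes shared by two note sequences, can be sketched as below. Counting repeated notes as a multiset intersection is an assumption; the text does not say how duplicates are handled. The candidate argument is the benchmark note subsequence for a file with a repetitive structure and the full note sequence otherwise.

```python
from collections import Counter


def matching_degree(reference, candidate):
    """Number of notes the two sequences have in common (with multiplicity)."""
    common = Counter(reference) & Counter(candidate)
    return sum(common.values())
```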
Step 405: According to the matching degree between the voice signal and each multimedia file in the
multimedia file library, the multimedia server selects from the multimedia file library the
destination multimedia file whose matching degree meets a preset condition.
The preset condition may be that the matching degree is the largest, or that the matching degree is
greater than a preset matching degree. The preset matching degree may be set and changed as needed
and is not specifically limited in the embodiments of the present invention; for example, the preset
matching degree may be 10 or 20.
For example, when the preset condition is that the matching degree is the largest, this step may be:
According to the matching degree between the voice signal and each multimedia file in the multimedia
file library, the multimedia server selects from the multimedia file library a preset number of
destination multimedia files with the largest matching degrees.
The preset number may be set and changed as needed and is not specifically limited in the embodiments
of the present invention; for example, the preset number may be 3 or 5.
For another example, when the preset condition is that the matching degree is greater than the preset
matching degree, this step may be:
According to the matching degree between the voice signal and each multimedia file in the multimedia
file library, the multimedia server selects from the multimedia file library the destination
multimedia files whose matching degree is greater than the preset matching degree.
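The two preset conditions of step 405 can be sketched as follows, assuming the matching degrees have already been collected in a dictionary keyed by file identifier. The function names are hypothetical; the default values 3 and 10 are taken from the examples in the text.

```python
def select_top(scores, preset_number=3):
    """Identifiers of the preset_number files with the largest matching degree."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [file_id for file_id, _ in ranked[:preset_number]]


def select_above(scores, preset_matching_degree=10):
    """Identifiers of all files whose matching degree exceeds the threshold."""
    return [
        file_id
        for file_id, degree in scores.items()
        if degree > preset_matching_degree
    ]
```

An empty result from either function corresponds to the case in which no destination multimedia file matches the voice signal.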
Step 406: The multimedia server sends the destination multimedia file to the terminal.
The acquisition request sent by the terminal to the multimedia server carries the terminal identifier
of the terminal. The multimedia server obtains the terminal identifier from the acquisition request
and, according to the terminal identifier, sends the destination multimedia file to the terminal.
In one possible implementation, to reduce the network resource consumption of the terminal, the
multimedia server may refrain from sending the destination multimedia file and send only the
identifier of the destination multimedia file to the terminal; the destination multimedia file itself
is sent only when a download request or playing request is received from the terminal.
The terminal identifier and the identifier of the destination multimedia file may be set and changed
as needed and are not specifically limited in the embodiments of the present invention. For example,
the terminal identifier may be the phone number of the terminal or the user identifier logged in to
the application, and the identifier of the destination multimedia file may be the title or number of
the destination multimedia file.
It should be noted that if no destination multimedia file matching the voice signal exists in the
multimedia file library, the multimedia server sends a failure indication to the terminal, the
failure indication indicating that recognition failed. The terminal receives the failure indication
sent by the multimedia server and displays it. After receiving the failure indication, the terminal
may also re-collect a voice signal and send another acquisition request, carrying the re-collected
voice signal, to the multimedia server. The multimedia server receives that acquisition request and,
through the above steps, obtains the destination multimedia file matching the re-collected voice
signal.
Step 407: The terminal receives the destination multimedia file sent by the multimedia server.
The terminal receives the destination multimedia file sent by the multimedia server, stores it, and
displays its identifier. The user may click the destination multimedia file to trigger the terminal
to play it. When the terminal detects that the destination multimedia file is triggered, it obtains
the stored destination multimedia file and plays it.
It should be noted that if the multimedia server sends only the identifier of the destination
multimedia file to the terminal in step 406, this step may be:
The terminal receives the identifier of the destination multimedia file sent by the multimedia server
and displays it. The user may click the identifier to trigger the terminal to play the destination
multimedia file. When the terminal detects that the destination multimedia file is triggered, it
sends to the multimedia server a playing request carrying the identifier of the destination
multimedia file.
The multimedia server receives the playing request sent by the terminal, obtains the destination
multimedia file according to its identifier, and sends the destination multimedia file to the
terminal. The terminal receives the destination multimedia file sent by the multimedia server and
plays it.
In the embodiments of the present invention, for a multimedia file whose note sequence has a
repetitive structure, the benchmark note subsequence of the multimedia file is obtained; the matching
degree between the voice signal and the multimedia file is determined according to the reference note
sequence of the voice signal and the benchmark note subsequence of the multimedia file; and, based on
the matching degree, the destination multimedia file whose matching degree meets the preset condition
is obtained from the multimedia file library. Because the benchmark note subsequence of a multimedia
file includes fewer notes than the multimedia file itself, determining the matching degree from the
reference note sequence and the benchmark note subsequence reduces the calculation time and improves
the efficiency of obtaining a multimedia file.
In addition, the method for obtaining a multimedia file provided in the embodiments of the present
invention may also be applied in a terminal. In that case, the multimedia file library includes the
multimedia files that the terminal has downloaded locally, and the executing body of steps 301-305
above is the terminal. After collecting the voice signal, the terminal does not need to send an
acquisition request to the multimedia server; it directly extracts the reference note sequence of the
voice signal. For a multimedia file with a repetitive structure in the multimedia file library, the
terminal obtains the benchmark note subsequence of the multimedia file, determines the matching
degree between the voice signal and the multimedia file according to the reference note sequence and
the benchmark note subsequence, selects from the multimedia file library the destination multimedia
file whose matching degree meets the preset condition according to the matching degree between the
voice signal and each multimedia file, and displays the obtained destination multimedia file.
An embodiment of the present invention provides a device for obtaining a multimedia file. Referring to
Fig. 5, the device includes:
Extraction module 501, configured to extract the reference note sequence of a collected voice signal,
the reference note sequence including multiple notes;
First acquisition module 502, configured to, for any multimedia file in the multimedia file library,
obtain the benchmark note subsequence of the multimedia file when the note sequence of the multimedia
file has a repetitive structure, the benchmark note subsequence including at least one note, and the
number of notes included in the benchmark note subsequence being less than the number of notes
included in the multimedia file;
Determining module 503, configured to determine the matching degree between the voice signal and the
multimedia file according to the reference note sequence and the benchmark note subsequence of the
multimedia file;
Second acquisition module 504, configured to obtain, from the multimedia file library, the destination
multimedia file whose matching degree meets a preset condition according to the matching degree
between the voice signal and the multimedia file.
In one possible implementation, the device further includes:
A division module, configured to divide the note sequence of the multimedia file into multiple note
subsequences, each note subsequence including at least one note;
The determining module 503 is further configured to determine the multiplicity between the note
subsequences based on a default multiplicity algorithm;
The determining module 503 is further configured to determine that the note sequence of the
multimedia file has a repetitive structure if the multiplicity between the note subsequences is
greater than the default multiplicity.
In one possible implementation, the determining module 503 is further configured to determine, based
on a similarity matrix algorithm, at least one similarity matrix between the note subsequences,
determine the characteristic value of each similarity matrix according to that similarity matrix, and
determine the multiplicity between the note subsequences according to the characteristic value of
each similarity matrix; alternatively,
The determining module 503 is further configured to determine, based on a cross-correlation
algorithm, at least one cross-correlation degree between the note subsequences, and determine the
multiplicity between the note subsequences according to each cross-correlation degree; alternatively,
The determining module 503 is further configured to determine, based on an edit distance algorithm,
at least one edit distance between the note subsequences, and determine the multiplicity between the
note subsequences according to each edit distance; alternatively,
The determining module 503 is further configured to determine, based on an EMD distance algorithm, at
least one EMD distance between the note subsequences, and determine the multiplicity between the note
subsequences according to each EMD distance.
In one possible implementation, the first acquisition module 502 is further configured to randomly
select one note subsequence from the multiple note subsequences as the benchmark note subsequence of
the multimedia file; alternatively,
The first acquisition module 502 is further configured to select, from the multiple note
subsequences, the note subsequence that includes the most notes as the benchmark note subsequence of
the multimedia file; alternatively,
The first acquisition module 502 is further configured to select, from the multiple note
subsequences, the note subsequence that includes the fewest notes as the benchmark note subsequence
of the multimedia file.
In one possible implementation, the intersection between two adjacent note subsequences includes a
preset number of notes. The preset number is an integer greater than or equal to 0 and less than a
specified value, the specified value being the quotient of the number of notes included in the
multimedia file and the number of note subsequences obtained by the division.
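The constraint on the overlap can be illustrated with one possible division scheme. The scheme below (equal base segments, each extended by the overlap into the next one) is an assumption chosen for illustration; this passage fixes only the bound on the preset number, not how the division itself is carried out. The function and parameter names are hypothetical.

```python
def split_with_overlap(notes, parts, overlap):
    """Divide notes into `parts` subsequences whose neighbours share `overlap` notes."""
    if parts < 1 or len(notes) < parts:
        raise ValueError("need at least one note per subsequence")
    # The preset number must be >= 0 and less than (note count / subsequence count).
    if not 0 <= overlap < len(notes) / parts:
        raise ValueError("overlap must satisfy 0 <= overlap < len(notes) / parts")
    step = len(notes) // parts
    result = []
    for p in range(parts):
        start = p * step
        # Every subsequence except the last reaches `overlap` notes into the next.
        end = len(notes) if p == parts - 1 else start + step + overlap
        result.append(notes[start:end])
    return result
```

With 8 notes, 2 subsequences and an overlap of 1, the specified value is 8 / 2 = 4, so overlaps of 0 to 3 are accepted and an overlap of 4 is rejected.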
In one possible implementation, a note includes a pitch and/or a duration, the pitch being the
absolute pitch of the note or the relative pitch between two adjacent notes.
In the embodiments of the present invention, for a multimedia file whose note sequence has a
repetitive structure, the benchmark note subsequence of the multimedia file is obtained; the matching
degree between the voice signal and the multimedia file is determined according to the reference note
sequence of the voice signal and the benchmark note subsequence of the multimedia file; and, based on
the matching degree, the destination multimedia file whose matching degree meets the preset condition
is obtained from the multimedia file library. Because the benchmark note subsequence of a multimedia
file includes fewer notes than the multimedia file itself, determining the matching degree from the
reference note sequence and the benchmark note subsequence reduces the calculation time and improves
the efficiency of obtaining a multimedia file.
It should be noted that, when the device for obtaining a multimedia file provided in the above
embodiment obtains a multimedia file, the division into the above functional modules is only an
example. In practical applications, the above functions may be assigned to different functional
modules as needed; that is, the internal structure of the device may be divided into different
functional modules to complete all or part of the functions described above. In addition, the device
for obtaining a multimedia file provided in the above embodiment and the method embodiment for
obtaining a multimedia file belong to the same concept; for the specific implementation process,
refer to the method embodiment, which is not repeated here.
Fig. 6 is a block diagram of a multimedia server provided in an embodiment of the present invention.
Referring to Fig. 6, multimedia server 600 includes a processing component 622, which further includes
one or more processors, and memory resources represented by memory 632 for storing instructions
executable by processing component 622, such as application programs. An application program stored
in memory 632 may include one or more modules, each corresponding to a set of instructions. In
addition, processing component 622 is configured to execute the instructions so as to perform the
above method for obtaining a multimedia file.
Multimedia server 600 may also include a power supply component 626 configured to perform power
management of multimedia server 600, a wired or wireless network interface 650 configured to connect
multimedia server 600 to a network, and an input/output (I/O) interface 658. Multimedia server 600
may operate based on an operating system stored in memory 632, such as Windows Server™, Mac OS X™,
Unix™, Linux™, FreeBSD™ or the like.
In an exemplary embodiment, a computer readable storage medium including instructions is also
provided, for example a memory including instructions, where the instructions can be executed by a
processor in a terminal to complete the method for obtaining a multimedia file in the above
embodiments. For example, the computer readable storage medium may be a ROM, a random access memory
(RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those of ordinary skill in the art will understand that all or part of the steps of the above
embodiments may be completed by hardware, or by a program instructing the relevant hardware. The
program may be stored in a computer readable storage medium, and the storage medium mentioned above
may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely the preferred embodiments of the present invention and is not intended to
limit the present invention. Any modification, equivalent replacement or improvement made within the
spirit and principle of the present invention shall fall within the protection scope of the present
invention.
Claims (10)
1. A method for obtaining a multimedia file, characterized in that the method includes:
Extracting the reference note sequence of a collected voice signal, the reference note sequence
including multiple notes;
For any multimedia file in a multimedia file library, dividing the note sequence of the multimedia
file into multiple note subsequences, each note subsequence including at least one note, the
intersection between two adjacent note subsequences including a preset number of notes, the preset
number being an integer greater than or equal to 0 and less than a specified value, and the specified
value being the quotient of the number of notes included in the multimedia file and the number of
note subsequences obtained by the division;
Determining the multiplicity between the note subsequences based on a default multiplicity algorithm;
If the multiplicity between the note subsequences is greater than a default multiplicity, determining
that the note sequence of the multimedia file has a repetitive structure;
When the note sequence of the multimedia file has a repetitive structure, obtaining the benchmark
note subsequence of the multimedia file, the benchmark note subsequence including at least one note,
and the number of notes included in the benchmark note subsequence being less than the number of
notes included in the multimedia file;
Determining the matching degree between the voice signal and the multimedia file according to the
reference note sequence and the benchmark note subsequence of the multimedia file;
Obtaining, from the multimedia file library, the destination multimedia file whose matching degree
meets a preset condition according to the matching degree between the voice signal and the multimedia
file.
2. The method according to claim 1, characterized in that determining the multiplicity between the note subsequences based on a preset multiplicity algorithm comprises:
determining at least one similarity matrix between the note subsequences based on a similarity matrix algorithm, determining a characteristic value of each similarity matrix according to that similarity matrix, and determining the multiplicity between the note subsequences according to the characteristic value of each similarity matrix; or
determining at least one cross-correlation degree between the note subsequences based on a cross-correlation algorithm, and determining the multiplicity between the note subsequences according to each cross-correlation degree; or
determining at least one edit distance between the note subsequences based on an edit distance algorithm, and determining the multiplicity between the note subsequences according to each edit distance; or
determining at least one EMD distance between the note subsequences based on an EMD distance algorithm, and determining the multiplicity between the note subsequences according to each EMD distance.
3. The method according to claim 1, characterized in that obtaining the benchmark note subsequence of the multimedia file comprises:
randomly selecting one note subsequence from the plurality of note subsequences as the benchmark note subsequence of the multimedia file; or
selecting, from the plurality of note subsequences, the note subsequence comprising the largest number of notes as the benchmark note subsequence of the multimedia file; or
selecting, from the plurality of note subsequences, the note subsequence comprising the smallest number of notes as the benchmark note subsequence of the multimedia file.
4. The method according to any one of claims 1 to 3, characterized in that the note comprises a pitch and/or a duration, and the pitch is the absolute pitch of the note or the relative pitch between two adjacent notes.
5. A device for obtaining a multimedia file, characterized in that the device comprises:
an extraction module, configured to extract a reference note sequence from a collected voice signal, the reference note sequence comprising a plurality of notes;
a division module, configured to, for any multimedia file in a multimedia file library, divide the note sequence of the multimedia file into a plurality of note subsequences, wherein each note subsequence comprises at least one note, the intersection of two adjacent note subsequences comprises a preset number of notes, the preset number is an integer greater than or equal to 0 and less than a specified value, and the specified value is the quotient of the number of notes comprised in the multimedia file and the number of note subsequences obtained by the division;
a determining module, configured to determine the multiplicity between the note subsequences based on a preset multiplicity algorithm;
the determining module being further configured to determine that the note sequence of the multimedia file has a repetitive structure if the multiplicity between the note subsequences is greater than a preset multiplicity;
a first acquisition module, configured to obtain a benchmark note subsequence of the multimedia file when the note sequence of the multimedia file has a repetitive structure, wherein the benchmark note subsequence comprises at least one note, and the number of notes comprised in the benchmark note subsequence is less than the number of notes comprised in the multimedia file;
the determining module being further configured to determine the matching degree between the voice signal and the multimedia file according to the reference note sequence and the benchmark note subsequence of the multimedia file; and
a second acquisition module, configured to obtain, from the multimedia file library, a target multimedia file whose matching degree meets a preset condition, according to the matching degree between the voice signal and the multimedia file.
6. The device according to claim 5, characterized in that:
the determining module is further configured to determine at least one similarity matrix between the note subsequences based on a similarity matrix algorithm, determine a characteristic value of each similarity matrix according to that similarity matrix, and determine the multiplicity between the note subsequences according to the characteristic value of each similarity matrix; or
the determining module is further configured to determine at least one cross-correlation degree between the note subsequences based on a cross-correlation algorithm, and determine the multiplicity between the note subsequences according to each cross-correlation degree; or
the determining module is further configured to determine at least one edit distance between the note subsequences based on an edit distance algorithm, and determine the multiplicity between the note subsequences according to each edit distance; or
the determining module is further configured to determine at least one EMD distance between the note subsequences based on an EMD distance algorithm, and determine the multiplicity between the note subsequences according to each EMD distance.
7. The device according to claim 5, characterized in that:
the first acquisition module is further configured to randomly select one note subsequence from the plurality of note subsequences as the benchmark note subsequence of the multimedia file; or
the first acquisition module is further configured to select, from the plurality of note subsequences, the note subsequence comprising the largest number of notes as the benchmark note subsequence of the multimedia file; or
the first acquisition module is further configured to select, from the plurality of note subsequences, the note subsequence comprising the smallest number of notes as the benchmark note subsequence of the multimedia file.
8. The device according to any one of claims 5 to 7, characterized in that the note comprises a pitch and/or a duration, and the pitch is the absolute pitch of the note or the relative pitch between two adjacent notes.
9. A device for obtaining a multimedia file, characterized in that the device comprises a processor and a memory, wherein the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the method according to any one of claims 1 to 4.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method according to any one of claims 1 to 4.
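The division, multiplicity, and benchmark-selection steps recited in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation. All function names (`split_notes`, `edit_distance`, `multiplicity`, `select_benchmark`) are hypothetical, notes are assumed to be plain integer pitches, only the edit-distance option is shown, and averaging a normalized similarity over adjacent subsequences is an assumption about how individual distances might be combined into one multiplicity value.

```python
import random
from typing import List, Sequence


def split_notes(notes: Sequence[int], num_parts: int, overlap: int = 0) -> List[List[int]]:
    """Divide a note sequence into num_parts subsequences whose neighbours
    share exactly `overlap` notes (0 <= overlap < len(notes) / num_parts)."""
    n = len(notes)
    if num_parts < 1 or n < num_parts:
        raise ValueError("need at least one note per subsequence")
    if overlap < 0 or overlap * num_parts >= n:
        raise ValueError("overlap must satisfy 0 <= overlap < len(notes) / num_parts")
    span = n - overlap
    # Start index of each part; each part is extended by `overlap` notes.
    bounds = [i * span // num_parts for i in range(num_parts + 1)]
    return [list(notes[bounds[i]:bounds[i + 1] + overlap]) for i in range(num_parts)]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two note subsequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]


def multiplicity(parts: List[List[int]]) -> float:
    """Assumed aggregate: mean of 1 - distance / max_length over adjacent
    subsequences, so 1.0 means identical neighbours and 0.0 means unrelated."""
    if len(parts) < 2:
        return 0.0
    scores = [1 - edit_distance(p, q) / max(len(p), len(q))
              for p, q in zip(parts, parts[1:])]
    return sum(scores) / len(scores)


def select_benchmark(parts: List[List[int]], strategy: str = "longest") -> List[int]:
    """Pick the benchmark note subsequence: random, longest, or shortest."""
    if strategy == "random":
        return random.choice(parts)
    if strategy == "longest":
        return max(parts, key=len)
    if strategy == "shortest":
        return min(parts, key=len)
    raise ValueError(f"unknown strategy: {strategy}")
```

In this sketch, a file would be treated as having a repetitive structure when `multiplicity(split_notes(notes, k))` exceeds a preset threshold chosen by the implementer, and only then would the shorter benchmark subsequence returned by `select_benchmark` be matched against the reference note sequence instead of the full note sequence.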
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710679015.9A CN107368609B (en) | 2017-08-10 | 2017-08-10 | Obtain the method, apparatus and computer readable storage medium of multimedia file |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710679015.9A CN107368609B (en) | 2017-08-10 | 2017-08-10 | Obtain the method, apparatus and computer readable storage medium of multimedia file |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107368609A (en) | 2017-11-21 |
CN107368609B (en) | 2018-09-04 |
Family
ID=60309647
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710679015.9A Active CN107368609B (en) | 2017-08-10 | 2017-08-10 | Obtain the method, apparatus and computer readable storage medium of multimedia file |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107368609B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109801643B (en) * | 2019-01-30 | 2020-12-04 | 龙马智芯(珠海横琴)科技有限公司 | Processing method and device for reverberation suppression |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070016750A (en) * | 2005-08-05 | 2007-02-08 | 모두스타 주식회사 | Ubiquitous music information retrieval system and method based on query pool with feedback of customer characteristics |
CN101689225A (en) * | 2007-06-29 | 2010-03-31 | 惠普开发有限公司 | Generating music thumbnails and identifying related song structure |
CN104598515A (en) * | 2014-12-03 | 2015-05-06 | 百度在线网络技术(北京)有限公司 | Song searching method, device and system |
CN106970950A (en) * | 2017-03-07 | 2017-07-21 | 腾讯音乐娱乐(深圳)有限公司 | The lookup method and device of similar audio data |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070106405A1 (en) * | 2005-08-19 | 2007-05-10 | Gracenote, Inc. | Method and system to provide reference data for identification of digital content |
EP1785891A1 (en) * | 2005-11-09 | 2007-05-16 | Sony Deutschland GmbH | Music information retrieval using a 3D search algorithm |
CN102541965B (en) * | 2010-12-30 | 2015-05-20 | 国际商业机器公司 | Method and system for automatically acquiring feature fragments from music file |
CN106708990B (en) * | 2016-12-15 | 2020-04-24 | 腾讯音乐娱乐(深圳)有限公司 | Music piece extraction method and equipment |
CN106844528A (en) * | 2016-12-29 | 2017-06-13 | 广州酷狗计算机科技有限公司 | The method and apparatus for obtaining multimedia file |
Also Published As
Publication number | Publication date |
---|---|
CN107368609A (en) | 2017-11-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7487180B2 (en) | System and method for recognizing audio pieces via audio fingerprinting | |
CN103748579B (en) | Data are handled in MapReduce frame | |
JP4945877B2 (en) | System and method for recognizing sound / musical signal under high noise / distortion environment | |
CN105138541B (en) | The method and apparatus of audio-frequency fingerprint matching inquiry | |
Fu et al. | Privacy-preserving smart similarity search based on simhash over encrypted data in cloud computing | |
CN111161758B (en) | Song listening and song recognition method and system based on audio fingerprint and audio equipment | |
CN1983253A (en) | Method, apparatus and system for supplying musically searching service | |
CN106951527B (en) | Song recommendation method and device | |
CN106294564A (en) | A kind of video recommendation method and device | |
CN104067273A (en) | Grouping search results into a profile page | |
CN107315833A (en) | Method and apparatus of the retrieval with downloading based on application program | |
CN107368609B (en) | Obtain the method, apparatus and computer readable storage medium of multimedia file | |
CN109241360B (en) | Matching method and device of combined character strings and electronic equipment | |
CN111159464A (en) | Audio clip detection method and related equipment | |
CN111552831A (en) | Music recommendation method and server | |
CN112364222A (en) | Regional portrait method of user age, computer equipment and storage medium | |
CN106844504B (en) | A kind of method and apparatus for sending song and singly identifying | |
CN107402886B (en) | Storehouse analysis method and relevant apparatus | |
CN113674725B (en) | Audio mixing method, device, equipment and storage medium | |
CN104636474A (en) | Method and equipment for establishment of audio fingerprint database and method and equipment for retrieval of audio fingerprints | |
KR102255156B1 (en) | Device and method to manage plurality of music files | |
CN111966790A (en) | Method and equipment for searching knowledge base of cloud management platform | |
KR102183008B1 (en) | Apparatus and method for recommending music | |
CN110532419A (en) | A kind of processing method and processing device of audio | |
KR102666383B1 (en) | Method and system for detecting absence of multi device user |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| CB02 | Change of applicant information | Address after: 510660 Guangzhou City, Guangzhou, Guangdong, Whampoa Avenue, No. 315, self-made 1-17. Applicant after: Guangzhou KuGou Networks Co., Ltd. Address before: 510000 B1, building, No. 16, rhyme Road, Guangzhou, Guangdong, China 13F. Applicant before: Guangzhou KuGou Networks Co., Ltd. |
| GR01 | Patent grant | |