CN110400578A - Hash code generation method, multimedia file matching method, apparatus, electronic device, and storage medium - Google Patents
- Publication number
- CN110400578A (CN201910656276.8A)
- Authority
- CN
- China
- Prior art keywords
- hash codes
- audio signal
- signal
- hash
- multimedia file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
Embodiments of the present invention disclose a hash code generation method, a matching method, an apparatus, an electronic device, and a storage medium. The hash code generation method includes: receiving a target audio signal; collecting a sample audio signal from the target audio signal; converting the sample audio signal into a spectrum signal that has amplitudes; and generating, according to the differences between the amplitudes, a hash code that identifies the target audio signal as a whole. Because the hash code identifies the target audio signal as a whole, hash codes can subsequently be compared directly by distance. This avoids the sliding-window overhead incurred when hash codes are generated from extracted features, keeps the operation simple and fast, and preserves performance when hash codes are compared at large scale.
Description
Technical field
Embodiments of the present invention relate to audio signal processing technology, and in particular to a hash code generation method, a matching method, an apparatus, an electronic device, and a storage medium.
Background
Popular multimedia files such as short videos are uploaded in large volumes and spread quickly. Certain popular audio tracks are used or imitated by many users, so the multimedia files that users upload contain a large amount of duplication.
At present, to meet various business needs, especially for large-scale collections of multimedia files, hash algorithms are commonly used to compare and cluster the audio of multimedia files. A hash algorithm maps audio to a binary hash code, and the similarity of two audio tracks can be determined by comparing their hash codes, which is fast to compute.
However, hash algorithms such as perceptual hashing (phash) must first find features of the audio in the time domain and then compute hash codes for those features, and comparison then requires a sliding window. The operation is complex, and when hash codes are compared at large scale it becomes a performance bottleneck that makes it difficult to meet the needs of some application scenarios.
Summary of the invention
Embodiments of the present invention provide a hash code generation method, a matching method, an apparatus, an electronic device, and a storage medium, to solve the problem that generating hash codes from extracted audio features requires sliding-window comparison and is therefore complex to operate.
In a first aspect, an embodiment of the present invention provides a hash code generation method, comprising:
receiving a target audio signal;
collecting a sample audio signal from the target audio signal;
converting the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
generating, according to the differences between the amplitudes, a hash code that identifies the target audio signal as a whole.
In a second aspect, an embodiment of the present invention further provides a multimedia file matching method, comprising:
determining a target multimedia file, the target multimedia file having a target audio signal;
generating a hash code that identifies the target audio signal as a whole;
determining a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code that identifies the reference audio signal as a whole;
calculating the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
if the distance is less than a preset target threshold, determining that the target multimedia file matches the reference multimedia file.
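The comparison in the second aspect can be sketched as follows. This is a minimal illustration only: the text above does not fix the distance measure at this point, so the use of Hamming distance, the representation of a hash code as a list of integer segments, and the names `hamming_distance` and `files_match` are all assumptions made for the example.

```python
def hamming_distance(code_a, code_b):
    """Count differing bits between two hash codes.

    Each hash code is assumed to be a list of non-negative integers,
    one integer per fixed-width segment (a hypothetical representation).
    """
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same number of segments")
    return sum(bin(a ^ b).count("1") for a, b in zip(code_a, code_b))


def files_match(target_code, reference_code, target_threshold):
    """Return True when the distance is strictly below the target threshold."""
    return hamming_distance(target_code, reference_code) < target_threshold
```

Because each hash code stands for a whole audio signal, one distance computation per pair of files is enough; no sliding window over partial features is involved.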
In a third aspect, an embodiment of the present invention further provides a hash code generation apparatus, comprising:
a target audio signal receiving unit, configured to receive a target audio signal;
a sample audio signal collection unit, configured to collect a sample audio signal from the target audio signal;
a spectrum signal conversion unit, configured to convert the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
a difference generation unit, configured to generate, according to the differences between the amplitudes, a hash code that identifies the target audio signal as a whole.
In a fourth aspect, an embodiment of the present invention further provides a multimedia file matching apparatus, comprising:
a target multimedia file determining module, configured to determine a target multimedia file, the target multimedia file having a target audio signal;
a hash code generation module, configured to generate a hash code that identifies the target audio signal as a whole;
a reference multimedia file determining module, configured to determine a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code that identifies the reference audio signal as a whole;
a file distance calculation module, configured to calculate the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
a file match determining module, configured to determine, if the distance is less than a preset target threshold, that the target multimedia file matches the reference multimedia file.
In a fifth aspect, an embodiment of the present invention further provides an electronic device, the electronic device comprising:
one or more processors;
a memory, configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the hash code generation method according to the first aspect or the multimedia file matching method according to the second aspect.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the hash code generation method according to the first aspect or the multimedia file matching method according to the second aspect.
The embodiments of the present invention receive a target audio signal, collect a sample audio signal from the target audio signal, convert the sample audio signal into a spectrum signal, and generate, according to the differences between the amplitudes in the spectrum signal, a hash code that identifies the target audio signal as a whole. Because the hash code identifies the target audio signal as a whole, hash codes can subsequently be compared directly by distance. This avoids the sliding-window overhead incurred when hash codes are generated from extracted features, keeps the operation simple and fast, and preserves performance when hash codes are compared at large scale.
Brief description of the drawings
Fig. 1 is a flowchart of a hash code generation method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a hash code generation method provided by Embodiment 2 of the present invention;
Fig. 3 is a flowchart of a multimedia file matching method provided by Embodiment 3 of the present invention;
Fig. 4 is a flowchart of a multimedia file matching method provided by Embodiment 4 of the present invention;
Fig. 5 is an example diagram of a first mapping table and a second mapping table provided by Embodiment 4 of the present invention;
Fig. 6 is a schematic structural diagram of a hash code generation apparatus provided by Embodiment 5 of the present invention;
Fig. 7 is a schematic structural diagram of a multimedia file matching apparatus provided by Embodiment 6 of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device provided by Embodiment 7 of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure. In addition, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
Embodiment one
Fig. 1 is a flowchart of a hash code generation method provided by Embodiment 1 of the present invention. This embodiment is applicable to converting a target audio signal as a whole into a hash code. The method may be performed by a hash code generation apparatus, which may be implemented in software and/or hardware and configured in an electronic device such as a server or a workstation. The method specifically includes the following steps:
S101, receiving a target audio signal.
In a specific implementation, the target audio signal may be a standalone audio signal input by a user, or the audio signal in a multimedia file (such as a short video); this embodiment places no restriction on this.
Depending on the application scenario, such as short video, live streaming, or TV series, the target audio signal may contain voice signals, mute signals, noise signals, background sound signals, and so on.
The target audio signal may be in a format such as AAC (Advanced Audio Coding) or MP3 (Moving Picture Experts Group Audio Layer III). To facilitate subsequent processing, it may be decoded into PCM (Pulse Code Modulation) format.
S102, collecting a sample audio signal from the target audio signal.
In this embodiment, part of the target audio signal may be collected as the sample audio signal used to generate the hash code.
It should be noted that the sample audio signal is usually distributed evenly across the target audio signal, so it can represent the target audio signal as a whole.
In a specific implementation, the target audio signal may be sampled to obtain a sample audio signal that has target parameters.
The target parameters include at least one of the following:
1. Frequency
The frequency (sampling rate) of the sample audio signal is positively correlated with the performance of the hash code and negatively correlated with computation speed. That is, the higher the frequency of the sample audio signal, the better the hash code performs and the slower it is to generate. Those skilled in the art may choose different values for the frequency of the sample audio signal according to the application scenario; for short videos, for example, the frequency of the sample audio signal is 8000 Hz.
2. Mono
For a target audio signal with two or more channels, the channels may be merged into a single channel during sampling.
Further, the number of samples in the sample audio signal may be counted. If the number is less than a preset count threshold, a specified audio signal (for example, zero padding) is appended after the sample audio signal to form a new sample audio signal, until the number of samples reaches the count threshold.
It should be noted that the number of samples is positively correlated with the performance of the hash code and negatively correlated with computation speed. That is, the larger the number of samples, the better the hash code performs and the slower it is to generate. Those skilled in the art may choose different values for the number of samples according to the application scenario; for short videos, for example, the number of samples is 65536.
In addition, the sample audio signal is subsequently converted into a spectrum signal. That spectrum is an analysis spectrum, an approximation of the actual spectrum. If sampling is improper, the signal energy at one frequency spreads to adjacent frequencies, a phenomenon known as spectral leakage.
To reduce spectral leakage, a window function may be applied to the sample audio signal sampled from the target audio signal, for example a triangular window, a Hann window (hanning), a Hamming window, or a Gaussian window.
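The channel merging, zero padding, and windowing described for S102 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the input is assumed to be already decoded PCM at the desired sampling rate (resampling to 8000 Hz is not shown), channels are merged by averaging, signals longer than the count threshold are truncated (the text only describes padding), and the function names are hypothetical.

```python
import math


def to_mono(channels):
    """Merge equally long channels into one by averaging each frame."""
    return [sum(frame) / len(frame) for frame in zip(*channels)]


def pad_or_trim(samples, count):
    """Zero-pad up to `count` samples; truncating longer input is an assumption."""
    samples = list(samples[:count])
    return samples + [0.0] * (count - len(samples))


def hann_window(samples):
    """Multiply the samples by a Hann window to reduce spectral leakage."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]


def prepare_samples(channels, count=65536):
    """Mono merge, pad to `count` samples, then apply the window."""
    return hann_window(pad_or_trim(to_mono(channels), count))
```

The default of 65536 samples mirrors the short-video example in the text; a power of two also suits radix-2 FFT implementations.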
S103, converting the sample audio signal into a spectrum signal.
In this embodiment, the sample audio signal, which is represented in the time domain, is converted into a spectrum signal represented in the frequency domain. The spectrum signal has parameters such as amplitude.
In a specific implementation, the sample audio signal may be transformed into a spectrum signal by means such as FT (Fourier Transform) or FFT (Fast Fourier Transform).
The spectrum signal has frequency bins, and each frequency bin has a frequency and an amplitude.
At this point the frequencies in the spectrum signal are linearly related, that is, the frequency increases roughly linearly with the bin index.
Because the human ear is insensitive to linearly spaced frequencies and more sensitive to logarithmically spaced ones, the spectrum signal may be transformed, for example by multiplying it by a specific transformation matrix or by applying a Mel spectrum or the asinh function, so that the frequencies become logarithmically related, that is, the frequency increases roughly logarithmically with the bin index.
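As a rough sketch of S103, the code below computes amplitudes with a small radix-2 FFT and then picks bins at logarithmically spaced positions. The log-spaced bin selection is a stand-in chosen for brevity and is an assumption; the text instead names a transformation matrix, a Mel spectrum, or the asinh function without giving details. All function names are hypothetical.

```python
import cmath


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples):
    """Amplitudes of the non-negative frequency bins (linearly spaced)."""
    spectrum = fft(samples)
    return [abs(c) for c in spectrum[: len(samples) // 2 + 1]]


def log_spaced_amplitudes(amplitudes, count):
    """Pick `count` amplitudes at log-spaced bin indices from 1 to the last bin."""
    last = len(amplitudes) - 1
    picked = []
    for j in range(count):
        position = last ** (j / (count - 1)) if count > 1 else 1
        picked.append(amplitudes[min(last, round(position))])
    return picked
```

With this simple selection, neighbouring positions at the low end can round to the same bin, so a practical implementation would more likely pool or interpolate bins; that refinement is left out here.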
S104, generating, according to the differences between the amplitudes, a hash code that identifies the target audio signal as a whole.
In this embodiment, the differences between the amplitudes of adjacent frequency bins in the spectrum signal reflect certain characteristics of the spectrum signal, so a hash code can be generated from these differences to identify the characteristics of the target audio signal as a whole.
The embodiment of the present invention receives a target audio signal, collects a sample audio signal from the target audio signal, converts the sample audio signal into a spectrum signal, and generates, according to the differences between the amplitudes in the spectrum signal, a hash code that identifies the target audio signal as a whole. Because the hash code identifies the target audio signal as a whole, hash codes can subsequently be compared directly by distance. This avoids the sliding-window overhead incurred when hash codes are generated from extracted features, keeps the operation simple and fast, and preserves performance when hash codes are compared at large scale.
Embodiment two
Fig. 2 is a flowchart of a hash code generation method provided by Embodiment 2 of the present invention. Building on the preceding embodiment, this embodiment further adds the operations of generating the hash code a first time, determining whether the hash code is valid, and generating the hash code a second time. The method specifically includes the following steps:
S201, receiving a target audio signal.
S202, collecting a sample audio signal from the target audio signal.
S203, converting the sample audio signal into a spectrum signal.
The spectrum signal has amplitudes.
S204, performing difference processing on the amplitudes to obtain a first signal difference value.
In this embodiment, difference processing may refer to a first-order difference, that is, calculating the difference between two adjacent frequency bins. It may specifically include a forward difference, a backward difference, a central difference, and so on.
In one example, the length of the first signal difference value is one less than the number of amplitudes.
In this example, the difference between the amplitude at the current position and the amplitude at the next position is assigned to the first signal difference value at the current position.
Assume the number of amplitudes is t*n+1 and the length of the first signal difference value is t*n, where t is a constant such as 32 and n is a positive integer. The first signal difference value is then:
Y_i = Z_i - Z_{i+1}
where Y is the first signal difference value, Z is the amplitude, and i is a positive integer with 1 ≤ i ≤ t*n.
Of course, the above difference processing is only an example. When implementing the embodiments of the present invention, other difference processing may be set according to the actual situation, for example Y_i = Z_{i+1} - Z_i or Y_i = Z_i - Z_{i-1}; the embodiments of the present invention are not limited in this respect. In addition, those skilled in the art may use difference processing other than the above according to actual needs, and the embodiments of the present invention place no restriction on this either.
S205, performing binarization on the first signal difference value to obtain a hash code.
In this embodiment, the first signal difference value is binarized, that is, converted to a binary representation, to obtain the hash code.
In a specific implementation, if the first signal difference value is greater than 0, the corresponding hash code bit is determined to be 1.
If the first signal difference value is less than or equal to 0, the corresponding hash code bit is determined to be 0.
If the length of the first signal difference value is t*n, the hash code can be regarded as n character strings, each of length t.
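S204 and S205 amount to a few lines of code. The sketch below follows the rule Y_i = Z_i - Z_{i+1} and the "greater than 0 gives 1" binarization from the text; packing each group of t bits into one integer, with the first bit as the most significant, is an assumption about how the n segments might be stored, and the function names are hypothetical.

```python
def first_difference(amplitudes):
    """Y_i = Z_i - Z_{i+1}; the result is one element shorter than the input."""
    return [amplitudes[i] - amplitudes[i + 1] for i in range(len(amplitudes) - 1)]


def binarize(differences):
    """Map each difference to 1 if it is greater than 0, otherwise to 0."""
    return [1 if d > 0 else 0 for d in differences]


def to_segments(bits, t=32):
    """Split the bits into segments of length t and pack each into an integer."""
    if len(bits) % t != 0:
        raise ValueError("number of bits must be a multiple of t")
    return [
        int("".join(str(b) for b in bits[i:i + t]), 2)
        for i in range(0, len(bits), t)
    ]
```

For example, amplitudes `[5, 3, 3, 4, 1]` give differences `[2, 0, -1, 3]` and bits `[1, 0, 0, 1]`, which pack into the single 4-bit segment `9` when `t=4`.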
S206, determining the attribute of the sample audio signal.
S207, if the attribute is non-mute and non-noise, determining that the hash code is valid.
S208, if the attribute is a mute signal or a noise signal, determining that the hash code is invalid.
In a specific implementation, after difference processing and binarization of the amplitudes, the hash code of a mute signal and the hash code of a noise signal are highly similar or even identical; for example, both may be all 0s.
Therefore, after the hash code is generated, the attribute of the sample audio signal may be judged. If the sample audio signal is neither a mute signal nor a noise signal, the hash code is determined to be valid; if the sample audio signal is a mute signal or a noise signal, the hash code is determined to be invalid.
In one way of determining the attribute, the distance between the hash code and 0 may be calculated, for example the Hamming distance.
If the distance is greater than or equal to a preset distance threshold, the attribute of the sample audio signal is determined to be non-mute and non-noise.
If the distance is less than the preset distance threshold, the attribute of the sample audio signal is determined to be a mute signal or a noise signal.
By determining the validity of the hash code from the attribute of the sample audio signal, the embodiment of the present invention avoids interference from mute signals and noise signals and ensures the accuracy of the hash code.
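The validity check in S206 to S208 can be sketched as a Hamming distance to the all-zero code, which is simply the number of 1 bits. The bit-list representation and the function names are assumptions for illustration; the distance threshold value is application-specific and not given in the text.

```python
def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit lists must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def is_valid_hash(bits, distance_threshold):
    """Valid when the distance to the all-zero code reaches the threshold.

    A code too close to all zeros is treated as coming from a mute
    or noise signal, following the rule described in the text.
    """
    zero_code = [0] * len(bits)
    return hamming_distance(bits, zero_code) >= distance_threshold
```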
S209, performing difference processing on the first signal difference value to obtain a second signal difference value.
After S208, for an invalid hash code, difference processing may be applied a second time, this time to the first signal difference value. Difference processing may refer to a first-order difference, that is, calculating the difference between two adjacent first signal difference values, and may specifically include a forward difference, a backward difference, a central difference, and so on.
In one example, the second signal difference value has the same length as the first signal difference value, so that the hash code obtained after the first difference processing and the hash code obtained after the second difference processing have the same length.
In this example, if the current position of the second signal difference value is not the last position, the difference between the first signal difference value at the current position and the first signal difference value at the next position is assigned to the second signal difference value at the current position.
If the current position of the second signal difference value is the last position, the first signal difference value at the current position is assigned to the second signal difference value at the current position.
Assume the length of the first signal difference value is t*n and the length of the second signal difference value is t*n, where t is a constant such as 32, i and n are positive integers, and 1 ≤ i ≤ t*n.
If i < t*n, the second signal difference value is:
X_i = Y_i - Y_{i+1}
If i = t*n, the second signal difference value is:
X_i = Y_i
where Y is the first signal difference value and X is the second signal difference value.
Of course, the above difference processing is only an example. When implementing the embodiments of the present invention, other difference processing may be set according to the actual situation, for example X_i = Y_{i+1} - Y_i (with X_i = Y_i when i = t*n) or X_i = Y_i - Y_{i-1} (with X_i = Y_i when i = 1); the embodiments of the present invention are not limited in this respect. In addition, those skilled in the art may use difference processing other than the above according to actual needs, and the embodiments of the present invention place no restriction on this either.
S210, performing binarization on the second signal difference value to obtain a new hash code.
In this embodiment, the second signal difference value is binarized, that is, converted to a binary representation, to obtain the new hash code.
In a specific implementation, if the second signal difference value is greater than 0, the corresponding bit of the new hash code is determined to be 1;
if the second signal difference value is less than or equal to 0, the corresponding bit of the new hash code is determined to be 0.
After the second difference processing, the hash code of a mute signal and the hash code of a noise signal can differ more clearly. For example, the hash code of a mute signal is all 0s, while the hash code of a noise signal is not all 0s.
If the length of the second signal difference value is t*n, the new hash code can be regarded as n character strings, each of length t.
When a hash code is invalid, the embodiment of the present invention can perform difference processing and binarization again to generate a new hash code that distinguishes mute signals from noise signals, further improving the accuracy of the hash code.
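The second pass in S209 and S210 can be sketched as follows, using the rule X_i = Y_i - Y_{i+1} for every position except the last, where X_i = Y_i, so that the length is preserved. The function names are hypothetical.

```python
def second_difference(first_diff):
    """X_i = Y_i - Y_{i+1} for all but the last position; X_last = Y_last."""
    last = len(first_diff) - 1
    return [
        first_diff[i] - first_diff[i + 1] if i < last else first_diff[i]
        for i in range(len(first_diff))
    ]


def binarize(differences):
    """Map each difference to 1 if it is greater than 0, otherwise to 0."""
    return [1 if d > 0 else 0 for d in differences]


def regenerate_hash(first_diff):
    """Produce the new hash bits for a hash code that was judged invalid."""
    return binarize(second_difference(first_diff))
```

A first signal difference value of all zeros, as a perfectly silent signal would produce, still yields an all-zero new hash code, whereas small fluctuating differences yield a mix of 0s and 1s.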
Embodiment three
Fig. 3 is a flowchart of a multimedia file matching method provided by Embodiment 3 of the present invention. This embodiment is applicable to matching multimedia files based on hash codes that identify audio signals as a whole. The method may be performed by a multimedia file matching apparatus, which may be implemented in software and/or hardware and configured in an electronic device such as a server or a workstation. The method specifically includes the following steps:
S301, determining a target multimedia file.
In practical applications, users upload multimedia files, such as short videos, live videos, and presentation slides (PPT), to a business platform, either to store them on the platform or to publish them for the public to share and browse.
On the business platform, multimedia files can be clustered by comparing the audio signals they contain, to meet different business needs.
For example, for the operation of the business platform, clustering can reveal new and popular audio signals in video files; these audio signals can then be used to discover trending topics or to identify users who create outstanding material.
As another example, some multimedia files are annotated in order to train a machine learning model. If the multimedia files sent for annotation contain many duplicate audio signals, a great deal of annotation labor is wasted, and the many duplicate multimedia files adversely affect the training of the machine learning model. Multimedia files with duplicate audio signals can therefore be removed by clustering the audio signals.
If the real-time requirement is high, a streaming real-time system may be set up on the business platform. Users upload multimedia files to the streaming real-time system through a client, and the system transmits the multimedia files in real time to the electronic device used for matching.
If the real-time requirement is low, a database, such as a distributed database, may be set up on the business platform. Users upload multimedia files to the database through a client, and the electronic device used for matching reads the multimedia files from the database.
In this embodiment, the multimedia file currently to be matched is regarded as the target multimedia file, and the audio signal contained in it is regarded as the target audio signal.
Depending on the application scenario, such as short video, live streaming, or TV series, the target audio signal may contain voice signals, mute signals, noise signals, background sound signals, and so on.
The target multimedia file has a target audio signal in a format such as AAC or MP3. During matching, to facilitate subsequent processing, it may be decoded into PCM format.
S302, the Hash codes for identifying the target audio signal entirety are generated.
In the present embodiment, Hash codes can be used for identifying target audio signal entirety, and not in target audio signal
Partial Feature.
In the concrete realization, S302 includes:
The collecting sample audio signal from the target audio signal;
The sample audio signal is converted into spectrum signal, there is amplitude in the spectrum signal;
The Hash codes for identifying the target audio signal entirety are generated according to the difference between the amplitude.
Further, collecting the sample audio signal from the target audio signal includes:
sampling the target audio signal to obtain a sample audio signal having target parameters;
counting the quantity of the sample audio signal;
if the quantity is less than a preset quantity threshold, appending a specified audio signal after the sample audio signal to form a new sample audio signal;
applying a window function to the sample audio signal sampled from the target audio signal;
wherein the target parameters include at least one of the following:
frequency, mono.
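The collection steps above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names `to_mono` and `prepare_samples` are hypothetical, the "specified audio signal" is assumed to be silence (zeros), the window function is assumed to be a Hann window, and resampling to a target frequency is omitted.

```python
import math


def to_mono(frames):
    """Average the channels of each frame, e.g. [(left, right), ...] -> [mono, ...]."""
    return [sum(frame) / len(frame) for frame in frames]


def prepare_samples(samples, min_count):
    """Pad a mono sample list up to min_count values, then apply a window."""
    padded = list(samples)
    if len(padded) < min_count:
        # Assumption: the appended "specified audio signal" is silence.
        padded.extend([0.0] * (min_count - len(padded)))
    n = len(padded)
    if n < 2:
        return padded
    # Assumption: Hann window, w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))).
    return [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(padded)]
```

Padding before windowing keeps every sample audio signal at a fixed length, so that hash codes generated later all have the same number of bits and can be compared position by position.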
Further, converting the sample audio signal into the spectrum signal includes:
transforming the sample audio signal into a spectrum signal, the spectrum signal having frequency bins, each frequency bin having a frequency and an amplitude, the frequencies being linearly related;
converting the spectrum signal so that the frequencies are logarithmically related.
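One possible reading of these two steps is sketched below. The text does not name a specific transform or conversion, so everything here is an assumption: a naive discrete Fourier transform produces linearly spaced frequency bins, and the bins are then averaged into bands whose edges grow geometrically. The names `magnitude_spectrum` and `to_log_bands` are hypothetical.

```python
import cmath


def magnitude_spectrum(samples):
    """Naive DFT: amplitudes of the first n//2 + 1 linearly spaced frequency bins."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def to_log_bands(mags, num_bands):
    """Average linear bins into num_bands bands with logarithmically spaced edges.

    Bin 0 (the DC component) is skipped. At low frequencies neighbouring
    bands may reuse the same bin, because each band keeps at least one bin.
    """
    top = len(mags) - 1  # index of the highest linear bin; must be >= 1
    edges = [top ** (b / num_bands) for b in range(num_bands + 1)]
    bands = []
    for b in range(num_bands):
        lo = int(edges[b])
        if b == num_bands - 1:
            hi = top + 1
        else:
            hi = max(lo + 1, int(edges[b + 1]))
        chunk = mags[lo:hi]
        bands.append(sum(chunk) / len(chunk))
    return bands
```

A production system would normally use an FFT library instead of the quadratic-time loop; the loop is kept here only so the sketch has no dependencies.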
Further, generating, according to the differences between the amplitudes, the hash code that identifies the target audio signal as a whole includes:
performing difference processing on the amplitudes to obtain first signal difference values;
binarizing the first signal difference values to obtain the hash code.
Further, the length of the first signal difference values is one less than the number of amplitudes;
performing difference processing on the amplitudes to obtain the first signal difference values includes:
assigning the difference between the amplitude at the current position and the amplitude at the next position to the first signal difference value at the current position.
Further, binarizing the first signal difference values to obtain the hash code includes:
if a first signal difference value is greater than 0, determining the corresponding hash code bit to be 1;
if a first signal difference value is less than or equal to 0, determining the corresponding hash code bit to be 0.
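The difference and binarization steps can be illustrated with a minimal sketch. The function names are hypothetical, and the sign convention (current amplitude minus next amplitude) is an assumption, since the text only says "the difference between" the two.

```python
def first_difference(amplitudes):
    """First signal difference values; one fewer than the number of amplitudes."""
    # Assumption: difference = current amplitude - next amplitude.
    return [amplitudes[i] - amplitudes[i + 1]
            for i in range(len(amplitudes) - 1)]


def binarize(differences):
    """Map each difference value to one hash code bit: 1 if > 0, otherwise 0."""
    return [1 if d > 0 else 0 for d in differences]
```

For the amplitudes `[5, 3, 3, 8, 2]`, the first signal difference values are `[2, 0, -5, 6]` and the resulting hash code bits are `[1, 0, 0, 1]`.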
Further, generating, according to the differences between the amplitudes, the hash code that identifies the target audio signal as a whole further includes:
determining an attribute of the sample audio signal;
if the attribute is a non-silence and non-noise signal, determining that the hash code is valid;
if the attribute is a silence signal or a noise signal, determining that the hash code is invalid.
Further, determining the attribute of the sample audio signal includes:
calculating the distance between the hash code and 0;
if the distance is greater than or equal to a preset distance threshold, determining that the attribute of the sample audio signal is a non-silence and non-noise signal;
if the distance is less than the preset distance threshold, determining that the attribute of the sample audio signal is a silence signal or a noise signal.
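The validity check can be sketched as follows, assuming that the "distance between the hash code and 0" is the Hamming distance to the all-zero code, which is simply the number of 1 bits. The function names and the threshold value in the usage note are illustrative only.

```python
def distance_to_zero(bits):
    """Hamming distance between a hash code and the all-zero code."""
    return sum(1 for b in bits if b)


def is_valid_hash(bits, distance_threshold):
    """Valid (non-silence, non-noise) when the code is far enough from zero."""
    return distance_to_zero(bits) >= distance_threshold
```

With a threshold of 2, the code `[0, 0, 0, 1, 0, 0, 0, 0]` is treated as invalid, while `[1, 0, 1, 1, 0, 1, 0, 0]` is treated as valid.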
Further, after it is determined that the hash code is invalid, generating, according to the differences between the amplitudes, the hash code that identifies the target audio signal as a whole further includes:
performing difference processing on the first signal difference values to obtain second signal difference values;
binarizing the second signal difference values to obtain a new hash code.
Further, the length of the first signal difference values is the same as the length of the second signal difference values;
performing difference processing on the first signal difference values to obtain the second signal difference values includes:
if the current position of the second signal difference values is not the last position, assigning the difference between the first signal difference value at the current position and the first signal difference value at the next position to the second signal difference value at the current position;
if the current position of the second signal difference values is the last position, assigning the first signal difference value at the current position to the second signal difference value at the current position.
Further, binarizing the second signal difference values to obtain the new hash code includes:
if a second signal difference value is greater than 0, determining the corresponding bit of the new hash code to be 1;
if a second signal difference value is less than or equal to 0, determining the corresponding bit of the new hash code to be 0.
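The second-order fallback can be sketched in the same style. The function names are hypothetical, and the sign convention (current value minus next value) is again an assumption.

```python
def second_difference(first_diffs):
    """Second signal difference values, same length as the first ones."""
    last = len(first_diffs) - 1
    result = []
    for i, value in enumerate(first_diffs):
        if i < last:
            # Not the last position: current value minus next value.
            result.append(value - first_diffs[i + 1])
        else:
            # Last position: copy the first signal difference value.
            result.append(value)
    return result


def binarize(differences):
    """Map each difference value to one hash code bit: 1 if > 0, otherwise 0."""
    return [1 if d > 0 else 0 for d in differences]
```

For first signal difference values `[2, 0, -5, 6]`, the second signal difference values are `[2, 5, -11, 6]` and the new hash code bits are `[1, 1, 0, 1]`.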
In this embodiment, since the way of generating the hash code is substantially similar to that of embodiments one and two, the description is relatively brief; for related details, refer to the corresponding parts of embodiments one and two, which are not repeated here.
S303, determine a reference multimedia file.
The reference multimedia file has a reference audio signal.
In this embodiment, the multimedia file currently being matched against is regarded as the reference multimedia file, and the audio signal contained in that multimedia file is regarded as the reference audio signal.
In addition, the reference multimedia file is associated with a hash code that identifies the reference audio signal as a whole; that is, an association between the reference multimedia file and its hash code has already been established.
It should be noted that the hash code of the reference multimedia file is generated in the same way as the hash code of the target multimedia file.
S304, calculate the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file.
The distance between the hash code of the target multimedia file and the hash code of the reference multimedia file indicates the similarity between the target audio signal of the target multimedia file and the reference audio signal of the reference multimedia file. The distance is negatively correlated with the similarity: the smaller the distance, the higher the similarity; conversely, the larger the distance, the lower the similarity.
Taking the Hamming distance as an example, the Hamming distance is the number of positions at which two strings of equal length have different characters. For binary strings a and b, the Hamming distance equals the number of 1s in a XOR b, where XOR is exclusive or; this count is also called the Hamming weight, population count, or popcount.
For example, assuming the hash code of the target multimedia file is 1011101 and the hash code of the reference multimedia file is 1001001, the distance between the two is 2.
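The Hamming distance computation described above can be written in a few lines. This is a minimal sketch that represents hash codes as strings of "0" and "1"; the function name is hypothetical.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of differing positions between two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    # XOR the two codes as integers and count the 1 bits (popcount).
    return bin(int(a, 2) ^ int(b, 2)).count("1")
```

For the example in the text, `hamming_distance("1011101", "1001001")` returns 2, because the XOR of the two codes is 0010100.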
S305, if the distance is less than a preset target threshold, determine that the target multimedia file matches the reference multimedia file.
If the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file is less than the target threshold, the similarity between the target audio signal of the target multimedia file and the reference audio signal of the reference multimedia file is high, and the target multimedia file can be considered to match the reference multimedia file; that is, the two multimedia files use the same audio material.
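The threshold decision in S305 can be sketched as a scan over candidate reference files. The dictionary of reference files, the file names, and the threshold value are all illustrative assumptions.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of differing positions between two equal-length bit strings."""
    return bin(int(a, 2) ^ int(b, 2)).count("1")


def find_matches(target_code, reference_codes, target_threshold):
    """Names of reference files whose hash code is closer than the threshold."""
    return [name for name, code in reference_codes.items()
            if hamming_distance(target_code, code) < target_threshold]
```

For example, with a target code of `"1011101"`, references `{"ref_a": "1001001", "ref_b": "0100010"}`, and a threshold of 3, only `ref_a` (distance 2) matches; `ref_b` differs in all 7 positions.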
Furthermore, a match between the target multimedia file and the reference multimedia file may indicate that the two belong to the same category; thereafter, other clustering algorithms may be applied to the successfully matched multimedia files to determine the category to which each multimedia file belongs.
In this embodiment of the present invention, a target multimedia file is determined, a hash code identifying the target audio signal as a whole is generated, a reference multimedia file and its associated hash code identifying the reference audio signal as a whole are determined, and the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file is calculated; if the distance is less than a preset target threshold, the target multimedia file is determined to match the reference multimedia file. Because the hash codes identify the target audio signal and the reference audio signal as a whole, the hash codes can be compared directly by distance, which avoids the overhead of extracting features with a sliding window to generate hash codes. The operation is simple and fast, and performance is preserved when hash codes are compared at large scale.
Embodiment four
Fig. 4 is a flowchart of a multimedia file matching method provided by embodiment four of the present invention. This embodiment builds on the foregoing embodiments and further adds processing operations for quickly comparing hash codes. The method specifically includes the following steps:
S401, determine a target multimedia file.
The target multimedia file has a target audio signal.
S402, generate a hash code that identifies the target audio signal as a whole.
S403, extract part of the hash code of the target multimedia file as an index hash code.
For multimedia files such as short videos, users upload a large volume of files and the matching workload is heavy; in this embodiment, part of the hash code of the target multimedia file is therefore extracted as an index hash code.
In one embodiment of the present invention, the index hash code includes first hash code blocks and second hash code blocks.
The hash code of the target multimedia file is split into n segments, which serve as the first hash code blocks, where n is a positive integer.
Furthermore, the hash code excluding a given first hash code block is split into m segments, which serve as the second hash code blocks, where m is a positive integer, and n and m may be the same or different.
If the length of the hash code is t*n, where t is a constant such as 32, the hash code can be split into n first hash code blocks of length t, and the remainder into m second hash code blocks of length t*(n-1)/m.
For example, let n and m both be 8, and let the hash code of a target multimedia file be ABCDEFGH, where A, B, C, D, E, F, G, and H each denote a character string of length t; the hash code can then be split into 8 first hash code blocks A, B, C, D, E, F, G, and H.
In addition, for the first hash code block E, the remainder ABCDFGH can be split into 8 second hash code blocks, denoted in order as E_1, E_2, E_3, E_4, E_5, E_6, E_7, and E_8.
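The two-level splitting can be sketched as follows. The function name `split_blocks` and the zero-based position argument `j` are hypothetical, and the sketch assumes the lengths divide evenly, as in the t = 32, n = m = 8 example (256 bits in total, second hash code blocks of 32*7/8 = 28 bits).

```python
def split_blocks(code, n, m, j):
    """Split a hash code into n first blocks, then split the code without
    the j-th (zero-based) first block into m second blocks."""
    if len(code) % n != 0:
        raise ValueError("hash code length must be a multiple of n")
    t = len(code) // n
    first_blocks = [code[i * t:(i + 1) * t] for i in range(n)]
    # Remove the j-th first block and split what is left into m segments.
    rest = code[:j * t] + code[(j + 1) * t:]
    if len(rest) % m != 0:
        raise ValueError("remaining length must be a multiple of m")
    s = len(rest) // m
    second_blocks = [rest[i * s:(i + 1) * s] for i in range(m)]
    return first_blocks, second_blocks
```

Calling `split_blocks(code, 8, 8, 4)` on a 256-bit code corresponds to the example above: the fifth first hash code block plays the role of E, and the eight 28-bit second hash code blocks play the roles of E_1 to E_8.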
S404, search for a multimedia file whose hash code is identical to the index hash code, as the reference multimedia file.
In a specific implementation, an index file may be generated from the hash codes of multimedia files, and the index hash code is looked up in the index file to quickly find a multimedia file whose hash code is identical to the index hash code, which serves as the reference multimedia file.
The reference multimedia file has a reference audio signal and is associated with a hash code that identifies the reference audio signal as a whole.
It should be noted that "identical" here means not only that the character string of the index hash code equals a character string in the hash code, but also that the character string occupies the same position within the hash code.
In one embodiment of the present invention, the index file includes n first mapping tables and the second mapping tables.
The first mapping tables correspond one-to-one with the first hash code blocks, and each first mapping table stores the first hash code blocks at a designated position. In a first mapping table, a first hash code block serves as the key and m second mapping tables serve as the value, with a mapping relationship between the two.
The second mapping tables correspond one-to-one with the second hash code blocks, and each second mapping table stores the second hash code blocks at a designated position. In a second mapping table, a second hash code block serves as the key and hash codes serve as the value, with a mapping relationship between the two.
Here, the position refers to the position of the first hash code block or the second hash code block at the time of splitting.
A hash code in a second mapping table can be considered to satisfy the following conditions:
1, when the hash code is split into n segments, its j-th segment is the first hash code block in the associated first mapping table;
2, after the j-th segment is removed from the hash code and the remainder is split into m segments, its t-th segment is the second hash code block in the associated second mapping table.
For example, with n and m both set to 8, as shown in Fig. 5, there are 8 first mapping tables: first mapping table 501, first mapping table 502, first mapping table 503, first mapping table 504, first mapping table 505, first mapping table 506, first mapping table 507, and first mapping table 508.
First mapping table 501 stores the first hash code block at the first position, such as A in the hash code ABCDEFGH.
First mapping table 502 stores the first hash code block at the second position, such as B in the hash code ABCDEFGH.
First mapping table 503 stores the first hash code block at the third position, such as C in the hash code ABCDEFGH.
First mapping table 504 stores the first hash code block at the fourth position, such as D in the hash code ABCDEFGH.
First mapping table 505 stores the first hash code block at the fifth position, such as E in the hash code ABCDEFGH.
First mapping table 506 stores the first hash code block at the sixth position, such as F in the hash code ABCDEFGH.
First mapping table 507 stores the first hash code block at the seventh position, such as G in the hash code ABCDEFGH.
First mapping table 508 stores the first hash code block at the eighth position, such as H in the hash code ABCDEFGH.
Further, E in first mapping table 505 maps to 8 second mapping tables: second mapping table 5051, second mapping table 5052, second mapping table 5053, second mapping table 5054, second mapping table 5055, second mapping table 5056, second mapping table 5057, and second mapping table 5058.
Second mapping table 5051 stores the second hash code block at the first position, such as E_1.
Second mapping table 5052 stores the second hash code block at the second position, such as E_2.
Second mapping table 5053 stores the second hash code block at the third position, such as E_3.
Second mapping table 5054 stores the second hash code block at the fourth position, such as E_4.
Second mapping table 5055 stores the second hash code block at the fifth position, such as E_5.
Second mapping table 5056 stores the second hash code block at the sixth position, such as E_6.
Second mapping table 5057 stores the second hash code block at the seventh position, such as E_7.
Second mapping table 5058 stores the second hash code block at the eighth position, such as E_8.
Further, E_4 in second mapping table 5054 maps to hash code set 50541. For a hash code in hash code set 50541, such as ABCDEFGH, when the hash code is split into 8 segments, the 5th segment (the first hash code block) is E; after E is removed and the remainder is split into 8 segments, the 4th segment (the second hash code block) is E_4.
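One way to picture this index is as nested dictionaries: a list of n first mapping tables, each mapping a first hash code block to a list of m second mapping tables, each of which maps a second hash code block to a set of hash codes. The layout and the names `new_index` and `add_code` are assumptions made for illustration; the text does not prescribe a storage format.

```python
def split(code, parts):
    """Split a string into `parts` equal-length segments."""
    size = len(code) // parts
    return [code[i * size:(i + 1) * size] for i in range(parts)]


def new_index(n):
    """Create n empty first mapping tables, one per first-block position."""
    return [{} for _ in range(n)]


def add_code(index, code, n, m):
    """Record a hash code under every (first block, second block) pair."""
    t = len(code) // n
    for j, first_block in enumerate(split(code, n)):
        # Key: first hash code block; value: m second mapping tables.
        second_tables = index[j].setdefault(first_block,
                                            [{} for _ in range(m)])
        rest = code[:j * t] + code[(j + 1) * t:]
        for k, second_block in enumerate(split(rest, m)):
            # Key: second hash code block; value: set of full hash codes.
            second_tables[k].setdefault(second_block, set()).add(code)
```

In this layout, `index[4]["E"][3]["E_4"]` would play the role of hash code set 50541. Each hash code is stored n*m times, which is the redundancy that the index trades for a narrower comparison range.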
In one case, S404 includes the following steps:
S40411, determine the first mapping table that corresponds to the first hash code block.
Determine the position of the current first hash code block, and find, among the preset first mapping tables, the one that stores the first hash code blocks at that position, as the first mapping table corresponding to the current first hash code block.
S40412, look up the first hash code block in the first mapping table.
Through hash collision, each first hash code block is looked up in its corresponding first mapping table.
S40413, if the first hash code block is found, determine in the first mapping table the second mapping tables mapped by the first hash code block.
If the first hash code block is found in its corresponding first mapping table, the second mapping tables mapped by the first hash code block can be looked up.
S40414, look up the second hash code block in the second mapping table that matches the second hash code block.
Determine the position of the current second hash code block, and find, among the second mapping tables, the one that stores the second hash code blocks at that position, as the second mapping table corresponding to the current second hash code block.
Through hash collision, each second hash code block is looked up in its corresponding second mapping table.
S40415, if the second hash code block is found, extract in the first mapping table the hash codes mapped by the second hash code block.
S40416, determine that the multimedia file to which the hash code belongs is a reference multimedia file.
If the second hash code block is found in the second mapping table that matches it, the hash code set mapped by the second hash code block can be looked up; the hash codes in that set share part of their character string with the index hash code (the first hash code block and the second hash code block), and the multimedia files to which they belong can serve as reference multimedia files.
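The lookup steps S40411 to S40416 can be sketched against a nested-dictionary index. The index layout (a list of n dictionaries whose values are lists of m dictionaries of hash code sets) and all function names are illustrative assumptions, not the storage format of the embodiment.

```python
def split(code, parts):
    """Split a string into `parts` equal-length segments."""
    size = len(code) // parts
    return [code[i * size:(i + 1) * size] for i in range(parts)]


def remainder(code, n, j):
    """The code with its j-th (zero-based) first hash code block removed."""
    t = len(code) // n
    return code[:j * t] + code[(j + 1) * t:]


def add_code(index, code, n, m):
    """Record a reference hash code under every (first, second) block pair."""
    for j, first_block in enumerate(split(code, n)):
        second_tables = index[j].setdefault(first_block,
                                            [{} for _ in range(m)])
        for k, second_block in enumerate(split(remainder(code, n, j), m)):
            second_tables[k].setdefault(second_block, set()).add(code)


def find_candidates(index, code, n, m):
    """Stored hash codes sharing a first and a second block with `code`."""
    found = set()
    for j, first_block in enumerate(split(code, n)):
        second_tables = index[j].get(first_block)  # S40411 to S40413
        if second_tables is None:
            continue
        for k, second_block in enumerate(split(remainder(code, n, j), m)):
            # S40414 and S40415: look up the second block, collect its codes.
            found |= second_tables[k].get(second_block, set())
    return found
```

The returned hash codes are only candidates; the distance to each of them is still computed afterwards, and the multimedia files they belong to serve as reference multimedia files.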
In another case, S404 includes the following steps:
S40421, determine the first mapping table that corresponds to the first hash code block.
S40422, look up the first hash code block in the first mapping table.
S40423, if the first hash code block is not found, write the first hash code block into the first mapping table.
If the first hash code block is not found in its corresponding first mapping table, the first hash code block can be written into that first mapping table.
S40424, generate second mapping tables.
S40425, establish in the first mapping table the mapping relationship between the first hash code block and the second mapping tables.
Generate m second mapping tables and, in the first mapping table, with the first hash code block as the key and the second mapping tables as the value, establish the mapping relationship between the first hash code block and the second mapping tables.
S40426, write the second hash code block into the second mapping table that matches the second hash code block.
Determine the position of the second hash code blocks stored by each second mapping table, and write the second hash code block currently at that position into that second mapping table.
S40427, establish in the second mapping table the mapping relationship between the second hash code block and the hash code of the target multimedia file.
In the second mapping table, with the second hash code block as the key and the hash code of the target multimedia file to which the second hash code block belongs as the value, establish the mapping relationship between the second hash code block and the hash code of the target multimedia file; multiple hash codes can form a hash code set.
In this embodiment of the present invention, when the first hash code block is not found, the first hash code block is written into the corresponding first mapping table, second mapping tables are generated, the second hash code blocks are written into the corresponding second mapping tables, and the target hash code is recorded. The first mapping tables and second mapping tables are thus continuously updated while reference multimedia files are being searched for, which keeps the recorded reference multimedia files comprehensive.
In yet another case, S404 includes the following steps:
S40431, determine the first mapping table that corresponds to the first hash code block.
S40432, look up the first hash code block in the first mapping table.
S40433, if the first hash code block is found, determine in the first mapping table the second mapping tables mapped by the first hash code block.
S40434, if the second hash code block is not found, write the second hash code block into the second mapping table that matches the second hash code block.
S40435, establish in the second mapping table the mapping relationship between the second hash code block and the hash code of the target multimedia file.
If the second hash code block is not found in the second mapping table that matches it, the second hash code block can be written into that second mapping table.
In the second mapping table, with the second hash code block as the key and the hash code of the target multimedia file to which the second hash code block belongs as the value, establish the mapping relationship between the second hash code block and the hash code of the target multimedia file; multiple hash codes can form a hash code set.
In this embodiment of the present invention, when the second hash code block is not found, the second hash code block is written into the corresponding second mapping table and the target hash code is recorded. The second mapping tables are thus continuously updated while reference multimedia files are being searched for, which keeps the recorded reference multimedia files comprehensive.
Of course, the above index hash code and the way of searching for reference multimedia files are only examples. When implementing the embodiments of the present invention, other index hash codes and search methods may be set according to actual conditions. For example, a mapping relationship between the second hash code block and third mapping tables is established in the second mapping table; after one of the second hash code blocks is removed, the remaining second hash code blocks are split into r third hash code blocks (r being a positive integer), and a mapping relationship between the corresponding third hash code block and the hash code is established in the third mapping table, and so on. The embodiments of the present invention do not limit this. In addition, those skilled in the art may also adopt other index hash codes and search methods according to actual needs, and the embodiments of the present invention do not limit this either.
S405, calculate the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file.
S406, if the distance is less than a preset target threshold, determine that the target multimedia file matches the reference multimedia file.
Where the first hash code blocks and second hash code blocks are used to compare hash codes, the target threshold is less than the number of first hash code blocks.
Furthermore, assume the length of the hash code is t*n and every t consecutive bits form one first hash code block, so that the hash code can be split into n first hash code blocks. If the distance between two hash codes is less than n, the two hash codes have at least one first hash code block at the same position whose distance is 0.
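This property follows from the pigeonhole principle: fewer than n differing bits can fall into at most n-1 of the n blocks, so at least one block is untouched. The sketch below, with a hypothetical function name, lists the positions of the first hash code blocks that two codes share.

```python
def identical_block_positions(a, b, n):
    """Zero-based positions of the first hash code blocks that a and b share."""
    t = len(a) // n
    return [j for j in range(n)
            if a[j * t:(j + 1) * t] == b[j * t:(j + 1) * t]]
```

For example, `"000000000000"` and `"100010000000"` differ in 2 bits; with n = 4 the blocks at positions 2 and 3 are identical, so the index lookup on first hash code blocks is guaranteed to reach the pair.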
Of course, for other ways of comparing hash codes, other target thresholds may be set; this embodiment does not limit this.
In this embodiment of the present invention, part of the hash code of the target multimedia file is extracted as an index hash code, and multimedia files whose hash codes are identical to the index hash code are searched for as reference multimedia files. At the cost of redundancy, this narrows the comparison range of hash codes, greatly reduces the number of hash code comparisons, and effectively reduces comparison time, making it suitable for large-scale and ultra-large-scale hash comparison.
Further, if the number of first hash code blocks is n and the number of second hash code blocks is m, then at the cost of n*m-fold redundancy, the average comparison range is reduced to 1/(n*m) of a full comparison.
In addition, if the distance is greater than 0, the mapping relationship between the second hash code block and the hash code of the target multimedia file may be established in the second mapping table.
That is, in the second mapping table, with the second hash code block as the key and the hash code of the target multimedia file to which the second hash code block belongs as the value, the mapping relationship between the second hash code block and the hash code of the target multimedia file is established.
If the distance is equal to 0, the hash code of the target multimedia file is identical to the hash code of the reference multimedia file, and the hash code of the target multimedia file can be ignored.
In this embodiment of the present invention, when the distance is greater than 0, the target hash code is recorded, so that the second mapping tables are continuously updated while reference multimedia files are being searched for, which keeps the recorded reference multimedia files comprehensive.
Embodiment five
Fig. 6 is a schematic structural diagram of a hash code generating apparatus provided by embodiment five of the present invention. The apparatus may specifically include the following modules:
a target audio signal receiving unit 601, configured to receive a target audio signal;
a sample audio signal acquisition unit 602, configured to collect a sample audio signal from the target audio signal;
a spectrum signal conversion unit 603, configured to convert the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
a difference generation unit 604, configured to generate, according to the differences between the amplitudes, a hash code that identifies the target audio signal as a whole.
In one embodiment of the present invention, the sample audio signal acquisition unit 602 includes:
a sampling subunit, configured to sample the target audio signal to obtain a sample audio signal having target parameters;
a quantity counting subunit, configured to count the quantity of the sample audio signal;
an audio signal appending subunit, configured to, if the quantity is less than a preset quantity threshold, append a specified audio signal after the sample audio signal to form a new sample audio signal;
a window function adding subunit, configured to apply a window function to the sample audio signal sampled from the target audio signal;
wherein the target parameters include at least one of the following:
frequency, mono.
In one embodiment of the present invention, the spectrum signal conversion unit 603 includes:
a transformation subunit, configured to transform the sample audio signal into a spectrum signal, the spectrum signal having frequency bins, each frequency bin having a frequency and an amplitude, the frequencies being linearly related;
a conversion subunit, configured to convert the spectrum signal so that the frequencies are logarithmically related.
In one embodiment of the present invention, the difference generation unit 604 includes:
a first difference subunit, configured to perform difference processing on the amplitudes to obtain first signal difference values;
a first binarization unit, configured to binarize the first signal difference values to obtain the hash code.
In one example of this embodiment of the present invention, the length of the first signal difference values is one less than the number of amplitudes;
the first difference subunit is further configured to:
assign the difference between the amplitude at the current position and the amplitude at the next position to the first signal difference value at the current position.
In one example of this embodiment of the present invention, the first binarization unit is further configured to:
if a first signal difference value is greater than 0, determine the corresponding hash code bit to be 1;
if a first signal difference value is less than or equal to 0, determine the corresponding hash code bit to be 0.
In one embodiment of the present invention, the difference generation unit 604 further includes:
an attribute determination subunit, configured to determine an attribute of the sample audio signal;
a validity determination subunit, configured to, if the attribute is a non-silence and non-noise signal, determine that the hash code is valid;
an invalidity determination subunit, configured to, if the attribute is a silence signal or a noise signal, determine that the hash code is invalid.
In one example of this embodiment of the present invention, the attribute determination subunit is further configured to:
calculate the distance between the hash code and 0;
if the distance is greater than or equal to a preset distance threshold, determine that the attribute of the sample audio signal is a non-silence and non-noise signal;
if the distance is less than the preset distance threshold, determine that the attribute of the sample audio signal is a silence signal or a noise signal.
In one embodiment of the present invention, the difference generation unit 604 further includes:
a second difference subunit, configured to perform difference processing on the first signal difference values to obtain second signal difference values;
a second binarization unit, configured to binarize the second signal difference values to obtain a new hash code.
In one example of this embodiment of the present invention, the length of the first signal difference values is the same as the length of the second signal difference values;
the second difference subunit is further configured to:
if the current position of the second signal difference values is not the last position, assign the difference between the first signal difference value at the current position and the first signal difference value at the next position to the second signal difference value at the current position;
if the current position of the second signal difference values is the last position, assign the first signal difference value at the current position to the second signal difference value at the current position.
In one example of this embodiment of the present invention, the second binarization unit is further configured to:
if a second signal difference value is greater than 0, determine the corresponding bit of the new hash code to be 1;
if a second signal difference value is less than or equal to 0, determine the corresponding bit of the new hash code to be 0.
The hash code generating apparatus provided by this embodiment of the present invention can execute the hash code generation method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Embodiment six
Fig. 7 is a schematic structural diagram of a multimedia file matching apparatus provided by embodiment six of the present invention. The apparatus may specifically include the following modules:
a target multimedia file determination module 701, configured to determine a target multimedia file, the target multimedia file having a target audio signal;
a hash code generation module 702, configured to generate a hash code that identifies the target audio signal as a whole;
a reference multimedia file determination module 703, configured to determine a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code that identifies the reference audio signal as a whole;
a file distance calculation module 704, configured to calculate the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
a file match determination module 705, configured to, if the distance is less than a preset target threshold, determine that the target multimedia file matches the reference multimedia file.
In one embodiment of the invention, Hash codes generation module 702 includes:
Sample audio signal acquisition unit, for the collecting sample audio signal from target audio signal;
Spectrum signal converting unit has amplitude in spectrum signal for sample audio signal to be converted to spectrum signal;
Difference generation unit, for generating the Hash codes of mark target audio signal entirety according to the difference between amplitude.
In one embodiment of the invention, include: with reference to multimedia file determining module 703
Hash codes extraction unit is indexed, for extracting the part Hash codes of destination multimedia file, as index Hash codes;
Hash codes searching unit, for searching Hash codes multimedia file identical with index Hash codes, as with reference to more
Media file.
In one embodiment of the present invention, the index hash code includes first hash code blocks and second hash code blocks, and the target threshold is less than the number of first hash code blocks;
the index hash code extraction unit includes:
a first cutting subunit, configured to cut the hash code of the target multimedia file into n segments, as the first hash code blocks;
a second cutting subunit, configured to cut the hash code other than the first hash code block into m segments, as the second hash code blocks.
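One plausible reading of the condition that the target threshold is less than the number of first hash code blocks is a pigeonhole argument: if two hash codes differ in fewer than n positions and both are cut into n segments at the same boundaries, at least one segment must be identical, so an exact lookup on segments cannot miss a candidate within the threshold. The text does not state this rationale, and the way the second blocks are derived below (cutting the bits left over after removing one first block) is likewise an assumption. All names are hypothetical.

```python
def split_blocks(bits: str, parts: int):
    """Cut a bit string into `parts` contiguous segments of near-equal length."""
    base, extra = divmod(len(bits), parts)
    blocks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        blocks.append(bits[start:end])
        start = end
    return blocks


def index_blocks(hash_code: str, n: int, m: int):
    """Return (position, first block, second blocks) for each of the n segments.

    Assumption: the second blocks come from cutting the bits that remain
    after removing one first block into m segments.
    """
    result, offset = [], 0
    for i, first in enumerate(split_blocks(hash_code, n)):
        rest = hash_code[:offset] + hash_code[offset + len(first):]
        result.append((i, first, split_blocks(rest, m)))
        offset += len(first)
    return result


target = "1011001110001111"
near = "1011001010001101"  # Hamming distance 2 from target
n = 4                      # so the target threshold must be below 4

# Segments that survive unchanged and can therefore serve as exact index keys.
shared = [i for i, (a, b) in enumerate(zip(split_blocks(target, n),
                                           split_blocks(near, n))) if a == b]
print(shared)                         # [0, 2]
print(index_blocks(target, n, 2)[0])  # (0, '1011', ['001110', '001111'])
```

With a distance of 2 and four segments, at most two segments can be disturbed, which is why two of them still match exactly.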
In one embodiment of the present invention, the hash code search unit includes:
a first mapping table determining subunit, configured to determine a first mapping table adapted to the first hash code block;
a first hash code block search subunit, configured to search for the first hash code block in the first mapping table;
a second mapping table determining subunit, configured to determine, if the first hash code block is found, the second mapping table to which the first hash code block is mapped in the first mapping table;
a second hash code block search subunit, configured to search for the second hash code block in the second mapping table that matches the second hash code block;
a hash code extraction subunit, configured to extract, if the second hash code block is found, the hash code to which the second hash code block is mapped in the first mapping table;
an attribution determining subunit, configured to determine that the multimedia file to which the hash code belongs is the reference multimedia file.
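The read path through the two mapping tables might look like the sketch below. It assumes nested dictionaries, a single first mapping table rather than one per block position, and, consistent with the mapping relations that the text establishes when writing, that the second mapping table is the one holding the full hash codes. The `owners` table and the file names are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout (all names are illustrative, not from the source):
#   first_table[first_block]   -> second mapping table
#   second_table[second_block] -> full hash codes filed under that block
#   owners[hash_code]          -> identifier of the multimedia file
SecondTable = Dict[str, List[str]]
FirstTable = Dict[str, SecondTable]


def find_reference_files(first_table: FirstTable, owners: Dict[str, str],
                         first_block: str, second_block: str) -> List[str]:
    """Walk first table -> second table -> hash codes -> owning files."""
    second_table = first_table.get(first_block)
    if second_table is None:   # first hash code block not found
        return []
    hashes = second_table.get(second_block)
    if hashes is None:         # second hash code block not found
        return []
    return [owners[h] for h in hashes]


first_table: FirstTable = {
    "1011": {"001110": ["1011001110001111"], "000000": ["1011000000110101"]},
}
owners = {"1011001110001111": "song_a.mp3", "1011000000110101": "song_b.mp3"}

print(find_reference_files(first_table, owners, "1011", "001110"))  # ['song_a.mp3']
print(find_reference_files(first_table, owners, "0000", "001110"))  # []
```

Each candidate returned this way would still be checked against the target threshold, since sharing two blocks does not by itself bound the distance between the full hash codes.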
In one embodiment of the present invention, the hash code search unit further includes:
a first block writing subunit, configured to write the first hash code block into the first mapping table if the first hash code block is not found;
a first generating subunit, configured to generate a second mapping table;
a first mapping relation establishing subunit, configured to establish, in the first mapping table, a mapping relation between the first hash code block and the second mapping table;
a second block writing subunit, configured to write the second hash code block into the second mapping table that matches the second hash code block;
a second mapping relation establishing subunit, configured to establish, in the second mapping table, a mapping relation between the second hash code block and the hash code of the target multimedia file.
In one embodiment of the present invention, the hash code search unit further includes:
a third block writing subunit, configured to write, if the second hash code block is not found, the second hash code block into the second mapping table that matches the second hash code block;
a third mapping relation establishing subunit, configured to establish, in the second mapping table, a mapping relation between the second hash code block and the hash code of the target multimedia file.
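The two miss cases described above (first hash code block not found, or first block found but second block not found) can be sketched as a single write routine. As before, nested dictionaries and a single first mapping table are assumptions, and the function name and return strings are hypothetical labels for illustration.

```python
def register(first_table: dict, first_block: str,
             second_block: str, hash_code: str) -> str:
    """File hash_code under (first_block, second_block); report the case hit."""
    second_table = first_table.get(first_block)
    if second_table is None:
        # First block unknown: write it, create its second table, link the two,
        # then file the second block and the full hash code in the new table.
        second_table = first_table[first_block] = {}
        second_table[second_block] = [hash_code]
        return "first block added"
    if second_block not in second_table:
        # First block known, second block unknown: only the second table changes.
        second_table[second_block] = [hash_code]
        return "second block added"
    return "already indexed"


table = {}
results = [
    register(table, "1011", "001110", "1011001110001111"),
    register(table, "1011", "000000", "1011000000110101"),
    register(table, "1011", "001110", "1011001110001111"),
]
print(results)  # ['first block added', 'second block added', 'already indexed']
```

Writing on a miss means that the index grows as new files are processed, so a later file with the same blocks finds the earlier one as a candidate.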
In one embodiment of the present invention, the device further includes:
a mapping module, configured to establish, if the distance is greater than 0, a mapping relation in the second mapping table between the second hash code block and the hash code of the target multimedia file.
The multimedia file matching device provided by this embodiment of the present invention can perform the multimedia file matching method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the performed method.
Embodiment seven
Fig. 8 is a schematic structural diagram of an electronic device provided by Embodiment seven of the present invention. As shown in Fig. 8, the electronic device includes a processor 800, a memory 801, a communication module 802, an input device 803 and an output device 804. The electronic device may have one or more processors 800; Fig. 8 takes one processor 800 as an example. The processor 800, memory 801, communication module 802, input device 803 and output device 804 in the electronic device may be connected by a bus or in other ways; Fig. 8 takes connection by a bus as an example.
As a computer-readable storage medium, the memory 801 can be used to store software programs, computer-executable programs and modules, such as the modules corresponding to the hash code generation method in this embodiment (for example, the target audio signal receiving unit 601, sample audio signal acquisition unit 602, spectrum signal conversion unit 603 and difference generation unit 604 in the hash code generation device shown in Fig. 6), or the modules corresponding to the multimedia file matching method (for example, the target multimedia file determining module 701, hash code generation module 702, reference multimedia file determining module 703, file distance calculation module 704 and file match determining module 705 in the multimedia file matching device shown in Fig. 7). By running the software programs, instructions and modules stored in the memory 801, the processor 800 performs the various functional applications and data processing of the electronic device, that is, implements the above hash code generation method or multimedia file matching method.
The memory 801 may mainly include a program storage area and a data storage area, where the program storage area can store an operating system and the application programs required by at least one function, and the data storage area can store data created through use of the electronic device, and so on. In addition, the memory 801 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some examples, the memory 801 may further include memory located remotely from the processor 800, and such remote memory can be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network and combinations thereof.
The communication module 802 is configured to establish a connection with a display screen and to exchange data with the display screen. The input device 803 can be used to receive input numeric or character information and to generate key signal inputs related to the user settings and function control of the electronic device.
The electronic device provided by this embodiment can perform the hash code generation method or the multimedia file matching method provided by any embodiment of the present invention, and has the corresponding functions and beneficial effects.
Embodiment eight
Embodiment eight of the present invention further provides a computer-readable storage medium on which a computer program is stored.
In one case, when executed by a processor, the program implements a hash code generation method, the method including:
receiving a target audio signal;
acquiring a sample audio signal from the target audio signal;
converting the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole.
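The claims refine the last step: a hash code whose distance from 0 falls below a distance threshold is attributed to silence or noise and treated as invalid, and a new hash code is then derived from a second difference of the first signal difference values. The sketch below follows those rules, with one assumption labeled in the code: the distance from 0 is taken to be the number of 1 bits. The amplitude values and the threshold are hypothetical.

```python
def first_difference(amplitudes):
    """d1[i] = amplitude[i] - amplitude[i + 1]; one value fewer than the input."""
    return [cur - nxt for cur, nxt in zip(amplitudes, amplitudes[1:])]


def second_difference(first):
    """Same length as d1: d2[i] = d1[i] - d1[i + 1], and the last d2 copies d1."""
    return [cur - nxt for cur, nxt in zip(first, first[1:])] + first[-1:]


def binarize(values):
    """Map each value to 1 if it is greater than 0, otherwise to 0."""
    return "".join("1" if v > 0 else "0" for v in values)


def generate_hash(amplitudes, distance_threshold):
    first = first_difference(amplitudes)
    hash_code = binarize(first)
    # Assumption: the distance between the hash code and 0 is its count of 1 bits.
    if hash_code.count("1") >= distance_threshold:
        return hash_code                       # valid: not silence, not noise
    return binarize(second_difference(first))  # invalid: derive a new hash code


print(generate_hash([5.0, 3.0, 4.0, 1.0], distance_threshold=1))       # 101
print(generate_hash([1.0, 2.0, 4.0, 7.0, 7.5], distance_threshold=1))  # 1100
```

In the second call the amplitudes rise monotonically, so the first-order hash code is all zeros and the second difference supplies the bits instead.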
In another case, when executed by a processor, the program implements a multimedia file matching method, the method including:
determining a target multimedia file, the target multimedia file having a target audio signal;
generating a hash code identifying the target audio signal as a whole;
determining a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code identifying the reference audio signal as a whole;
calculating the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
if the distance is less than a preset target threshold, determining that the target multimedia file matches the reference multimedia file.
Of course, for the computer-readable storage medium provided by this embodiment of the present invention, the computer program is not limited to the method operations described above, and can also perform related operations in the hash code generation method or the multimedia file matching method provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, and of course also by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
It is worth noting that, in the above embodiments of the hash code generation device and the multimedia file matching device, the units and modules included are divided only according to functional logic, and the division is not limited to the one described as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.
Claims (23)
1. A hash code generation method, characterized by comprising:
receiving a target audio signal;
acquiring a sample audio signal from the target audio signal;
converting the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole.
2. The method according to claim 1, characterized in that acquiring a sample audio signal from the target audio signal comprises:
sampling the target audio signal to obtain a sample audio signal having a target parameter;
counting the quantity of the sample audio signal;
if the quantity is less than a preset quantity threshold, appending a specified audio signal after the sample audio signal, as a new sample audio signal;
applying a window function to the sample audio signal sampled from the target audio signal;
wherein the target parameter includes at least one of the following:
frequency, mono.
3. The method according to claim 1, characterized in that converting the sample audio signal into a spectrum signal comprises:
transforming the sample audio signal into a spectrum signal, the spectrum signal having frequency points, each frequency point having a frequency and an amplitude, the frequencies being linearly related;
converting the spectrum signal so that the frequencies are logarithmically related.
4. The method according to claim 1, 2 or 3, characterized in that generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole comprises:
performing difference processing on the amplitudes to obtain first signal difference values;
binarizing the first signal difference values to obtain the hash code.
5. The method according to claim 4, characterized in that the length of the first signal difference values is one less than the number of amplitudes;
performing difference processing on the amplitudes to obtain first signal difference values comprises:
assigning the difference between the amplitude at the current position and the amplitude at the next position to the first signal difference value at the current position.
6. The method according to claim 4, characterized in that binarizing the first signal difference values to obtain the hash code comprises:
if the first signal difference value is greater than 0, determining that the hash code is 1;
if the first signal difference value is less than or equal to 0, determining that the hash code is 0.
7. The method according to claim 4, characterized in that generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole further comprises:
determining an attribute of the sample audio signal;
if the attribute is a non-silent and non-noise signal, determining that the hash code is valid;
if the attribute is a silent signal or a noise signal, determining that the hash code is invalid.
8. The method according to claim 7, characterized in that determining the attribute of the sample audio signal comprises:
calculating the distance between the hash code and 0;
if the distance is greater than or equal to a preset distance threshold, determining that the attribute of the sample audio signal is a non-silent and non-noise signal;
if the distance is less than the preset distance threshold, determining that the attribute of the sample audio signal is a silent signal or a noise signal.
9. The method according to claim 7 or 8, characterized in that, after determining that the hash code is invalid, generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole further comprises:
performing difference processing on the first signal difference values to obtain second signal difference values;
binarizing the second signal difference values to obtain a new hash code.
10. The method according to claim 9, characterized in that the length of the first signal difference values is the same as the length of the second signal difference values;
performing difference processing on the first signal difference values to obtain second signal difference values comprises:
if the second signal difference value at the current position is not at the last position, assigning the difference between the first signal difference value at the current position and the first signal difference value at the next position to the second signal difference value at the current position;
if the second signal difference value at the current position is at the last position, assigning the first signal difference value at the current position to the second signal difference value at the current position.
11. The method according to claim 9, characterized in that binarizing the second signal difference values to obtain a new hash code comprises:
if the second signal difference value is greater than 0, determining that the new hash code is 1;
if the second signal difference value is less than or equal to 0, determining that the new hash code is 0.
12. A multimedia file matching method, characterized by comprising:
determining a target multimedia file, the target multimedia file having a target audio signal;
generating a hash code identifying the target audio signal as a whole;
determining a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code identifying the reference audio signal as a whole;
calculating the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
if the distance is less than a preset target threshold, determining that the target multimedia file matches the reference multimedia file.
13. The method according to claim 12, characterized in that generating a hash code identifying the target audio signal as a whole comprises:
acquiring a sample audio signal from the target audio signal;
converting the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
generating, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole.
14. The method according to claim 12, characterized in that determining a reference multimedia file comprises:
extracting part of the hash code of the target multimedia file as an index hash code;
searching for a multimedia file whose hash code is identical to the index hash code, as the reference multimedia file.
15. The method according to claim 14, characterized in that the index hash code includes first hash code blocks and second hash code blocks, and the target threshold is less than the number of first hash code blocks;
extracting part of the hash code of the target multimedia file as an index hash code comprises:
cutting the hash code of the target multimedia file into n segments, as the first hash code blocks;
cutting the hash code other than the first hash code block into m segments, as the second hash code blocks.
16. The method according to claim 14, characterized in that the index hash code includes a first hash code block and a second hash code block, and searching for a multimedia file whose hash code is identical to the index hash code, as the reference multimedia file, comprises:
determining a first mapping table adapted to the first hash code block;
searching for the first hash code block in the first mapping table;
if the first hash code block is found, determining the second mapping table to which the first hash code block is mapped in the first mapping table;
searching for the second hash code block in the second mapping table that matches the second hash code block;
if the second hash code block is found, extracting the hash code to which the second hash code block is mapped in the first mapping table;
determining that the multimedia file to which the hash code belongs is the reference multimedia file.
17. The method according to claim 16, characterized in that searching for a multimedia file whose hash code is identical to the index hash code, as the reference multimedia file, further comprises:
if the first hash code block is not found, writing the first hash code block into the first mapping table;
generating a second mapping table;
establishing, in the first mapping table, a mapping relation between the first hash code block and the second mapping table;
writing the second hash code block into the second mapping table that matches the second hash code block;
establishing, in the second mapping table, a mapping relation between the second hash code block and the hash code of the target multimedia file.
18. The method according to claim 16, characterized in that searching for a multimedia file whose hash code is identical to the index hash code, as the reference multimedia file, further comprises:
if the second hash code block is not found, writing the second hash code block into the second mapping table that matches the second hash code block;
establishing, in the second mapping table, a mapping relation between the second hash code block and the hash code of the target multimedia file.
19. The method according to claim 16, characterized in that, after determining, if the distance is less than the preset target threshold, that the target multimedia file matches the reference multimedia file, the method further comprises:
if the distance is greater than 0, establishing, in the second mapping table, a mapping relation between the second hash code block and the hash code of the target multimedia file.
20. A hash code generation device, characterized by comprising:
a target audio signal receiving unit, configured to receive a target audio signal;
a sample audio signal acquisition unit, configured to acquire a sample audio signal from the target audio signal;
a spectrum signal conversion unit, configured to convert the sample audio signal into a spectrum signal, the spectrum signal having amplitudes;
a difference generation unit, configured to generate, according to the differences between the amplitudes, a hash code identifying the target audio signal as a whole.
21. A multimedia file matching device, characterized by comprising:
a target multimedia file determining module, configured to determine a target multimedia file, the target multimedia file having a target audio signal;
a hash code generation module, configured to generate a hash code identifying the target audio signal as a whole;
a reference multimedia file determining module, configured to determine a reference multimedia file, the reference multimedia file having a reference audio signal and being associated with a hash code identifying the reference audio signal as a whole;
a file distance calculation module, configured to calculate the distance between the hash code of the target multimedia file and the hash code of the reference multimedia file;
a file match determining module, configured to determine, if the distance is less than a preset target threshold, that the target multimedia file matches the reference multimedia file.
22. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory, configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the hash code generation method according to any one of claims 1-11 or the multimedia file matching method according to any one of claims 12-19.
23. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the hash code generation method according to any one of claims 1-11 or the multimedia file matching method according to any one of claims 12-19.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910656276.8A CN110400578B (en) | 2019-07-19 | 2019-07-19 | Hash code generation and matching method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110400578A true CN110400578A (en) | 2019-11-01 |
CN110400578B CN110400578B (en) | 2022-05-17 |
Family
ID=68324629
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910656276.8A Active CN110400578B (en) | 2019-07-19 | 2019-07-19 | Hash code generation and matching method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110400578B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106782575A (en) * | 2011-06-01 | 2017-05-31 | 三星电子株式会社 | Audio coding method and equipment, audio-frequency decoding method and equipment |
US20140025386A1 (en) * | 2012-07-20 | 2014-01-23 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for audio object clustering |
CN103581023A (en) * | 2013-11-06 | 2014-02-12 | 盛科网络(苏州)有限公司 | Method and device for realizing longest mask matching |
CN107578452A (en) * | 2017-07-31 | 2018-01-12 | 华南理工大学 | A kind of jpeg image encryption method with compatible format and constant size |
CN108763492A (en) * | 2018-05-29 | 2018-11-06 | 四川远鉴科技有限公司 | A kind of audio template extracting method and device |
CN108962239A (en) * | 2018-06-08 | 2018-12-07 | 四川斐讯信息技术有限公司 | A kind of quick distribution method and system based on voice masking |
Non-Patent Citations (1)
Title |
---|
李建松等: "《地理信息系统原理》", 31 January 2015, 武汉大学出版社 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112802494A (en) * | 2021-04-12 | 2021-05-14 | 北京世纪好未来教育科技有限公司 | Voice evaluation method, device, computer equipment and medium |
CN112802494B (en) * | 2021-04-12 | 2021-07-16 | 北京世纪好未来教育科技有限公司 | Voice evaluation method, device, computer equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN110400578B (en) | 2022-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6820058B2 (en) | Speech recognition methods, devices, devices, and storage media | |
CN112115706B (en) | Text processing method and device, electronic equipment and medium | |
US10261965B2 (en) | Audio generation method, server, and storage medium | |
Haitsma et al. | A highly robust audio fingerprinting system with an efficient search strategy | |
Haitsma et al. | A highly robust audio fingerprinting system. | |
US20180374491A1 (en) | Systems and Methods for Recognizing Sound and Music Signals in High Noise and Distortion | |
CN107293307B (en) | Audio detection method and device | |
EP3255633B1 (en) | Audio content recognition method and device | |
CN108304424B (en) | Text keyword extraction method and text keyword extraction device | |
CN113254620B (en) | Response method, device and equipment based on graph neural network and storage medium | |
CN110209809B (en) | Text clustering method and device, storage medium and electronic device | |
CN106959976B (en) | Search processing method and device | |
CN106713111B (en) | Processing method for adding friends, terminal and server | |
CN112053691A (en) | Conference assisting method and device, electronic equipment and storage medium | |
CN111079386A (en) | Address recognition method, device, equipment and storage medium | |
US20080091427A1 (en) | Hierarchical word indexes used for efficient N-gram storage | |
CN110400578A (en) | The generation of Hash codes and its matching process, device, electronic equipment and storage medium | |
CN1987852A (en) | Method and device for determining communication object attribute according to news content | |
CN106782612B (en) | reverse popping detection method and device | |
CN103247316B (en) | The method and system of index building in a kind of audio retrieval | |
CN107918606B (en) | Method and device for identifying avatar nouns and computer readable storage medium | |
CN115691503A (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN110517671B (en) | Audio information evaluation method and device and storage medium | |
CN114138986A (en) | Customer management platform with enhanced content and method thereof | |
CN110322883B (en) | Voice-to-text effect evaluation optimization method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20221202. Address after: 31a, 15 / F, building 30, maple mall, bangrang Road, Brazil, Singapore. Patentee after: Baiguoyuan Technology (Singapore) Co.,Ltd. Address before: 511400 floor 5-13, West Tower, building C, 274 Xingtai Road, Shiqiao street, Panyu District, Guangzhou City, Guangdong Province. Patentee before: GUANGZHOU BAIGUOYUAN INFORMATION TECHNOLOGY Co.,Ltd. |