CN112420023B - Music infringement detection method - Google Patents
- Publication number: CN112420023B
- Application number: CN202011352226.XA
- Authority
- CN
- China
- Prior art keywords
- music
- vectors
- frequency spectrum
- library
- spectrum signal
- Prior art date
- Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G10L15/02 — Feature extraction for speech recognition; selection of recognition unit
- G06F21/10 — Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; digital rights management [DRM]
- G06F21/12 — Protecting executable software
- G10L15/16 — Speech classification or search using artificial neural networks
Abstract
The invention relates to a music infringement detection method comprising the following steps: S1: sequentially carrying out a short-time Fourier transform on each piece of music in the music library to obtain the frequency spectrum signal corresponding to its music ID; S2: performing dynamic resolution compression on the frequency spectrum signal; S3: calculating the extreme points of each frequency band interval from the compressed frequency spectrum signal; S4: filtering the extreme points and subtracting them pairwise to obtain the music vectors of the music library; S5: compressing the music vectors of the music library bit-wise into int32; S6: establishing a hash table with int32 as Key and music ID as Value, where music IDs mark each piece of music incrementally in the order in which music enters the library; S7: inputting training audio to obtain an infringement probability; S8: inputting test audio to obtain an infringement probability. The method extracts features from the spectral information of the music through a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, improving both the accuracy and the efficiency of detection.
Description
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a music infringement detection method.
Background
The spread of the internet has made music widely popular: people can conveniently listen to music and use it in various ways to create videos. Music is copyrighted, however, and using it without authorization in commercial video causes infringement problems and harms the rights and interests of music creators.
The patent with publication number CN101493918A discloses an online music piracy monitoring method and system, which comprises, in sequence, the following steps: an audio fingerprint extraction module obtains an audio download address from the internet; the audio fingerprint extraction module reads the audio file from that address and processes it to obtain an audio fingerprint; a monitoring analysis module compares this fingerprint with the audio fingerprint of the genuine audio file; if the comparison result exceeds a set threshold, an infringement positioning module further obtains information on the suspected infringer and issues a warning. Compared with the prior art, the technical effect of that invention is: through technical means such as web crawlers, audio fingerprint extraction, and feature code extraction, networked digital music resources are effectively monitored and evidence of infringing behavior is collected and warnings issued; the whole process is automatic, greatly saving cost and time and ensuring timely rights enforcement.
Although the above patent can judge music infringement, neither the accuracy nor the performance of its infringement detection can be guaranteed.
Disclosure of Invention
To solve these problems, the invention provides a music infringement detection method that, on the basis of judging infringement, can greatly improve the accuracy and performance of infringement detection through deep learning.
The technical scheme of the invention is as follows:
a music infringement detection method comprises the following steps:
s1: carrying out short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
s2: performing dynamic resolution compression on the frequency spectrum signal;
s3: calculating an extreme point of each frequency band interval according to the compressed frequency spectrum signal;
s4: filtering the extreme points, and subtracting every two extreme points to obtain music vectors of the music library;
s5: compressing music vectors of a music library into int32 according to bits;
s6: repeating the steps S1-S5 for all music in the music library, constructing a hash table with int32 as Key and music ID as Value, wherein music IDs mark each piece of music incrementally in the order in which music enters the library;
s7: inputting training audio, acquiring a frequency spectrum signal of the training audio by using short-time Fourier transform, repeating the steps S2-S5, acquiring training music vectors, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain infringement probability;
s8: inputting a test audio, acquiring a frequency spectrum signal of the test audio by using short-time Fourier transform, repeating the steps S2-S5, acquiring a test music vector, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain the infringement probability.
Preferably, the specific process of the dynamic resolution compression in step S2 is as follows:
s2.1: forming a spectrogram by using the input Fourier transformed spectrum signal;
s2.2: vertically and uniformly dividing the spectrogram into a plurality of regions;
s2.3: performing feature extraction on the region in the step S2.2 through a convolutional neural network;
s2.4: judging whether each region contains useful features, and rejecting the regions that do not;
s2.5: and splicing the rest regions into a new spectrogram again.
Preferably, the convolutional neural network comprises six convolutional layers and three fully-connected layers, the convolutional layers comprise eight 1 × 1 convolutional kernels, two layers of the fully-connected layers comprise 1024 neurons, and one layer comprises 2 neurons.
Preferably, the calculation of the extreme points in step S3 is to find a maximum value and a minimum value in each frequency band interval.
Preferably, the filtering in step S4 comprises the following steps:
s4.1: screening all extreme points through a multilayer fully-connected neural network;
s4.2: eliminating the extreme point which is output as 0 after passing through the multilayer fully-connected neural network, and reserving the extreme point which is output as 1;
s4.3: and splicing the residual extreme points of different frequency bands and outputting.
Preferably, the fully-connected neural network comprises three layers, wherein each of the first and second layers comprises 1024 neurons, and the third layer comprises 2 neurons.
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
The beneficial effects of the invention are: features are extracted from the spectral information of the music through a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, improving both the accuracy and the efficiency of detection.
Drawings
Fig. 1 is a flowchart of a method provided in an embodiment of the present invention.
Fig. 2 is a detailed flowchart of dynamic resolution compression.
FIG. 3 is a flow chart of extreme point filtering.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present invention provides a music infringement detection method, which includes the following specific steps:
1. and carrying out short-time Fourier transform on the music in the music library to obtain a spectrogram.
Using the short-time Fourier transform, a sliding time window is applied to the input music signal and the signal inside each window is Fourier transformed, yielding the time-varying spectrum of the signal and thereby converting the time-domain signal into a frequency-domain signal.
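The sliding-window transform described above can be sketched in Python with NumPy; the window length, hop size, Hann window, and test tone below are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

def stft(signal, win_len=1024, hop=512):
    """Slide a time window over the signal and Fourier-transform each
    windowed frame, converting the time-domain signal into a
    time-varying (frequency-domain) spectrum."""
    window = np.hanning(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = signal[start:start + win_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))  # magnitude spectrum of one frame
    return np.array(frames)  # shape: (n_frames, win_len // 2 + 1)

# A 440 Hz tone sampled at 8 kHz stands in for a piece of library music.
sr = 8000
t = np.arange(sr) / sr
spectrogram = stft(np.sin(2 * np.pi * 440.0 * t))
```

Each row of `spectrogram` is the magnitude spectrum of one window position; the 440 Hz tone concentrates its energy near bin 440 × 1024 / 8000 ≈ 56.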
2. And performing dynamic resolution compression on the acquired spectrogram.
3. And calculating all extreme points of each frequency band interval.
4. Filtering the extreme points.
5. And obtaining vectors by pairwise subtraction of the extreme points.
In steps 3-5, each piece of music corresponds to one vector: specifically, for each frequency band the minimum value is subtracted from the maximum value to obtain one number, and the numbers obtained from all frequency bands are combined into the vector.
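A minimal sketch of this max-minus-min vector construction, assuming a compressed spectrum already organized into frequency-band intervals (the band and bin counts here are illustrative, not from the patent):

```python
import numpy as np

# Hypothetical compressed spectrum organized into 8 frequency-band
# intervals of 32 bins each (both counts are illustrative).
rng = np.random.default_rng(0)
band_spectrum = rng.standard_normal((8, 32))

# One number per band (maximum minus minimum), combined across bands
# into the vector representing this piece of music.
music_vector = band_spectrum.max(axis=1) - band_spectrum.min(axis=1)
```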
6. Repeating the steps 1-5 for each piece of music in the music library to obtain vectors corresponding to all pieces of music, compressing the vectors into int32 according to bits, and constructing a HashTable by taking int32 as Key and music ID as Value.
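The bit-wise compression into int32 and the HashTable construction in step 6 can be sketched as follows. The quantization rule (thresholding each component at the vector's median) and the four rotation-based library vectors are assumptions for illustration, since the patent does not specify how vector values map to bits:

```python
def vector_to_int32(vec):
    """Quantize a 32-component music vector to one bit per component
    (above/below the median) and pack the bits into a 32-bit key.
    The median-threshold rule is an assumption for illustration."""
    median = sorted(vec)[len(vec) // 2]
    key = 0
    for v in vec:
        key = (key << 1) | (1 if v > median else 0)
    return key & 0xFFFFFFFF  # keep the low 32 bits

# Four stand-in library vectors (rotations of 0..31, purely illustrative);
# music IDs mark entries incrementally in the order they enter the library.
library = {name: [(j + r) % 32 for j in range(32)]
           for r, name in enumerate(["A", "B", "C", "D"])}
hash_table = {}
for music_id, (name, vec) in enumerate(library.items(), start=1):
    hash_table[vector_to_int32(vec)] = music_id
```

Looking up a query vector's int32 key in `hash_table` then acts as the "collision" of steps 7 and 8.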
7. Inputting training audio, repeating steps 2-5 to obtain its vector, colliding it with the HashTable, ordering the collisions of each piece of music by time, calculating infringement probabilities, and comparing the infringement results with the labels to train the model.
8. Inputting test audio, repeating steps 2-5 to obtain its vector, colliding it with the HashTable, ordering the collisions of each piece of music by time, calculating infringement probabilities, and outputting whether infringement exists.
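The comparison-and-scoring part of steps 7 and 8 can be sketched as below. The squashing of Euclidean distance into (0, 1] via 1 / (1 + d) is an assumed normalization (the patent only says "normalizing"), and this toy example ranks by score rather than modeling the by-time ordering of collisions:

```python
import math

def infringement_probabilities(query_vec, library_vectors):
    """Compare the query vector with every library vector, squash the
    Euclidean distance into (0, 1] with 1 / (1 + d), and rank by score.
    Both the squashing function and the score ordering are assumptions."""
    results = []
    for music_id, vec in library_vectors.items():
        dist = math.dist(query_vec, vec)          # Euclidean distance
        results.append((music_id, 1.0 / (1.0 + dist)))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Toy 2-dimensional library vectors keyed by music ID.
library_vectors = {1: [0.0, 0.0], 2: [3.0, 4.0]}
ranked = infringement_probabilities([0.0, 0.0], library_vectors)
```

An exact match yields probability 1.0; larger distances decay toward 0, which matches the intuition that closer vectors are more likely to be infringing.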
As an embodiment of the present invention, as shown in fig. 2, the specific process in step 2 is:
2.1, inputting a spectrogram.
2.2, vertically dividing the spectrogram into a plurality of regions; in this embodiment the spectrogram is divided into 256 regions.
2.3, extracting features from the regions of step 2.2 through a convolutional neural network.
2.4, judging whether each region contains useful features, and rejecting the regions that do not.
2.5, splicing the remaining regions into a new spectrogram.
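The split-reject-splice procedure of steps 2.1-2.5 can be sketched as follows. The usefulness test here (above-mean region energy) is a stand-in for the patent's convolutional-network classifier, and the axis along which the spectrogram is "vertically" divided is an assumption:

```python
import numpy as np

def compress_spectrogram(spec, n_regions=8):
    """Split the spectrogram into n_regions equal vertical slices, reject
    slices without 'useful features', and splice the survivors together.
    The usefulness test below (above-mean energy) is a stand-in for the
    patent's convolutional-network classifier."""
    regions = np.array_split(spec, n_regions, axis=1)
    energies = [float(np.abs(r).sum()) for r in regions]
    mean_energy = sum(energies) / len(energies)
    kept = [r for r, e in zip(regions, energies) if e >= mean_energy]
    return np.concatenate(kept, axis=1)

# Toy spectrogram: energy only in the first 4 of 16 time columns,
# so only the first of four regions survives compression.
spec = np.zeros((4, 16))
spec[:, :4] = 1.0
compressed = compress_spectrogram(spec, n_regions=4)
```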
As an embodiment of the present invention, the calculation of the extreme point in step 3 is to find a maximum value and a minimum value, and the maximum value is calculated by the following formula:the calculation formula of the minimum value is as follows:。
as an embodiment of the present invention, the convolutional neural network includes six convolutional layers and three fully-connected layers, wherein the convolutional layers include eight convolutional cores of 1 × 1, two layers of the fully-connected layers include 1024 neurons, and one layer includes 2 neurons.
As an embodiment of the present invention, as shown in fig. 3, the specific process of filtering in step 4 is:
4.1, inputting an extreme point;
4.2, passing it through the fully connected neural network;
4.3, outputting whether the extreme point is retained;
4.4, splicing the remaining extreme points and outputting them.
As an embodiment of the present invention, the fully-connected neural network comprises three layers, wherein the first layer and the second layer each comprise 1024 neurons, and the third layer comprises 2 neurons.
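A forward pass through such a three-layer fully connected filter can be sketched in NumPy. The layer sizes (1024, 1024, 2) come from this embodiment, but the weights below are random stand-ins for trained parameters, and the 64-dimensional extreme-point input feature is an assumed size:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_layer(n_in, n_out):
    """Random stand-in weights; a real system would use trained parameters."""
    return rng.standard_normal((n_in, n_out)) * 0.05, np.zeros(n_out)

# Layer sizes from the embodiment: 1024, 1024, then 2 output neurons
# (retain vs. reject). The 64-dim input feature size is an assumption.
W1, b1 = make_layer(64, 1024)
W2, b2 = make_layer(1024, 1024)
W3, b3 = make_layer(1024, 2)

def keep_extreme_point(x):
    """Three-layer fully connected forward pass; output 1 means the
    extreme point is retained, output 0 means it is rejected."""
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU
    h = np.maximum(h @ W2 + b2, 0.0)
    logits = h @ W3 + b3
    return int(np.argmax(logits))

points = rng.standard_normal((5, 64))          # five candidate extreme points
kept = [p for p in points if keep_extreme_point(p) == 1]
```

The 2-neuron output layer makes the keep/reject decision described in steps 4.2-4.3; the retained points (`kept`) are what gets spliced and output in step 4.4.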
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
A practical example of the method is as follows. The music library contains music A, B, C, D with music IDs 1, 2, 3 and 4 respectively. Taking A as an example, a short-time Fourier transform is first applied to A, giving the spectrum signal [1.6393873e-05, -2.2720376e-05, -1.9727035e-05, …, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00]. Dynamic resolution compression is then applied to this spectrum signal by the convolutional neural network, giving [1.6393873e-05, -1.3622489e-05, -3.8468256e-05, …, 1.4637652e-05, -2.58741654e-05, -1.8945687e-05]. Maximum and minimum values are then found in the spectrum signal region by region, and extreme-value filtering is performed by the neural network, yielding the maximum-value sequence [1.6393873e-05, 2.9647521e-05, 3.7123548, …, 1.9647581e-05, 2.4874165e-05, 1.5512479e-05] and the minimum-value sequence [-1.3222547e-05, -1.39852657e-05, -3.7988510e-05, …, -1.3347891e-05, -2.6955249e-05, -2.58741654e-05]. Subtracting the two sequences gives the vector of library music A: [2.96164200e-05, 4.36327867e-05, 3.71239279e+00, …, 3.29954720e-05, 5.18294140e-05, 4.13866444e-05]. B, C and D are processed in the same way to obtain their corresponding vectors. The data in each vector are then compressed bit-wise into int32, and the following hash table is constructed:
Key | vector of music A | vector of music B | vector of music C | vector of music D
---|---|---|---|---
Value | 1 | 2 | 3 | 4
Then the vector of training music T is computed by the same method and collided with the table above: each library music vector is compared pairwise with T and the Euclidean distance is calculated, and the collision results (infringement probabilities) are ordered by the time at which the collisions occur, giving the probability that T infringes each piece of music in the library.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate rather than limit its technical solutions, and the scope of protection is not limited to them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features equivalently substituted, within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the invention and are intended to be covered by its scope of protection. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A music infringement detection method is characterized by comprising the following steps:
s1: carrying out short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
s2: performing dynamic resolution compression on the frequency spectrum signal;
s3: calculating an extreme point of each frequency band interval according to the compressed frequency spectrum signal;
s4: filtering the extreme points, and subtracting every two extreme points to obtain music vectors of the music library;
s5: compressing music vectors of a music library into int32 according to bits;
s6: repeating the steps S1-S5 aiming at all music in the music library, constructing a hash table by taking int32 as Key and music ID as Value, wherein the music ID progressively marks each piece of music according to the time sequence of the music entering the music library;
s7: inputting training audio, acquiring a frequency spectrum signal of the training audio by using short-time Fourier transform, repeating the steps S2-S5, acquiring training music vectors, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain infringement probability;
s8: inputting a test audio, acquiring a frequency spectrum signal of the test audio by using short-time Fourier transform, repeating the steps S2-S5 to acquire a test music vector, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain infringement probability; the specific process of dynamic resolution compression in step S2 is:
s2.1: forming a spectrogram by using the input Fourier transformed spectrum signal;
s2.2: vertically and uniformly dividing the spectrogram into a plurality of regions;
s2.3: performing feature extraction on the region in the step S2.2 through a convolutional neural network;
s2.4: judging whether the region belongs to useful features or not, and rejecting partial regions not containing the useful features;
s2.5: splicing the rest areas into a new spectrogram again;
the convolutional neural network comprises six convolutional layers and three fully-connected layers, wherein the convolutional layers comprise eight 1x1 convolutional kernels, two layers of the fully-connected layers comprise 1024 neurons, and one layer comprises 2 neurons.
3. The music infringement detection method of claim 1, wherein the filtering in step S4 comprises:
s4.1: screening all extreme points through a multilayer fully-connected neural network;
s4.2: eliminating the extreme point which is output as 0 after passing through the multilayer fully-connected neural network, and reserving the extreme point which is output as 1;
s4.3: and splicing the residual extreme points of different frequency bands and outputting.
4. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 3 when executing the computer program.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011352226.XA CN112420023B (en) | 2020-11-26 | 2020-11-26 | Music infringement detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112420023A CN112420023A (en) | 2021-02-26 |
CN112420023B true CN112420023B (en) | 2022-03-25 |
Family
ID=74843766
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101014953A (en) * | 2003-09-23 | 2007-08-08 | 音乐Ip公司 | Audio fingerprinting system and method |
CN101493918A (en) * | 2008-10-21 | 2009-07-29 | 深圳市牧笛科技有限公司 | On-line music pirate monitoring method and system |
CN104567674A (en) * | 2014-12-29 | 2015-04-29 | 北京理工大学 | Bilateral fitting confocal measuring method |
CN108899037A (en) * | 2018-07-05 | 2018-11-27 | 平安科技(深圳)有限公司 | Animal vocal print feature extracting method, device and electronic equipment |
CN109918539A (en) * | 2019-02-28 | 2019-06-21 | 华南理工大学 | A kind of mutual search method of sound, video for clicking behavior based on user |
CN111652177A (en) * | 2020-06-12 | 2020-09-11 | 中国计量大学 | Signal feature extraction method based on deep learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190385610A1 (en) * | 2017-12-08 | 2019-12-19 | Veritone, Inc. | Methods and systems for transcription |
EP3608918A1 (en) * | 2018-08-08 | 2020-02-12 | Tata Consultancy Services Limited | Parallel implementation of deep neural networks for classifying heart sound signals |
Non-Patent Citations (1)
Title |
---|
Research on Multimedia Perceptual Hashing Algorithms and Applications; Zhao Yuxin; Master's thesis; 2010-12-31; full text *
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |