CN112420023B - Music infringement detection method - Google Patents

Info

Publication number
CN112420023B
CN112420023B CN202011352226.XA CN202011352226A CN112420023B CN 112420023 B CN112420023 B CN 112420023B CN 202011352226 A CN202011352226 A CN 202011352226A CN 112420023 B CN112420023 B CN 112420023B
Authority
CN
China
Prior art keywords
music
vectors
frequency spectrum
library
spectrum signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011352226.XA
Other languages
Chinese (zh)
Other versions
CN112420023A (en)
Inventor
方煌锖
俞挺
张德华
黄静惠
柯登峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yindu Artificial Intelligence Co ltd
Original Assignee
Hangzhou Yindu Artificial Intelligence Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Yindu Artificial Intelligence Co ltd
Priority to CN202011352226.XA
Publication of CN112420023A
Application granted
Publication of CN112420023B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Abstract

The invention relates to a music infringement detection method comprising the following steps. S1: perform a short-time Fourier transform on each piece of music in the music library in turn to obtain the frequency spectrum signal corresponding to its music ID. S2: perform dynamic resolution compression on the frequency spectrum signal. S3: calculate the extreme points of each frequency band interval from the compressed frequency spectrum signal. S4: filter the extreme points and subtract them pairwise to obtain the music vectors of the music library. S5: compress the music vectors of the music library bitwise into int32 values. S6: build a hash table with the int32 value as Key and the music ID as Value, where music IDs are assigned in increasing order according to the time at which each piece of music entered the music library. S7: input training audio to obtain an infringement probability. S8: input test audio to obtain an infringement probability. The method extracts features from the spectral information of the music with a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, which improves the accuracy and efficiency of detection.

Description

Music infringement detection method
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a music infringement detection method.
Background
The spread of the internet has made music widely available, and people can easily listen to music and use it in various ways when creating videos. Music is protected by copyright, however, and using it in commercial videos without authorization infringes that copyright and harms the rights and interests of its creators.
Patent publication CN101493918A discloses a method for monitoring music infringement, described as follows:
That invention relates to an online music piracy monitoring method and system, which comprise the following steps in sequence: the audio fingerprint extraction module acquires an audio download address from the Internet; the audio fingerprint extraction module reads an audio file from the audio download address and processes it to obtain an audio fingerprint; the monitoring analysis module compares the audio fingerprint with the audio fingerprint of the genuine audio file; if the comparison result is larger than the set threshold, the infringement positioning module obtains information about the suspected infringer and sends the suspected infringer a warning. Compared with the prior art, its stated technical effect is that, through technical means such as web spiders, audio fingerprint extraction and feature code extraction, online digital music resources are monitored effectively, evidence of infringement is obtained and warnings are issued, so that the whole process is automated, cost and time are saved, and rights can be enforced promptly.
Although the above patent can determine whether music is infringed, it cannot guarantee the accuracy and performance of infringement detection.
Disclosure of Invention
To solve the above problems, the invention provides a music infringement detection method that, in addition to determining infringement, uses deep learning to greatly improve the accuracy and performance of infringement detection.
The technical scheme of the invention is as follows:
A music infringement detection method comprises the following steps:
S1: perform a short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
S2: perform dynamic resolution compression on the frequency spectrum signal;
S3: calculate the extreme points of each frequency band interval from the compressed frequency spectrum signal;
S4: filter the extreme points and subtract them pairwise to obtain the music vectors of the music library;
S5: compress the music vectors of the music library bitwise into int32 values;
S6: repeat steps S1-S5 for all music in the music library and construct a hash table with the int32 value as Key and the music ID as Value, where music IDs are assigned in increasing order according to the time at which each piece of music entered the music library;
S7: input training audio, obtain its frequency spectrum signal by short-time Fourier transform, repeat steps S2-S5 to obtain the training music vector, collide it with the hash table containing the music vectors of the whole music library, sort the collisions by the time at which they occur, calculate the Euclidean distance between the two vectors, and normalize it to obtain the infringement probability;
S8: input test audio, obtain its frequency spectrum signal by short-time Fourier transform, repeat steps S2-S5 to obtain the test music vector, collide it with the hash table containing the music vectors of the whole music library, sort the collisions by the time at which they occur, calculate the Euclidean distance between the two vectors, and normalize it to obtain the infringement probability.
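Step S1 can be illustrated with a small sketch. The following pure-Python short-time Fourier transform is only an illustration and not the patent's implementation: the Hann window, the window size of 8 samples, the hop of 4 samples and the function name `stft` are all assumptions chosen to keep the example small.

```python
import cmath
import math


def stft(signal, window_size=8, hop=4):
    """Return one magnitude spectrum (non-negative bins) per sliding window."""
    # Hann window (assumed here) reduces leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            # Direct DFT of the windowed chunk; O(N^2), fine for a sketch.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# A pure tone that falls exactly on bin 2 of an 8-sample window.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spectrogram = stft(tone)
peak_bins = [frame.index(max(frame)) for frame in spectrogram]
```

Each entry of `spectrogram` is one column of the time-frequency picture; for the test tone every frame peaks at the same bin, which is what a stationary signal should produce. A production system would normally use an FFT routine rather than this direct transform.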
Preferably, the specific process of the dynamic resolution compression in step S2 is as follows:
S2.1: form a spectrogram from the input Fourier-transformed spectrum signal;
S2.2: divide the spectrogram vertically and uniformly into a plurality of regions;
S2.3: perform feature extraction on the regions from step S2.2 with a convolutional neural network;
S2.4: judge whether each region contains useful features, and reject the regions that do not;
S2.5: splice the remaining regions back together into a new spectrogram.
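The split, reject and splice flow of steps S2.1-S2.5 can be sketched as follows. This is a minimal illustration under stated assumptions: the spectrogram is a list of rows, regions are equal blocks of columns (one reading of "vertically divided"), and `is_useful` is a hypothetical mean-energy rule standing in for the convolutional neural network, whose trained weights the text does not give.

```python
def split_columns(spectrogram, n_regions):
    """Split a 2-D spectrogram (list of rows) into equal blocks of columns."""
    width = len(spectrogram[0])
    if width % n_regions != 0:
        raise ValueError("width must be divisible by n_regions")
    step = width // n_regions
    return [[row[i:i + step] for row in spectrogram]
            for i in range(0, width, step)]


def is_useful(region, threshold=0.1):
    """Hypothetical stand-in for the CNN decision: enough mean energy."""
    values = [abs(v) for row in region for v in row]
    return sum(values) / len(values) > threshold


def compress(spectrogram, n_regions, keep=is_useful):
    """Reject regions judged not useful and splice the rest back together."""
    kept = [r for r in split_columns(spectrogram, n_regions) if keep(r)]
    if not kept:
        return [[] for _ in spectrogram]
    return [sum((region[i] for region in kept), [])
            for i in range(len(spectrogram))]


spec = [
    [0.9, 0.8, 0.0, 0.0, 0.7, 0.6],
    [0.5, 0.4, 0.0, 0.0, 0.3, 0.2],
]
compressed = compress(spec, n_regions=3)
```

Here the silent middle region is dropped, so `compressed` is narrower than `spec`. Passing a different `keep` callable is where a trained classifier would plug in.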
Preferably, the convolutional neural network comprises six convolutional layers and three fully connected layers, wherein the convolutional layers comprise eight 1 × 1 convolution kernels, two of the fully connected layers each comprise 1024 neurons, and the remaining fully connected layer comprises 2 neurons.
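To make the 1 × 1 kernels concrete: a 1 × 1 convolution mixes channels at each position independently and never looks at neighbouring positions. The sketch below shows that operation and its parameter count. The single-channel input, the example weights and both function names are assumptions made for illustration; the text does not specify them.

```python
def conv1x1(feature_map, weights, biases):
    """Apply a 1x1 convolution.

    feature_map: [channels_in][height][width]
    weights:     [channels_out][channels_in]
    biases:      [channels_out]
    """
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    out = []
    for w_row, b in zip(weights, biases):
        # Each output value depends only on the same (y, x) across channels.
        plane = [[b + sum(w_row[c] * feature_map[c][y][x] for c in range(c_in))
                  for x in range(width)]
                 for y in range(height)]
        out.append(plane)
    return out


def conv1x1_param_count(c_in, c_out=8):
    """Weights plus biases of one layer of c_out 1x1 kernels."""
    return c_in * c_out + c_out


region = [[[1.0, 2.0], [3.0, 4.0]]]          # one channel, 2x2 (assumed input)
weights = [[float(k)] for k in range(8)]      # eight illustrative kernels
features = conv1x1(region, weights, [0.0] * 8)

# Six stacked layers, assuming a single-channel input to the first one.
total_params = conv1x1_param_count(1) + 5 * conv1x1_param_count(8)
```

Under these assumptions the six convolutional layers hold only a few hundred parameters, so most of the network's capacity would sit in the 1024-neuron fully connected layers.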
Preferably, calculating the extreme points in step S3 means finding a maximum value and a minimum value, and the calculation formula of the maximum value is:
[Formula figure for the maximum value; image not reproduced in this text]
The calculation formula of the minimum value is:
[Formula figure for the minimum value; image not reproduced in this text]
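Because the two formulas are figures that are not reproduced in this text, the sketch below does not attempt to recreate them. It only illustrates the stated idea of step S3, one maximum and one minimum per frequency band interval, using the built-in `max` and `min`. The equal-width bands and the name `band_extremes` are assumptions.

```python
def band_extremes(spectrum, n_bands):
    """Return (maxima, minima), one value of each per frequency band."""
    size = len(spectrum) // n_bands
    if size == 0:
        raise ValueError("more bands than spectrum values")
    maxima, minima = [], []
    for b in range(n_bands):
        start = b * size
        # The last band absorbs any leftover values.
        end = start + size if b < n_bands - 1 else len(spectrum)
        band = spectrum[start:end]
        maxima.append(max(band))
        minima.append(min(band))
    return maxima, minima


spectrum = [0.2, -0.1, 0.5, 0.3, -0.4, 0.1, 0.0]
maxima, minima = band_extremes(spectrum, n_bands=3)
```

With three bands the seven values are grouped as two, two and three values, and each group contributes one maximum and one minimum.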
Preferably, the filtering in step S4 comprises:
S4.1: screen all extreme points with a multilayer fully connected neural network;
S4.2: reject the extreme points for which the multilayer fully connected neural network outputs 0, and keep those for which it outputs 1;
S4.3: splice the remaining extreme points of the different frequency bands together and output them.
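The keep-or-reject logic of steps S4.1-S4.3 can be sketched independently of the network itself. In this minimal illustration `magnitude_rule` is a hypothetical classifier standing in for the trained fully connected network; only its 0/1 output convention comes from the text.

```python
def filter_extremes(points_by_band, classify):
    """Keep points classified as 1 and splice the bands into one list."""
    kept = []
    for band in points_by_band:
        kept.extend(p for p in band if classify(p) == 1)
    return kept


def magnitude_rule(point, threshold=1e-5):
    """Hypothetical stand-in classifier: 1 keeps the point, 0 rejects it."""
    return 1 if abs(point) >= threshold else 0


bands = [[1.6e-05, 2.0e-07], [-3.8e-05], [4.0e-06, 2.9e-05]]
remaining = filter_extremes(bands, magnitude_rule)
```

The surviving points keep their band order, so the spliced output can be fed directly to the pairwise subtraction that forms the music vector.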
Preferably, the fully-connected neural network comprises three layers, wherein each of the first and second layers comprises 1024 neurons, and the third layer comprises 2 neurons.
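The three-layer shape described above (1024, 1024 and 2 neurons) can be sketched as a forward pass. Everything beyond the layer sizes is an assumption made for illustration: the ReLU activations, the random untrained weights, the three-value input feature and the function names are not given in the text.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu):
    """One fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [b + sum(w * v for w, v in zip(row, x))
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def build_filter_network(n_features, hidden=1024, seed=0):
    """Layers sized n_features -> 1024 -> 1024 -> 2."""
    rng = random.Random(seed)
    sizes = [n_features, hidden, hidden, 2]
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def keep_extreme_point(features, network):
    """Return 1 (keep) or 0 (reject) from the two output neurons."""
    x = features
    for i, layer in enumerate(network):
        x = dense(x, layer, relu=(i < len(network) - 1))
    return 1 if x[1] > x[0] else 0


network = build_filter_network(n_features=3)
decision = keep_extreme_point([1.6e-05, 120.0, 0.4], network)
```

With untrained weights the decision is arbitrary; the point of the sketch is the data flow, in which two output neurons are reduced to the 0/1 label used to reject or keep an extreme point.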
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
The beneficial effects of the invention are as follows: the method extracts features from the spectral information of the music with a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, which improves the accuracy and efficiency of detection.
Drawings
Fig. 1 is a flowchart of a method provided in an embodiment of the present invention.
Fig. 2 is a detailed flowchart of dynamic resolution compression.
Fig. 3 is a flowchart of extreme point filtering.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present invention provides a music infringement detection method, which includes the following specific steps:
1. Perform a short-time Fourier transform on the music in the music library to obtain a spectrogram.
The short-time Fourier transform applies a sliding time window to the input music signal and Fourier-transforms the signal inside each window, which yields the time-varying spectrum of the signal and so converts the time-domain signal into a frequency-domain signal.
2. Perform dynamic resolution compression on the acquired spectrogram.
3. Calculate all extreme points of each frequency band interval.
4. Filter the extreme points.
5. Obtain vectors by pairwise subtraction of the extreme points.
In steps 3-5, each piece of music corresponds to one vector: for each frequency band the minimum value is subtracted from the maximum value to give one number, and the numbers from all frequency bands are combined into the vector.
6. Repeat steps 1-5 for each piece of music in the music library to obtain the vectors of all pieces of music, compress the vectors bitwise into int32, and construct a HashTable with the int32 value as Key and the music ID as Value.
7. Input training audio, repeat steps 2-5 to obtain its vector, collide it with the HashTable, sort the collisions with each piece of music by time, calculate the infringement probability, and compare the infringement results with the labels to obtain a trained model.
8. Input test audio, repeat steps 2-5 to obtain its vector, collide it with the HashTable, sort the collisions with each piece of music by time, calculate the infringement probability, and output whether infringement exists.
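The vector construction described for steps 3-5 (maximum minus minimum, band by band) is simple enough to check numerically. The sketch below uses the first three maximum and minimum values from the worked example given later in the description; the function name `music_vector` is hypothetical.

```python
def music_vector(maxima, minima):
    """Subtract each band's minimum from its maximum to form the vector."""
    if len(maxima) != len(minima):
        raise ValueError("need one maximum and one minimum per band")
    return [hi - lo for hi, lo in zip(maxima, minima)]


# First three bands of music A in the worked example.
maxima = [1.6393873e-05, 2.9647521e-05, 3.7123548]
minima = [-1.3222547e-05, -1.39852657e-05, -3.7988510e-05]
vector = music_vector(maxima, minima)
```

Up to floating-point rounding, the three results agree with the first three entries of the vector the description reports for music A (2.96164200e-05, 4.36327867e-05 and 3.71239279e+00).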
As an embodiment of the present invention, as shown in Fig. 2, the specific process of step 2 is:
2.1 Input a spectrogram.
2.2 Divide the spectrogram vertically into a plurality of regions; in this embodiment it is divided into 256 regions.
2.3 Extract features from the regions of step 2.2 with a convolutional neural network.
2.4 Judge whether each region contains useful features, and reject the regions that do not.
2.5 Splice the remaining regions back together into a new spectrogram.
As an embodiment of the present invention, calculating the extreme points in step 3 means finding a maximum value and a minimum value, and the maximum value is calculated by the following formula:
[Formula figure for the maximum value; image not reproduced in this text]
The calculation formula of the minimum value is:
[Formula figure for the minimum value; image not reproduced in this text]
As an embodiment of the present invention, the convolutional neural network comprises six convolutional layers and three fully connected layers, wherein the convolutional layers comprise eight 1 × 1 convolution kernels, two of the fully connected layers each comprise 1024 neurons, and the remaining fully connected layer comprises 2 neurons.
As an embodiment of the present invention, as shown in Fig. 3, the specific process of filtering in step 4 is:
4.1 Input the extreme points.
4.2 Pass them through the fully connected neural network.
4.3 Output whether each extreme point is kept.
4.4 Splice the remaining extreme points together and output them.
As an embodiment of the present invention, the fully connected neural network comprises three layers, wherein the first layer and the second layer each comprise 1024 neurons, and the third layer comprises 2 neurons.
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
A practical example of the method is as follows. The music library contains music A, B, C and D, with music IDs 1, 2, 3 and 4 respectively. Taking A as an example, a short-time Fourier transform is first applied to A to obtain the frequency spectrum signal [1.6393873e-05, -2.2720376e-05, -1.9727035e-05, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00]. Dynamic resolution compression is then applied to this signal by the convolutional neural network to obtain [1.6393873e-05, -1.3622489e-05, -3.8468256e-05, ..., 1.4637652e-05, -2.58741654e-05, -1.8945687e-05]. Next, the maximum and minimum values are found region by region and the extreme points are filtered by the neural network, giving the maximum sequence [1.6393873e-05, 2.9647521e-05, 3.7123548, ..., 1.9647581e-05, 2.4874165e-05, 1.5512479e-05] and the minimum sequence [-1.3222547e-05, -1.39852657e-05, -3.7988510e-05, ..., -1.3347891e-05, -2.6955249e-05, -2.58741654e-05]. Subtracting the minimum sequence from the maximum sequence gives the vector of music A: [2.96164200e-05, 4.36327867e-05, 3.71239279e+00, ..., 3.29954720e-05, 5.18294140e-05, 4.13866444e-05]. B, C and D are processed in the same way to obtain their vectors. The data in each vector are then compressed bitwise into int32, and a hash table is constructed as follows:
Key      vector of music A    vector of music B    vector of music C    vector of music D
Value    1                    2                    3                    4
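The text does not say how a vector is compressed bitwise into an int32, so the following sketch has to assume a scheme: one bit per vector component (1 if the component exceeds a threshold), with at most 32 components, reinterpreted as a signed 32-bit integer. The functions `pack_int32` and `build_hash_table` and the threshold are hypothetical. Music IDs are assigned in increasing order of entry, as the method describes.

```python
def pack_int32(vector, threshold):
    """Assumed scheme: one bit per component, packed into a signed int32."""
    if len(vector) > 32:
        raise ValueError("at most 32 components fit into an int32")
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > threshold else 0)
    # Reinterpret the 32-bit pattern as a signed integer.
    return bits - (1 << 32) if bits >= (1 << 31) else bits


def build_hash_table(library_vectors, threshold):
    """Map int32 key -> music IDs; IDs count up in order of library entry."""
    table = {}
    for music_id, vector in enumerate(library_vectors, start=1):
        table.setdefault(pack_int32(vector, threshold), []).append(music_id)
    return table


library_vectors = [
    [5.0, 1.0, 7.0],   # music A, ID 1
    [5.0, 1.0, 7.5],   # music B, ID 2
    [0.0, 9.0, 2.0],   # music C, ID 3
]
table = build_hash_table(library_vectors, threshold=3.0)
```

Storing a list of IDs per key is a design choice of this sketch, made because two similar pieces of music can compress to the same key; that shared key is exactly the kind of collision a later lookup relies on.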
The vector of the training music T is then calculated by the same method and collided with the table above: T is compared pairwise with each music vector in the library and the Euclidean distance is calculated. The collision results (infringement probabilities) are then sorted by the time at which each collision occurs, giving the infringement probability of T against each piece of music in the library.
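The pairwise comparison can be sketched as follows. The Euclidean distance comes from the text, but the normalization is not specified, so the mapping 1 / (1 + d) used here is purely an assumption chosen so that identical vectors score 1 and distant vectors approach 0. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infringement_scores(query, library):
    """library: music ID -> vector. Returns music ID -> score in (0, 1]."""
    # Assumed normalization: distance 0 maps to 1.0, large distances to ~0.
    return {music_id: 1.0 / (1.0 + euclidean(query, vector))
            for music_id, vector in library.items()}


library = {1: [0.0, 0.0], 2: [3.0, 4.0]}
scores = infringement_scores([0.0, 0.0], library)
```

In this toy case the query is identical to music 1 and lies at distance 5 from music 2, so music 1 receives the highest score. A real system would only compare against the candidates returned by the hash table lookup rather than against the whole library.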
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes or substitutions do not depart from the spirit and scope of the present invention and are intended to be covered by its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A music infringement detection method, characterized by comprising the following steps:
S1: performing a short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
S2: performing dynamic resolution compression on the frequency spectrum signal;
S3: calculating the extreme points of each frequency band interval from the compressed frequency spectrum signal;
S4: filtering the extreme points and subtracting them pairwise to obtain the music vectors of the music library;
S5: compressing the music vectors of the music library bitwise into int32;
S6: repeating steps S1-S5 for all music in the music library and constructing a hash table with int32 as Key and music ID as Value, wherein the music IDs are assigned in increasing order according to the time at which each piece of music entered the music library;
S7: inputting training audio, obtaining the frequency spectrum signal of the training audio by short-time Fourier transform, repeating steps S2-S5 to obtain the training music vector, colliding it with the hash table containing the music vectors of the whole music library, sorting by the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain the infringement probability;
S8: inputting test audio, obtaining the frequency spectrum signal of the test audio by short-time Fourier transform, repeating steps S2-S5 to obtain the test music vector, colliding it with the hash table containing the music vectors of the whole music library, sorting by the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain the infringement probability;
the specific process of dynamic resolution compression in step S2 is:
S2.1: forming a spectrogram from the input Fourier-transformed spectrum signal;
S2.2: dividing the spectrogram vertically and uniformly into a plurality of regions;
S2.3: performing feature extraction on the regions of step S2.2 with a convolutional neural network;
S2.4: judging whether each region contains useful features, and rejecting the regions that do not;
S2.5: splicing the remaining regions back together into a new spectrogram;
the convolutional neural network comprises six convolutional layers and three fully connected layers, wherein the convolutional layers comprise eight 1 × 1 convolution kernels, two of the fully connected layers each comprise 1024 neurons, and the remaining fully connected layer comprises 2 neurons.
2. The music infringement detection method according to claim 1, wherein the extreme points in step S3 are calculated by finding the maximum value and the minimum value, and the maximum value is calculated by the formula:
[Formula figure for the maximum value; image not reproduced in this text]
the calculation formula of the minimum value is:
[Formula figure for the minimum value; image not reproduced in this text]
3. The music infringement detection method according to claim 1, wherein the filtering in step S4 comprises:
S4.1: screening all extreme points with a multilayer fully connected neural network;
S4.2: rejecting the extreme points for which the multilayer fully connected neural network outputs 0, and keeping those for which it outputs 1;
S4.3: splicing the remaining extreme points of the different frequency bands together and outputting them.
4. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 3 when executing the computer program.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 3.
CN202011352226.XA 2020-11-26 2020-11-26 Music infringement detection method Active CN112420023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011352226.XA CN112420023B (en) 2020-11-26 2020-11-26 Music infringement detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011352226.XA CN112420023B (en) 2020-11-26 2020-11-26 Music infringement detection method

Publications (2)

Publication Number Publication Date
CN112420023A CN112420023A (en) 2021-02-26
CN112420023B CN112420023B (en) 2022-03-25

Family

ID=74843766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011352226.XA Active CN112420023B (en) 2020-11-26 2020-11-26 Music infringement detection method

Country Status (1)

Country Link
CN (1) CN112420023B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101014953A (en) * 2003-09-23 2007-08-08 音乐Ip公司 Audio fingerprinting system and method
CN101493918A (en) * 2008-10-21 2009-07-29 深圳市牧笛科技有限公司 On-line music pirate monitoring method and system
CN104567674A (en) * 2014-12-29 2015-04-29 北京理工大学 Bilateral fitting confocal measuring method
CN108899037A (en) * 2018-07-05 2018-11-27 平安科技(深圳)有限公司 Animal vocal print feature extracting method, device and electronic equipment
CN109918539A (en) * 2019-02-28 2019-06-21 华南理工大学 A kind of mutual search method of sound, video for clicking behavior based on user
CN111652177A (en) * 2020-06-12 2020-09-11 中国计量大学 Signal feature extraction method based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190385610A1 (en) * 2017-12-08 2019-12-19 Veritone, Inc. Methods and systems for transcription
EP3608918A1 (en) * 2018-08-08 2020-02-12 Tata Consultancy Services Limited Parallel implementation of deep neural networks for classifying heart sound signals

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Multimedia Perceptual Hashing Algorithms and Their Applications; 赵玉鑫; Master's thesis; 20101231; full text *

Also Published As

Publication number Publication date
CN112420023A (en) 2021-02-26

Similar Documents

Publication Publication Date Title
CN106778241B (en) Malicious file identification method and device
CN106484837A (en) The detection method of similar video file and device
CN106845516B (en) Footprint image recognition method based on multi-sample joint representation
CN113257255B (en) Method and device for identifying forged voice, electronic equipment and storage medium
KR101841985B1 (en) Method and Apparatus for Extracting Audio Fingerprint
CN111079816A (en) Image auditing method and device and server
CN108154099B (en) Figure identification method and device and electronic equipment
CN112632609A (en) Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium
CN111553241A (en) Method, device and equipment for rejecting mismatching points of palm print and storage medium
CN115393760A (en) Method, system and equipment for detecting Deepfake composite video
Muthusamy et al. Trilateral Filterative Hermitian feature transformed deep perceptive fuzzy neural network for finger vein verification
CN112420023B (en) Music infringement detection method
CN106663102B (en) Method and apparatus for generating a fingerprint of an information signal
CN114266740A (en) Quality inspection method, device, equipment and storage medium for traditional Chinese medicine decoction pieces
CN114168788A (en) Audio audit processing method, device, equipment and storage medium
CN111130794B (en) Identity verification method based on iris and private key certificate chain connection storage structure
CN116662186A (en) Log playback assertion method and device based on logistic regression and electronic equipment
CN115643065A (en) Network attack event detection method and system
CN113421590B (en) Abnormal behavior detection method, device, equipment and storage medium
CN111159588B (en) Malicious URL detection method based on URL imaging technology
CN113113051A (en) Audio fingerprint extraction method and device, computer equipment and storage medium
CN112084489A (en) Suspicious application detection method and device
CN111402921A (en) Voice copy paste tamper detection method and system
CN111581640A (en) Malicious software detection method, device and equipment and storage medium
CN112330632B (en) Digital photo camera fingerprint attack detection method based on countermeasure generation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant