WO2019137115A1 - Music classification method and beat point detection method, storage device and computer device - Google Patents

Music classification method and beat point detection method, storage device and computer device

Info

Publication number
WO2019137115A1
WO2019137115A1 (PCT/CN2018/119112, CN2018119112W)
Authority
WO
WIPO (PCT)
Prior art keywords
beat
music
sub
beat point
band
Prior art date
Application number
PCT/CN2018/119112
Other languages
English (en)
French (fr)
Inventor
吴晓婕
Original Assignee
广州市百果园信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 广州市百果园信息技术有限公司
Priority to EP18900195.1A (EP3723080A4)
Priority to RU2020126263A (RU2743315C1)
Priority to US16/960,692 (US11715446B2)
Publication of WO2019137115A1

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/0008Associated control or indicating means
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/36Accompaniment arrangements
    • G10H1/40Rhythm
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/036Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal of musical genre, i.e. analysing the style of musical pieces, usually for selection, filtering or classification
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/056Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/076Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of timing, tempo; Beat detection

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a music classification method, a beat point detection method, a storage device, and a computer device.
  • in conventional video special-effects processing, the beat points of the music being played cannot be obtained, so the corresponding video special effects cannot be triggered according to those beat points; special effects therefore cannot be personalized according to the music played in the video, which degrades user-experience satisfaction.
  • an object of the present invention is to provide a music classification method, a beat point detection method, a storage device, and a computer device that obtain the beat points in music, so that a video special effect in a special-effect group can be triggered according to the positions of the beat points, improving user-experience satisfaction.
  • a method for detecting music beat points comprises the steps of: performing framing processing on a music signal to obtain frame signals; acquiring the power spectrum of the frame signals; performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands; performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquiring the beat points of the music signal according to the power values of the beat points to be confirmed.
  • obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering comprises: acquiring, according to the result of the time-frequency domain joint filtering, a beat confidence for each frequency in the signal of each sub-band; calculating, from the beat confidence of each frequency, a weighted sum of the power values corresponding to all frequencies in each sub-band; and obtaining the beat points to be confirmed according to the weighted sum.
  • acquiring the beat points of the music signal according to the power values of the beat points to be confirmed includes: acquiring the beat points to be confirmed whose weighted sum is greater than a threshold power value, and taking those beat points to be confirmed as the beat points of the music signal.
  • the threshold power value is determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed; and calculating the sum of the mean and twice the variance, and taking that sum as the threshold power value.
  • the method further includes: acquiring strong beat points of the music signal according to a strong-beat threshold power value, the strong-beat threshold power value being determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed, and calculating the sum of the mean and three times the variance, which is taken as the strong-beat threshold power value; and determining weak beat points of the music signal by acquiring, among the beat points of the music signal, the beat points whose power value is less than or equal to the strong-beat threshold power value and greater than the threshold power value, and taking those beat points as the weak beat points of the music signal.
  • the sub-band decomposition of the power spectrum into at least two sub-bands includes: decomposing the power spectrum into four sub-bands, the four sub-bands comprising a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments.
  • the frequency band of the first sub-band is 120 Hz to 3 kHz
  • the frequency band of the second sub-band is 3 kHz to 10 kHz
  • the frequency band of the third sub-band is 10 kHz to fs/2 Hz, where fs is the sampling frequency of the signal.
  • performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band includes: performing time-frequency domain joint filtering on the signal of each sub-band using parameters corresponding to the beat type, according to the beat types that the first sub-band, the second sub-band, the third sub-band, and the fourth sub-band are respectively used to detect.
  • the parameters corresponding to a beat type are determined as follows: the parameters of each sub-band are set according to the temporal characteristics and the harmonic-distribution characteristics of the beat points of the beat-type instrument to be detected in that sub-band and of the other interfering signals different from those beat points.
  • a music classification method based on music beat points comprises the steps of: detecting the beat points of music using the method for detecting music beat points according to any of the above embodiments; and classifying the music according to the number of beat points in each sub-band.
  • classifying the music according to the number of beat points in each sub-band includes: counting, according to the number of beat points in each sub-band, the number of snare drum beat points and the number of kick drum beat points in the music signal; if the number of snare drum beat points is greater than a first threshold and the number of kick drum beat points is greater than the first threshold, classifying the music as strong-rhythm music; and if the number of kick drum beat points is less than a second threshold, classifying the music as lyrical music.
  • a storage device has a plurality of instructions stored thereon, the instructions being adapted to be loaded and executed by a processor to: perform framing processing on a music signal to obtain frame signals; acquire the power spectrum of the frame signals; perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands; perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquire the beat points of the music signal according to the power values of the beat points to be confirmed; or the instructions are adapted to be loaded and executed by the processor to: detect the beat points of music using the method for detecting music beat points according to any of the above embodiments, and classify the music according to the number of beat points in each sub-band.
  • a computer device comprises: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors; the one or more application programs are configured to perform the method for detecting music beat points according to any of the above embodiments, or to perform the music classification method according to any of the above embodiments.
  • the invention provides a method for detecting music beat points: framing processing is first performed on the music signal, the power spectrum of each frame signal is acquired, and the power spectrum is then decomposed into sub-bands.
  • each sub-band corresponds to one beat type, and time-frequency domain joint filtering is performed separately for the different sub-bands.
  • beat points to be confirmed can be obtained from the filtering result, and the beat points of the music signal are then determined according to the power value of each beat point to be confirmed. The beat points of a music signal can therefore be obtained by the method for detecting music beat points of the present invention, so that a video special effect in a special-effect group can be triggered in combination with the beat points, improving user-experience satisfaction.
  • the method for detecting music beat points acquires a beat confidence for each frequency in each sub-band signal, and uses the beat confidence to calculate a weighted sum of the power values corresponding to all frequencies in each sub-band, so that the beat points to be confirmed are obtained according to the weighted sum. The accuracy of the beat points to be confirmed can therefore be further improved.
  • the power spectrum of each frame signal is divided into a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments. The detection method can therefore perform sub-band decomposition according to the types of the specific beat points in the music, and thus detect the beat points in the music signal more accurately.
  • FIG. 1 is a schematic diagram of interaction between a server and a client according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for detecting a music beat point according to an embodiment of the present invention
  • FIG. 3 is a flowchart of step S500 according to an embodiment of the present invention.
  • FIG. 4 is a signal diagram of a snare drum obtained after step S500 according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an embodiment of a computer device structure according to the present invention.
  • the invention provides a method for detecting music beat points and a music classification method based on music beat points, which are applied to the application environment shown in FIG. 1.
  • the server 100 and the client 300 are located in the same network 200 environment, and the server 100 and the client 300 exchange data information through the network 200.
  • the number of servers 100 and clients 300 is not limited, and FIG. 1 is only illustrated as an example.
  • An application is installed in the client 300. The user can interact with the corresponding server 100 through the APP in the client 300.
  • Server 100 can be, but is not limited to, a web server, a management server, an application server, a database server, a cloud server, and the like.
  • the client 300 can be, but is not limited to, a smart phone, a personal computer (PC), a tablet computer, a personal digital assistant (PDA), a mobile internet device (MID), and the like.
  • the operating system of the client 300 may be, but not limited to, an Android system, an IOS (iPhone operating system) system, a Windows phone system, a Windows system, and the like.
  • after the user selects a piece of music (a song) or uploads one in a video APP on the client 300, the server 100 analyzes and estimates the music, delivers to the user's client 300 a video special-effect group suited to the estimated music type, and triggers one of the video special effects in the group at the estimated beat point time positions.
  • the invention provides a method for detecting music beat points, which detects the beat points of the music uploaded or selected by the user, so that the corresponding video special effects can be triggered according to the beat points of the music, improving user-experience satisfaction.
  • a method for detecting a music beat point of the present invention includes the steps of:
  • the server acquires the music signal to be detected and performs framing processing on it to obtain a plurality of frame signals of the music signal.
  • the music signal can be a music signal uploaded by the user or a music signal in a server database.
  • the server first preprocesses the input music signal.
  • the pre-processing includes decoding the input music signal, stereo-to-mono conversion, sample-rate conversion, removal of the DC component, and other necessary pre-processing operations.
  • the pre-processing here is a routine operation and is not described in detail. Further, the server performs framing processing on the music signal to obtain a plurality of frame signals.
  • the power spectrum of each frame signal is then acquired: each N-point frame is windowed, and an FFT (Fast Fourier Transformation) is applied to each frame, yielding the power spectrum P(t,k) of each frame signal.
  • the server decomposes the power spectrum corresponding to each frame signal into at least two sub-bands.
  • Each sub-band corresponds to detecting one type of beat point.
  • the server analyzes the spectrum of the music signal and performs sub-band decomposition on it according to the frequency-response characteristics of the beat-type instruments commonly used in music.
  • the power spectrum is decomposed into four sub-bands, which include a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments.
  • the frequency band of the first sub-band is 0 Hz to 120 Hz
  • the frequency band of the second sub-band is 120 Hz to 3 kHz
  • the frequency band of the third sub-band is 3 kHz to 10 kHz
  • the frequency band of the fourth sub-band is 10 kHz to fs/2 Hz, where fs is the sampling frequency of the signal.
  • the sub-band boundaries of the power spectrum are chosen mainly because the kick drum, the snare drum, and the other beat instruments (high-frequency beat instrument beat points) differ greatly not only in frequency response but also in duration.
  • the kick drum's energy is concentrated mainly in the low-frequency sub-band, but that sub-band often also contains non-beat instruments such as bass.
  • the duration of a bass note is much longer than the duration of a kick drum hit.
  • the energy of the snare drum is concentrated mainly in the mid-frequency sub-bands, but the sub-band below 3 kHz is interfered with by signals such as vocals.
  • the sub-band above 3 kHz is interfered with mainly by other accompaniment instruments.
  • in both mid-frequency sub-bands, the duration of the snare drum is clearly shorter than that of the other interfering signals, and the duration of the interfering signals below 3 kHz differs markedly from that above 3 kHz, so different strategies are needed for the time-frequency domain joint filtering.
  • the high-frequency sub-band often contains the sounds of melody-type instruments with very long durations, which differ from the accompaniment-instrument and vocal characteristics of the mid-frequency sub-bands.
  • S400: Perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band.
  • the server decomposes the power spectrum corresponding to each frame signal into sub-bands and then performs time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band. Specifically, when the server decomposes the power spectrum of the frame signal into the four sub-bands described in step S300, the signal of each sub-band is filtered in the time-frequency domain using parameters corresponding to the beat type, according to the beat types that the first, second, third, and fourth sub-bands are respectively used to detect.
  • the parameters corresponding to the beat type are determined as follows: the parameters of each sub-band are set according to the temporal characteristics and the harmonic-distribution characteristics of the beat points of the beat-type instrument to be detected in that sub-band and of the other interfering signals different from those beat points.
  • the parameters corresponding to the beat type may be obtained before the method for detecting music beat points according to the present invention is carried out, from the temporal characteristics and harmonic-distribution characteristics of the detected beat-instrument beat points and of the other interfering signals different from those beat points.
  • alternatively, the parameters corresponding to the beat type may be obtained by the server from the same temporal and harmonic-distribution characteristics while the method for detecting music beat points according to the present invention is being carried out.
  • for every sub-band, the operation steps of the above time-frequency domain joint filtering are the same, but the parameter values of hi and hj (the half-widths of the time-domain and frequency-domain smoothing windows, respectively) are different.
  • the choice of hi and hj is jointly determined by the duration characteristics and the harmonic-distribution characteristics of the beat-type instruments and the other melody-type interfering signals falling within the different sub-bands. For each frequency bin k, the sub-band to which it belongs is determined, and the parameters set for that sub-band are used for filtering.
  • the smoothing windows wi and wj may use mean filtering or median filtering, or Gaussian-window filtering.
  • the embodiments of the present invention mainly smooth (low-pass filter) the frame signals jointly in the time-frequency domain; other filtering methods may be employed in other embodiments.
  • step S500 includes the following steps:
  • the beat confidence of each frequency in the signal of each sub-band, and the confidence of the other non-beat melody-type components, may be calculated as follows.
  • B(t,k) = P_smf(t,k)*P_smf(t,k) / (P_smf(t,k)*P_smf(t,k) + P_smt(t,k)*P_smt(t,k)).
  • the current frame signal P(t,k) is weighted and summed separately according to the type of beat point, as follows.
  • Kick(t) = sum(P(t,k)*B(t,k)), k ∈ sub-band 1, for detecting the kick drum;
  • Snare(t) = sum(P(t,k)*B(t,k)), k ∈ sub-bands 2 and 3, for detecting the snare drum;
  • Beat(t) = sum(P(t,k)*B(t,k)), k ∈ sub-band 4, for detecting the other beat points.
  • P(t,k) is the power spectrum obtained after taking the STFT (short-time Fourier transform) of the signal; P(t,k)*B(t,k) expresses the weighting of the power spectrum, where B(t,k) is the confidence that the signal is a beat at the k-th frequency of the t-th frame. The confidence is a value between 0 and 1; when it is multiplied by the power spectrum of the signal, the power spectrum P(t,k) belonging to beats is preserved, while the power spectrum P(t,k) not belonging to beats is suppressed (scaled down by the multiplication).
  • the weighted power spectra are then summed, with k summed according to the sub-band division.
  • after STFT analysis, k ranges over 1~N/2+1; that is, there are the values P(t1,1), P(t1,2), …, P(t1,N/2+1).
  • the frequency corresponding to each bin k is k*fs/N, so the sub-band to which k belongs is also known.
  • the beat points of the music signal are acquired according to the power values corresponding to the beat points. Specifically, after the server has calculated the weighted sum of the power values corresponding to all frequencies in each sub-band, it further acquires the beat points to be confirmed whose weighted sum is greater than the threshold power value, and takes those beat points to be confirmed as the beat points of the music signal.
  • the threshold power value is determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed; and calculating the sum of the mean and twice the variance, and taking that sum as the threshold power value.
  • Kick, Snare, and Beat acquired in step S500 are abbreviations of Kick(t), Snare(t), and Beat(t), respectively.
  • Kick, Snare, and Beat are each scanned to find all peak points.
  • a detected beat point is marked as a kick drum if it is detected in Kick, as a snare drum if it is detected in Snare, and as another beat point (a high-frequency beat instrument beat point) if it is detected in Beat.
  • the invention provides a method for detecting music beat points: framing processing is first performed on the music signal, the power spectrum of each frame signal is acquired, and the power spectrum is then decomposed into sub-bands.
  • each sub-band corresponds to one beat type, and time-frequency domain joint filtering is performed separately for the different sub-bands.
  • beat points to be confirmed can be obtained from the filtering result, and the beat points of the music signal are then determined according to the power value of each beat point to be confirmed. The beat points of a music signal can therefore be obtained by the method for detecting music beat points of the present invention, so that a video special effect in a special-effect group can be triggered in combination with the beat points, thereby improving user-experience satisfaction.
  • the method for detecting music beat points acquires a beat confidence for each frequency in each sub-band signal, and uses the beat confidence to calculate a weighted sum of the power values corresponding to all frequencies in each sub-band, so that the beat points to be confirmed are obtained according to the weighted sum. The accuracy of the beat points to be confirmed can therefore be further improved.
  • the power spectrum of each frame signal is divided into a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments. The detection method can therefore perform sub-band decomposition according to the types of the specific beat points in the music, and thus detect the beat points in the music signal more accurately.
  • after step S600, the method further includes:
  • the weak beat point being determined by:
  • the beat point position is the frame t corresponding to the found peak point.
  • the present invention provides a snare drum signal diagram obtained after step S500 in an embodiment.
  • the horizontal axis is time t
  • the vertical axis is power P
  • power P is the weighted summation value obtained in step S500.
  • P1 represents a strong beat point threshold power value
  • P2 represents the threshold power value.
  • the peak value obtained by scanning must be greater than P2 to be detected.
  • a peak point greater than P2 and less than or equal to P1 corresponds to a weak beat point, and a peak point greater than P1 corresponds to a strong beat point. Peak points with power values smaller than P2 are discarded.
  • the invention provides a solution that analyzes the beat point positions, beat types, and music type in a piece of music (a song) and automatically extracts a very important skeleton of the music, the beats; the extracted beat point positions, beat point types, and music type guide the trigger timing and trigger type of video special effects, so that music and video special effects combine well, in keeping with people's habits when listening to and watching music.
  • this work previously required a person to mark the beat points and their types in the music by hand, which is very tedious.
  • with the method described in the present invention, the beat points and their types in music can be marked automatically, with an accuracy above 90%.
  • the invention also provides a music classification method based on music beat points.
  • the method includes the steps of: detecting a beat point of the music using the method of detecting a beat point of the music described in any of the above embodiments; classifying the music according to the number of beat points in each subband.
  • classifying the music according to the number of beat points in each sub-band includes: counting, according to the number of beat points in each sub-band, the number of snare drum beat points and the number of kick drum beat points in the music signal; if the number of snare drum beat points is greater than a first threshold and the number of kick drum beat points is greater than the first threshold, classifying the music as strong-rhythm music; and if the number of kick drum beat points is less than a second threshold, classifying the music as lyrical music.
  • the counts of the three types of beat points obtained by the above method for detecting beat points can be used to classify the music type.
  • music in which the number of snare beat points > threshold 1 and the number of kick beat points > threshold 1 is music with a strong sense of rhythm.
  • music in which the number of kick beat points < threshold 2 is lyrical music.
  • threshold 1 and threshold 2 are set according to the numbers of snare drum beat points and kick drum beat points in the music being classified.
  • the music type is coarsely divided into two broad categories, strong-rhythm music and lyrical music, which can be handled with completely different special-effect classes, avoiding the triggering of many overly intense special effects in lyrical music and staying consistent with people's audio-visual habits.
  • the present invention also provides a storage device having a plurality of instructions stored thereon, the instructions being adapted to be loaded and executed by a processor to: perform framing processing on a music signal to obtain frame signals; acquire the power spectrum of the frame signals; perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands; perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquire the beat points of the music signal according to the power values of the beat points to be confirmed.
  • alternatively, the instructions are adapted to be loaded and executed by the processor to: detect the beat points of music using the method for detecting music beat points according to any of the above embodiments, and classify the music according to the number of beat points in each sub-band.
  • the storage device may be a medium that can store program codes, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a RAM, a magnetic disk, or an optical disk.
  • the instructions in the storage device provided by the present invention are loaded by the processor and perform the steps described in the method of detecting music beat points described in any of the above embodiments.
  • the instructions in the storage device provided by the present invention are loaded by the processor and execute the music classification method in any of the above embodiments.
  • the invention also provides a computer device.
  • the computer device includes: one or more processors; a memory; one or more applications, wherein the one or more applications are stored in the memory and configured to be configured by the one or more processors Executing, the one or more applications are configured to execute a method for detecting a music beat point or a music classification method according to any of the above embodiments in the device.
  • FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
  • the device described in this embodiment may be a computer device.
  • the device includes a processor 503, a memory 505, an input unit 507, and a display unit 509.
  • the device structure illustrated in FIG. 5 does not constitute a limitation on all such devices; the device may include more or fewer components than illustrated, or certain components may be combined.
  • the memory 505 can be used to store an application 501 and various functional modules, and the processor 503 runs an application 501 stored in the memory 505 to perform various functional applications and data processing of the device.
  • the memory can be internal or external, or both internal and external.
  • the internal memory may include a read only memory (ROM), a programmable ROM (PROM), an electrically programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a flash memory, or a random access memory.
  • the external storage may include a hard disk, a floppy disk, a ZIP disk, a USB disk, a magnetic tape, and the like.
  • the memories disclosed herein include, but are not limited to, these types of memories.
  • the memory disclosed herein is by way of example only, and not limitation.
  • the input unit 507 is for receiving an input of a signal and receiving a keyword input by the user.
  • the input unit 507 can include a touch panel as well as other input devices.
  • the touch panel can collect touch operations on or near it (such as operations performed by the user with a finger, a stylus, or any other suitable object or accessory on or near the touch panel) and drive the corresponding connection device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as play-control keys and switch keys), a trackball, a mouse, a joystick, and the like.
  • the display unit 509 can be used to display information input by the user or information provided to the user as well as various menus of the computer device.
  • the display unit 509 can take the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the processor 503 is the control center of the computer device; it connects the various parts of the whole computer using various interfaces and lines, runs or executes the software programs and/or modules stored in the memory 505, and calls the data stored in the memory to perform various functions and process data.
  • the device includes one or more processors 503, and one or more memories 505, one or more applications 501.
  • the one or more applications 501 are stored in the memory 505 and configured to be executed by the one or more processors 503, the one or more applications 501 being configured to perform the method for detecting music beat points or the music classification method described in the above embodiments.
  • each functional unit in each embodiment of the present invention may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated modules, if implemented in the form of software functional modules and sold or used as separate products, may also be stored in a computer readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

A music classification method, a beat point detection method, a storage device, and a computer device. The method for detecting music beat points comprises: performing framing processing on a music signal to obtain frame signals (S100); acquiring the power spectrum of the frame signals (S200); performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands (S300); performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band (S400); obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering (S500); and acquiring the beat points of the music signal according to the power values of the beat points to be confirmed (S600). The beat points of a music signal can thus be obtained by the method for detecting music beat points, so that a video special effect in a special-effect group can be triggered in combination with the beat points, improving user-experience satisfaction.

Description

Music classification method and beat point detection method, storage device and computer device

Technical Field

The present invention relates to the field of Internet technologies, and in particular to a music classification method, a beat point detection method, a storage device, and a computer device.

Background

With the rapid development of Internet technologies and live-video technologies, music effects have been added when playing short videos or live video streams. To improve the user experience, a video special-effect group suited to the music in a video can be recommended to the user according to the type of that music, enhancing the audio and visual appeal of the video.

However, in conventional video special-effect processing, the beat points of the music being played cannot be obtained, so the corresponding video special effects cannot be triggered according to those beat points. In video special-effect processing, special effects therefore cannot be personalized according to the music played in the video, which degrades user-experience satisfaction.

Summary

An object of the present invention is to provide a music classification method, a beat point detection method, a storage device, and a computer device that obtain the beat points in music, so that a video special effect in a special-effect group can be triggered according to the positions of the beat points, improving user-experience satisfaction.
The present invention proposes the following technical solutions:

A method for detecting music beat points, comprising the following steps: performing framing processing on a music signal to obtain frame signals; acquiring the power spectrum of the frame signals; performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands; performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquiring the beat points of the music signal according to the power values of the beat points to be confirmed.

In one embodiment, obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering includes: acquiring, according to the result of the time-frequency domain joint filtering, a beat confidence for each frequency in the signal of each sub-band; calculating, from the beat confidence of each frequency, a weighted sum of the power values corresponding to all frequencies in each sub-band; and obtaining the beat points to be confirmed according to the weighted sum.

In one embodiment, acquiring the beat points of the music signal according to the power values of the beat points to be confirmed includes: acquiring the beat points to be confirmed whose weighted sum is greater than a threshold power value, and taking those beat points to be confirmed as the beat points of the music signal.

In one embodiment, the threshold power value is determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed; and calculating the sum of the mean and twice the variance, and taking that sum as the threshold power value.

In one embodiment, after taking the beat points to be confirmed as the beat points of the music signal, the method further includes: acquiring strong beat points of the music signal according to a strong-beat threshold power value, the strong-beat threshold power value being determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed, and calculating the sum of the mean and three times the variance, which is taken as the strong-beat threshold power value; and acquiring weak beat points of the music signal, the weak beat points being determined by: acquiring, among the beat points of the music signal, the beat points whose power value is less than or equal to the strong-beat threshold power value and greater than the threshold power value, and taking those beat points as the weak beat points of the music signal.

In one embodiment, performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands includes: performing sub-band decomposition on the power spectrum to divide it into four sub-bands, the four sub-bands including a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments.

In one embodiment, the frequency band of the first sub-band is 120 Hz to 3 kHz, the frequency band of the second sub-band is 3 kHz to 10 kHz, and the frequency band of the third sub-band is 10 kHz to fs/2 Hz, where fs is the sampling frequency of the signal.

In one embodiment, performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band includes: performing time-frequency domain joint filtering on the signal of each sub-band using parameters corresponding to the beat type, according to the beat types that the first sub-band, the second sub-band, the third sub-band, and the fourth sub-band are respectively used to detect.

In one embodiment, the parameters corresponding to the beat type are determined as follows: the parameters of each sub-band are set according to the temporal characteristics and the harmonic-distribution characteristics of the beat points of the beat-type instrument to be detected in that sub-band and of the other interfering signals different from those beat points.

A music classification method based on music beat points, comprising the steps of: detecting the beat points of music using the method for detecting music beat points according to any of the above embodiments; and classifying the music according to the number of beat points in each sub-band.

In one embodiment, classifying the music according to the number of beat points in each sub-band includes: counting, according to the number of beat points in each sub-band, the number of snare drum beat points and the number of kick drum beat points in the music signal; if the number of snare drum beat points is greater than a first threshold and the number of kick drum beat points is greater than the first threshold, classifying the music as strong-rhythm music; and if the number of kick drum beat points is less than a second threshold, classifying the music as lyrical music.

A storage device having a plurality of instructions stored thereon, the instructions being adapted to be loaded and executed by a processor to: perform framing processing on a music signal to obtain frame signals; acquire the power spectrum of the frame signals; perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands; perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquire the beat points of the music signal according to the power values of the beat points to be confirmed; or, the instructions being adapted to be loaded and executed by a processor to: detect the beat points of music using the method for detecting music beat points according to any of the above embodiments, and classify the music according to the number of beat points in each sub-band.

A computer device, comprising: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors; the one or more application programs are configured to perform the method for detecting music beat points according to any of the above embodiments, or the one or more application programs are configured to perform the music classification method according to any of the above embodiments.
Compared with the prior art, the solutions of the present invention have the following advantages:

In the method for detecting music beat points provided by the present invention, framing processing is first performed on the music signal, the power spectrum of each frame signal is acquired, and the power spectrum is then decomposed into sub-bands. Each sub-band corresponds to one beat type, and time-frequency domain joint filtering is performed separately for the different sub-bands. Beat points to be confirmed can be obtained from the filtering result, and the beat points of the music signal are then determined according to the power value of each beat point to be confirmed. The beat points of a music signal can therefore be obtained by the method for detecting music beat points of the present invention, so that a video special effect in a special-effect group can be triggered in combination with the beat points, improving user-experience satisfaction.

Further, the method for detecting music beat points acquires a beat confidence for each frequency in each sub-band signal, and uses the beat confidence to calculate a weighted sum of the power values corresponding to all frequencies in each sub-band, so that the beat points to be confirmed are obtained according to the weighted sum. The accuracy of the beat points to be confirmed can therefore be further improved.

Moreover, in the method for detecting music beat points, the power spectrum of each frame signal is divided into a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments. The detection method can therefore perform sub-band decomposition according to the types of the specific beat points in the music, and thus detect the beat points in the music signal more accurately.
Brief Description of the Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic diagram of the interaction between a server and a client according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for detecting music beat points according to an embodiment of the present invention;

FIG. 3 is a flowchart of step S500 according to an embodiment of the present invention;

FIG. 4 is a snare drum signal diagram obtained after step S500 according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an embodiment of a computer device according to the present invention.
Detailed Description

Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, where identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.

The method for detecting music beat points and the music classification method based on music beat points provided by the present invention are applied in the application environment shown in FIG. 1.

As shown in FIG. 1, the server 100 and the client 300 are located in the same network 200 environment, and the server 100 and the client 300 exchange data through the network 200. The numbers of servers 100 and clients 300 are not limited; FIG. 1 is only an illustrative example. An APP (application) is installed in the client 300, through which the user can exchange information with the corresponding server 100.

The server 100 may be, but is not limited to, a web server, a management server, an application server, a database server, a cloud server, and the like. The client 300 may be, but is not limited to, a smartphone, a personal computer (PC), a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), and the like. The operating system of the client 300 may be, but is not limited to, an Android system, an iOS (iPhone operating system) system, a Windows Phone system, a Windows system, and the like.

After the user selects a piece of music (a song) or uploads one in a video APP on the client 300, the server 100 analyzes and estimates the music, delivers to the user's client 300 a recommended video special-effect group suited to the estimated music type, and triggers one of the video special effects in the group at the estimated beat point time positions. The present invention provides a method for detecting music beat points, which detects the beat points of the music uploaded or selected by the user, so that the corresponding video special effects can be triggered according to the beat points of the music, improving user-experience satisfaction.
The present invention provides a method for detecting music beat points. In one embodiment, as shown in FIG. 2, the method for detecting music beat points of the present invention includes the following steps:

S100: Perform framing processing on a music signal to obtain frame signals.

In this embodiment, the server acquires the music signal to be detected and performs framing processing on it to obtain a plurality of frame signals of the music signal. The music signal may be a music signal uploaded by the user or a music signal in a server database.

In one implementation, the server first pre-processes the input music signal. The pre-processing includes decoding the input music signal, stereo-to-mono conversion, sample-rate conversion, removal of the DC component, and other necessary pre-processing operations. The pre-processing here is a routine operation and is not described in detail. Further, the server performs framing processing on the music signal to obtain a plurality of frame signals.

S200: Acquire the power spectrum of the frame signals.

In this embodiment, after acquiring the frame signals of the music signal, the server further acquires the power spectrum of each frame signal. Specifically, when the server frames the music signal, each frame is N points long and is advanced by M points at a time (M < N, M/N = 0.25~0.5), so that overlap = N - M. After framing, each N-point frame signal is windowed, and an FFT (Fast Fourier Transformation) is then applied to each frame, giving the power spectrum P(t,k) of each frame signal. The above procedure for obtaining the power spectrum is a routine operation in signal processing and is not described in detail here.
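By way of a non-limiting illustration (a sketch, not the patented implementation), steps S100 and S200 can be expressed in Python as follows; N = 1024, M = 512, and the Hann window are assumed example values consistent with the M/N = 0.25~0.5 constraint above, not values fixed by the text.

    import numpy as np

    def power_spectrogram(x, N=1024, M=512):
        """Frame a mono signal x (frame length N, hop M), window each frame,
        and return the power spectrum P[t, k] = |FFT|^2 of every frame."""
        assert 0.25 <= M / N <= 0.5            # hop constraint from the text
        window = np.hanning(N)                 # assumed window choice
        n_frames = 1 + (len(x) - N) // M       # overlap = N - M samples
        P = np.empty((n_frames, N // 2 + 1))   # bins k = 0 .. N/2
        for t in range(n_frames):
            frame = x[t * M : t * M + N] * window
            P[t] = np.abs(np.fft.rfft(frame)) ** 2
        return P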
S300: Perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands.

In this embodiment, the server decomposes the power spectrum corresponding to each frame signal into at least two sub-bands, each sub-band being used to detect one type of beat point. Specifically, the server analyzes the spectrum of the music signal and performs sub-band decomposition on it according to the frequency-response characteristics of the beat-type instruments commonly used in music.

In one implementation, the power spectrum is decomposed into four sub-bands: a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments. The frequency band of the first sub-band is 0 Hz to 120 Hz, that of the second sub-band is 120 Hz to 3 kHz, that of the third sub-band is 3 kHz to 10 kHz, and that of the fourth sub-band is 10 kHz to fs/2 Hz, where fs is the sampling frequency of the signal.

In this implementation, the sub-band boundaries of the power spectrum are chosen mainly because the kick drum, the snare drum, and the other beat instruments (high-frequency beat instrument beat points) differ greatly not only in frequency response but also in duration. The kick drum's energy is concentrated mainly in the low-frequency sub-band, but that sub-band often also contains non-beat instruments such as bass, and a bass note lasts much longer than a kick drum hit. The snare drum's energy is concentrated mainly in the mid-frequency sub-bands; the sub-band below 3 kHz is interfered with by signals such as vocals, while the sub-band above 3 kHz is interfered with mainly by other accompaniment instruments. In both mid-frequency sub-bands the duration of the snare drum is clearly shorter than that of the other interfering signals, and the durations of the interfering signals below and above 3 kHz differ markedly, so different strategies are needed for the time-frequency domain joint filtering. The high-frequency sub-band often contains the sounds of melody-type instruments with very long durations, which again differ from the accompaniment-instrument and vocal characteristics of the mid-frequency sub-bands.
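As a sketch of the sub-band assignment implied above, each frequency bin k (center frequency k*fs/N, as noted later in step S500) can be mapped to one of the four sub-bands of this embodiment; the band edges are those given above, while the function layout is an illustrative assumption.

    import numpy as np

    # Band edges in Hz: kick drum / snare (lower) / snare (upper) /
    # high-frequency beat instruments, per the embodiment above.
    EDGES_HZ = [0.0, 120.0, 3000.0, 10000.0]

    def bin_to_subband(N, fs):
        """Map FFT bin k = 0..N/2 to sub-band index 0..3."""
        freqs = np.arange(N // 2 + 1) * fs / N     # center frequency of bin k
        return np.searchsorted(EDGES_HZ, freqs, side="right") - 1

For example, with fs = 44100 Hz and N = 1024, bins 0 to 2 fall in the first sub-band, and bin 233 (about 10 kHz) and above fall in the fourth.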
S400: Perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band.

In this embodiment, after decomposing the power spectrum corresponding to each frame signal into sub-bands, the server performs time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band. Specifically, when the server decomposes the power spectrum of the frame signal into the four sub-bands described in step S300, the signal of each sub-band is filtered in the time-frequency domain using parameters corresponding to the beat type, according to the beat types that the first, second, third, and fourth sub-bands are respectively used to detect. The parameters corresponding to the beat type are determined as follows: the parameters of each sub-band are set according to the temporal characteristics and the harmonic-distribution characteristics of the beat points of the beat-type instrument to be detected in that sub-band and of the other interfering signals different from those beat points.

In this step, when the server filters the signal of each sub-band with the parameters corresponding to the beat type, those parameters may be obtained before the method for detecting music beat points of the present invention is carried out, from the temporal and harmonic-distribution characteristics of the detected beat-instrument beat points and of the other interfering signals different from those beat points; alternatively, they may be obtained by the server from the same characteristics while the method of the present invention is being carried out.

In this implementation, the concrete steps of the time-frequency domain joint filtering can be described as follows:

For the current frame signal P(t,k), take the hi frame signals before it and the hi frame signals after it; for each frequency bin k this forms a time-domain window [P(t-hi,k), …, P(t+hi,k)], which is smoothed with a suitable smoothing window wi to obtain P_smt(t,k).

For the current frame signal P(t,k), take, for each frequency bin k, the hj bins below it and the hj bins above it, forming a frequency-domain window [P(t,k-hj), …, P(t,k+hj)], which is smoothed with a suitable smoothing window wj to obtain P_smf(t,k).

For the different sub-bands, the operation steps of the above time-frequency domain joint filtering are the same, but the parameter values of hi and hj differ. The choice of hi and hj is jointly determined by the duration characteristics and the harmonic-distribution characteristics of the beat-type instruments and the other melody-type interfering signals falling within each sub-band. For each frequency bin k, the sub-band to which it belongs is determined, and the parameters set for that sub-band are used for filtering.

The smoothing windows wi and wj may use mean filtering or median filtering, or Gaussian-window filtering. The embodiments of the present invention mainly smooth (low-pass filter) the frame signals jointly in the time-frequency domain; other filtering methods may also be used in other implementations.
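A sketch of the two smoothing passes follows, reusing the bin-to-sub-band mapping from the previous sketch; median filtering is one of the window types permitted above, and the per-sub-band half-widths hi and hj are illustrative placeholders, since the text leaves their values to the characteristics of each sub-band.

    import numpy as np
    from scipy.ndimage import median_filter

    def tf_joint_smooth(P, subband, hi=(8, 6, 4, 12), hj=(2, 2, 3, 3)):
        """Time smoothing P_smt and frequency smoothing P_smf of P(t, k),
        with half-widths hi[b], hj[b] chosen per sub-band b (example values)."""
        P_smt = np.empty_like(P)
        P_smf = np.empty_like(P)
        for b in range(4):
            cols = subband == b                 # bins belonging to sub-band b
            # smooth along time over [t-hi, t+hi] (window of 2*hi+1 frames)
            P_smt[:, cols] = median_filter(P[:, cols], size=(2 * hi[b] + 1, 1))
            # smooth along frequency over [k-hj, k+hj] (window of 2*hj+1 bins)
            P_smf[:, cols] = median_filter(P[:, cols], size=(1, 2 * hj[b] + 1))
        return P_smt, P_smf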
S500: Obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering.

In this embodiment, the server can obtain the beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering. In one implementation, as shown in FIG. 3, step S500 includes the following steps:

S510: Acquire, according to the result of the time-frequency domain joint filtering, the beat confidence of each frequency in the signal of each sub-band.

S530: Calculate, from the beat confidence of each frequency, a weighted sum of the power values corresponding to all frequencies in each sub-band.

S550: Obtain the beat points to be confirmed according to the weighted sum.

In one implementation, the beat confidence of each frequency in the signal of each sub-band, and the confidence of the other non-beat melody-type components, can be calculated as follows.

For the current frame signal P(t,k), for each frequency k, the confidence that it belongs to a beat can be given from the result of the time-frequency domain joint filtering (i.e., Wiener filtering), where k denotes the frequency:

B(t,k) = P_smf(t,k)*P_smf(t,k) / (P_smf(t,k)*P_smf(t,k) + P_smt(t,k)*P_smt(t,k)).

Correspondingly, the confidence that it is a melody-type component is:

H(t,k) = P_smt(t,k)*P_smt(t,k) / (P_smf(t,k)*P_smf(t,k) + P_smt(t,k)*P_smt(t,k)) = 1 - B(t,k).

Further, the current frame signal P(t,k) is weighted and summed separately according to the type of beat point, as follows:

Kick(t) = sum(P(t,k)*B(t,k)), k ∈ sub-band 1, for detecting the kick drum;

Snare(t) = sum(P(t,k)*B(t,k)), k ∈ sub-bands 2 and 3, for detecting the snare drum;

Beat(t) = sum(P(t,k)*B(t,k)), k ∈ sub-band 4, for detecting the other beat points.

P(t,k) is the power spectrum obtained after taking the STFT (short-time Fourier transform) of the signal; P(t,k)*B(t,k) expresses the weighting of the power spectrum, where B(t,k) is the confidence that the signal is a beat at the k-th frequency of the t-th frame. The confidence is a value between 0 and 1; when it is multiplied by the power spectrum of the signal, the power spectrum P(t,k) belonging to beats is preserved, while the power spectrum P(t,k) not belonging to beats is suppressed (scaled down by the multiplication).

After weighting, the weighted power spectra are summed, with k summed according to the sub-band division. For example, at time t = t1, after STFT analysis k ranges over 1~N/2+1, that is, there are the values P(t1,1), P(t1,2), …, P(t1,N/2+1); the frequency corresponding to each bin k is k*fs/N, so the sub-band to which k belongs is also known. For example, if k = 1~10 belong to sub-band 1 (the kick drum sub-band) and k = 20~50 belong to sub-band 2 (the snare drum sub-band), and so on, then adding up P(t1,1)*B(t1,1), P(t1,2)*B(t1,2), …, P(t1,10)*B(t1,10) is the weighted summation over sub-band 1 (kick drum), giving Kick(t1). Applying this processing to all frames yields Kick(1), Kick(2), …, Kick(L), where L is determined by the length of the music signal.
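Putting the formulas above together, a sketch of the confidence weighting and the per-type summation follows, reusing P_smt, P_smf, and the sub-band mapping from the earlier sketches; the small eps term is an added numerical guard against an all-zero denominator, not part of the text.

    import numpy as np

    def beat_curves(P, P_smt, P_smf, subband, eps=1e-12):
        """B(t,k) = P_smf^2 / (P_smf^2 + P_smt^2); Kick/Snare/Beat are the
        weighted sums of P(t,k)*B(t,k) over each sub-band's bins."""
        B = P_smf ** 2 / (P_smf ** 2 + P_smt ** 2 + eps)
        W = P * B                                  # weighted power spectrum
        kick = W[:, subband == 0].sum(axis=1)      # sub-band 1: kick drum
        snare = W[:, (subband == 1) | (subband == 2)].sum(axis=1)  # bands 2+3
        beat = W[:, subband == 3].sum(axis=1)      # sub-band 4: other beats
        return kick, snare, beat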
S600: Acquire the beat points of the music signal according to the power values of the beat points to be confirmed.

In this embodiment, after obtaining the beat points to be confirmed, the server acquires the beat points of the music signal according to the power values corresponding to those beat points. Specifically, as described in step S500, after the server has calculated the weighted sum of the power values corresponding to all frequencies in each sub-band, it further acquires the beat points to be confirmed whose weighted sum is greater than the threshold power value, and takes those beat points to be confirmed as the beat points of the music signal. The threshold power value is determined by: acquiring the mean and the variance of the power values of all the beat points to be confirmed; and calculating the sum of the mean and twice the variance, and taking that sum as the threshold power value.

In one specific implementation, for the Kick, Snare, and Beat curves acquired in step S500 (Kick, Snare, and Beat abbreviate Kick(t), Snare(t), and Beat(t), respectively), Kick, Snare, and Beat are each scanned to find all peak points, and the peak points whose power value is greater than the threshold power value T1 = mean + std*2 (where mean is the mean of the power values of all peak points and std is the variance of the power values of all peak points) are the detected beat points. A beat point detected in Kick is marked as a kick drum, one detected in Snare is marked as a snare drum, and one detected in Beat is marked as another beat point (a high-frequency beat instrument beat point).
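The peak scan and thresholding can be sketched as below; SciPy's peak finder stands in for the scanning step, and the text's "variance" (std) is interpreted as the standard deviation, which the form T1 = mean + std*2 suggests. Both are assumptions of this sketch.

    import numpy as np
    from scipy.signal import find_peaks

    def detect_beats(curve):
        """Find all peaks of a detection curve (Kick, Snare, or Beat) and
        keep those above T1 = mean + 2*std of the power values of all peaks."""
        peaks, _ = find_peaks(curve)     # frame indices of all peak points
        vals = curve[peaks]
        T1 = vals.mean() + 2 * vals.std()
        return peaks[vals > T1]          # beat point positions (frames t)

Running detect_beats on each of the Kick, Snare, and Beat curves then yields the beat points marked as kick drum, snare drum, and other (high-frequency) beat points, respectively.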
In the method for detecting music beat points provided by the present invention, framing processing is first performed on the music signal, the power spectrum of each frame signal is acquired, and the power spectrum is then decomposed into sub-bands. Each sub-band corresponds to one beat type, and time-frequency domain joint filtering is performed separately for the different sub-bands. Beat points to be confirmed can be obtained from the filtering result, and the beat points of the music signal are then determined according to the power value of each beat point to be confirmed. The beat points of a music signal can therefore be obtained by the method for detecting music beat points of the present invention, so that a video special effect in a special-effect group can be triggered in combination with the beat points, improving user-experience satisfaction.

Further, the method for detecting music beat points acquires a beat confidence for each frequency in each sub-band signal, and uses the beat confidence to calculate a weighted sum of the power values corresponding to all frequencies in each sub-band, so that the beat points to be confirmed are obtained according to the weighted sum. The accuracy of the beat points to be confirmed can therefore be further improved.

Moreover, in the method for detecting music beat points, the power spectrum of each frame signal is divided into a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments. The detection method can therefore perform sub-band decomposition according to the types of the specific beat points in the music, and thus detect the beat points in the music signal more accurately.
In one embodiment, after step S600 the method further includes:

acquiring strong beat points of the music signal according to a strong-beat threshold power value, the strong-beat threshold power value being determined by:

acquiring the mean and the variance of the power values of all the beat points to be confirmed; and

calculating the sum of the mean and three times the variance, and taking that sum as the strong-beat threshold power value; and

acquiring weak beat points of the music signal, the weak beat points being determined by:

acquiring, among the beat points of the music signal, the beat points whose power value is less than or equal to the strong-beat threshold power value and greater than the threshold power value, and taking those beat points as the weak beat points of the music signal.

Specifically, as described in step S600, the beat points whose peak power value is greater than the strong-beat threshold power value T2 = mean + std*3 are strong beat points, and the beat points whose peak power value is smaller than the strong-beat threshold power value and greater than or equal to the threshold power value T1 = mean + std*2 are weak beat points. The beat point position is the frame t corresponding to the peak point found.

In summary, as shown in FIG. 4, the present invention gives the snare drum signal diagram obtained after step S500 in one embodiment. The horizontal axis is time t and the vertical axis is power P, where P is the weighted sum obtained in step S500. As shown in FIG. 4, the signal curve has many peaks, all of which can be found by scanning the curve. P1 denotes the strong-beat threshold power value and P2 denotes the threshold power value. A scanned peak point can be detected only if its power value is greater than P2; the beats corresponding to peak points greater than P2 and less than or equal to P1 are weak beat points, the beats corresponding to peak points greater than P1 are strong beat points, and peak points with power values smaller than P2 are discarded.
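The grading against P1 (T2 = mean + std*3) and P2 (T1 = mean + std*2) shown in FIG. 4 can be sketched as follows, under the same interpretation of std as in the previous sketch.

    import numpy as np
    from scipy.signal import find_peaks

    def grade_beats(curve):
        """Split the peaks of a detection curve into strong beat points
        (power > T2) and weak beat points (T1 < power <= T2); peaks at or
        below T1 are discarded."""
        peaks, _ = find_peaks(curve)
        vals = curve[peaks]
        T1 = vals.mean() + 2 * vals.std()      # threshold power value (P2)
        T2 = vals.mean() + 3 * vals.std()      # strong-beat threshold (P1)
        strong = peaks[vals > T2]
        weak = peaks[(vals > T1) & (vals <= T2)]
        return strong, weak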
The solution provided by the present invention analyzes the beat point positions, beat types, and music type in a piece of music (a song) and automatically extracts a very important skeleton of the music, the beats. The extracted beat point positions, beat point types, and music type guide the trigger timing and trigger type of video special effects, so that music and video special effects combine well, in keeping with people's habits when listening to and watching music. This work previously required a person to mark the beat points and their types in the music by hand, which is very tedious. With the method described in the present invention, the beat points and their types in music can be marked automatically, with an accuracy above 90%.
The present invention also provides a music classification method based on music beat points. The method includes the steps of: detecting the beat points of music using the method for detecting music beat points described in any of the above embodiments; and classifying the music according to the number of beat points in each sub-band.

Classifying the music according to the number of beat points in each sub-band includes: counting, according to the number of beat points in each sub-band, the number of snare drum beat points and the number of kick drum beat points in the music signal; if the number of snare drum beat points is greater than a first threshold and the number of kick drum beat points is greater than the first threshold, classifying the music as strong-rhythm music; and if the number of kick drum beat points is less than a second threshold, classifying the music as lyrical music.

Specifically, the counts of the three types of beat points obtained by the above method for detecting music beat points can be used to classify the music type. Music in which the number of snare drum beat points > threshold 1 and the number of kick drum beat points > threshold 1 is music with a strong sense of rhythm; music in which the number of kick drum beat points < threshold 2 is lyrical music. Threshold 1 and threshold 2 are set according to the numbers of snare drum beat points and kick drum beat points in the music being classified.
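A sketch of this decision rule follows; threshold 1 and threshold 2 are left as parameters because the text sets them empirically, and the label returned when neither rule fires is an assumption of the sketch.

    def classify_music(n_snare, n_kick, thresh1, thresh2):
        """Coarse music-type decision from the counts of snare drum and
        kick drum beat points, per the rules above."""
        if n_snare > thresh1 and n_kick > thresh1:
            return "strong-rhythm music"
        if n_kick < thresh2:
            return "lyrical music"
        return "unclassified"            # neither rule applies (assumption)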
In application, coarsely dividing music into the two broad categories of strong-rhythm music and lyrical music makes it possible to use completely different special-effect classes for each, avoiding the triggering of many overly intense special effects in lyrical music and staying consistent with people's audio-visual habits.
The present invention also provides a storage device having a plurality of instructions stored thereon, the instructions being adapted to be loaded and executed by a processor to: perform framing processing on a music signal to obtain frame signals; acquire the power spectrum of the frame signals; perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands; perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band; obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and acquire the beat points of the music signal according to the power values of the beat points to be confirmed.

Alternatively, the instructions are adapted to be loaded and executed by a processor to: detect the beat points of music using the method for detecting music beat points described in any of the above embodiments; and classify the music according to the number of beat points in each sub-band.

Further, the storage device may be any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a RAM, a magnetic disk, or an optical disc.

In other embodiments, the instructions in the storage device provided by the present invention are loaded by a processor to execute the steps of the method for detecting music beat points described in any of the above embodiments, or the instructions in the storage device provided by the present invention are loaded by a processor to execute the music classification method of any of the above embodiments.
The present invention also provides a computer device. The computer device includes: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more application programs being configured to perform the method for detecting music beat points or the music classification method described in any of the above embodiments.

FIG. 5 is a schematic structural diagram of a computer device in an embodiment of the present invention. The device described in this embodiment may be a computer device, for example a server, a personal computer, or a network device. As shown in FIG. 5, the device includes a processor 503, a memory 505, an input unit 507, a display unit 509, and other components. Those skilled in the art will understand that the structural components shown in FIG. 5 do not constitute a limitation on all devices; more or fewer components than illustrated may be included, or certain components may be combined. The memory 505 may be used to store the application program 501 and the functional modules, and the processor 503 runs the application program 501 stored in the memory 505 to perform the various functional applications and data processing of the device. The memory may be an internal memory or an external memory, or include both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random-access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, a magnetic tape, and the like. The memories disclosed herein include, but are not limited to, these types of memory, and are given by way of example only and not as a limitation.

The input unit 507 is used to receive signal input and to receive keywords input by the user. The input unit 507 may include a touch panel and other input devices. The touch panel can collect touch operations on or near it (such as operations performed by the user with a finger, a stylus, or any other suitable object or accessory on or near the touch panel) and drive the corresponding connection device according to a preset program; the other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as play-control keys and switch keys), a trackball, a mouse, a joystick, and the like. The display unit 509 may be used to display information input by the user or information provided to the user, as well as the various menus of the computer device. The display unit 509 may take the form of a liquid-crystal display, an organic light-emitting diode display, or the like. The processor 503 is the control center of the computer device; it connects the various parts of the whole computer using various interfaces and lines, and performs various functions and processes data by running or executing the software programs and/or modules stored in the memory 505 and calling the data stored in the memory.

In one implementation, the device includes one or more processors 503, one or more memories 505, and one or more application programs 501, where the one or more application programs 501 are stored in the memory 505 and configured to be executed by the one or more processors 503, the one or more application programs 501 being configured to perform the method for detecting music beat points or the music classification method described in the above embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.

Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a memory, a magnetic disk, an optical disc, and the like.

The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may also make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (13)

  1. A method for detecting music beat points, characterized by comprising the following steps:
    performing framing processing on a music signal to obtain frame signals;
    acquiring the power spectrum of the frame signals;
    performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands;
    performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band;
    obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and
    acquiring the beat points of the music signal according to the power values of the beat points to be confirmed.
  2. The method for detecting music beat points according to claim 1, characterized in that obtaining beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering comprises:
    acquiring, according to the result of the time-frequency domain joint filtering, a beat confidence for each frequency in the signal of each sub-band;
    calculating, from the beat confidence of each frequency, a weighted sum of the power values corresponding to all frequencies in each sub-band; and
    obtaining the beat points to be confirmed according to the weighted sum.
  3. The method for detecting music beat points according to claim 2, characterized in that acquiring the beat points of the music signal according to the power values of the beat points to be confirmed comprises:
    acquiring the beat points to be confirmed whose weighted sum is greater than a threshold power value, and taking those beat points to be confirmed as the beat points of the music signal.
  4. The method for detecting music beat points according to claim 3, characterized in that the threshold power value is determined by:
    acquiring the mean and the variance of the power values of all the beat points to be confirmed; and
    calculating the sum of the mean and twice the variance, and taking that sum as the threshold power value.
  5. The method for detecting music beat points according to claim 4, characterized in that, after taking the beat points to be confirmed as the beat points of the music signal, the method further comprises:
    acquiring strong beat points of the music signal according to a strong-beat threshold power value, the strong-beat threshold power value being determined by:
    acquiring the mean and the variance of the power values of all the beat points to be confirmed; and
    calculating the sum of the mean and three times the variance, and taking that sum as the strong-beat threshold power value; and
    acquiring weak beat points of the music signal, the weak beat points being determined by:
    acquiring, among the beat points of the music signal, the beat points whose power value is less than or equal to the strong-beat threshold power value and greater than the threshold power value, and taking those beat points as the weak beat points of the music signal.
  6. The method for detecting music beat points according to claim 1, characterized in that performing sub-band decomposition on the power spectrum to divide it into at least two sub-bands comprises:
    performing sub-band decomposition on the power spectrum to divide it into four sub-bands;
    wherein the four sub-bands comprise a first sub-band for detecting kick drum beat points, a second sub-band for detecting snare drum beat points, a third sub-band for detecting snare drum beat points, and a fourth sub-band for detecting beat points of high-frequency beat instruments.
  7. The method for detecting music beat points according to claim 6, characterized in that the frequency band of the first sub-band is 120 Hz to 3 kHz, the frequency band of the second sub-band is 3 kHz to 10 kHz, and the frequency band of the third sub-band is 10 kHz to fs/2 Hz, where fs is the sampling frequency of the signal.
  8. The method for detecting music beat points according to claim 6, characterized in that performing time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band comprises:
    performing time-frequency domain joint filtering on the signal of each sub-band using parameters corresponding to the beat type, according to the beat types that the first sub-band, the second sub-band, the third sub-band, and the fourth sub-band are respectively used to detect.
  9. The method for detecting music beat points according to claim 8, characterized in that the parameters corresponding to the beat type are determined by:
    setting the parameters of each sub-band according to the temporal characteristics and the harmonic-distribution characteristics of the beat points of the beat-type instrument to be detected in that sub-band and of the other interfering signals different from those beat points.
  10. A music classification method based on music beat points, characterized by comprising the steps of:
    detecting the beat points of music using the method for detecting music beat points according to any one of claims 1 to 9; and
    classifying the music according to the number of beat points in each sub-band.
  11. The music classification method according to claim 10, characterized in that classifying the music according to the number of beat points in each sub-band comprises:
    counting, according to the number of beat points in each sub-band, the number of snare drum beat points and the number of kick drum beat points in the music signal;
    if the number of snare drum beat points is greater than a first threshold and the number of kick drum beat points is greater than the first threshold, classifying the music as strong-rhythm music; and
    if the number of kick drum beat points is less than a second threshold, classifying the music as lyrical music.
  12. A storage device, characterized in that a plurality of instructions are stored thereon, the instructions being adapted to be loaded and executed by a processor to:
    perform framing processing on a music signal to obtain frame signals;
    acquire the power spectrum of the frame signals;
    perform sub-band decomposition on the power spectrum to divide it into at least two sub-bands;
    perform time-frequency domain joint filtering on the signal of each sub-band according to the beat type corresponding to each sub-band;
    obtain beat points to be confirmed from the frame signals of the music signal according to the result of the time-frequency domain joint filtering; and
    acquire the beat points of the music signal according to the power values of the beat points to be confirmed; or,
    the instructions being adapted to be loaded and executed by a processor to:
    detect the beat points of music using the method for detecting music beat points according to any one of claims 1 to 9; and
    classify the music according to the number of beat points in each sub-band.
  13. A computer device, characterized by comprising:
    one or more processors;
    a memory; and
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors;
    the one or more application programs being configured to perform the method for detecting music beat points according to any one of claims 1 to 9; or, the one or more application programs being configured to perform the music classification method according to claim 10 or 11.
PCT/CN2018/119112 2018-01-09 2018-12-04 Music classification method and beat point detection method, storage device and computer device WO2019137115A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP18900195.1A EP3723080A4 (en) 2018-01-09 2018-12-04 METHOD OF CLASSIFYING MUSIC AND METHOD FOR DETECTING RHYTHM POINTS, STORAGE DEVICE AND COMPUTER DEVICE
RU2020126263A RU2743315C1 (ru) 2018-01-09 2018-12-04 Способ классификации музыки и способ детектирования долей музыкального такта, носитель данных и компьютерное устройство
US16/960,692 US11715446B2 (en) 2018-01-09 2018-12-04 Music classification method and beat point detection method, storage device and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810019193.3A 2018-01-09 Music classification method and beat point detection method, storage device and computer device
CN201810019193.3 2018-01-09

Publications (1)

Publication Number Publication Date
WO2019137115A1 (zh)

Family

ID=62894868

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/119112 WO2019137115A1 (zh) 2018-12-04 Music classification method and beat point detection method, storage device and computer device

Country Status (5)

Country Link
US (1) US11715446B2 (zh)
EP (1) EP3723080A4 (zh)
CN (1) CN108320730B (zh)
RU (1) RU2743315C1 (zh)
WO (1) WO2019137115A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111415644A (zh) * 2020-03-26 2020-07-14 腾讯音乐娱乐科技(深圳)有限公司 Audio soothing-degree prediction method and apparatus, server, and storage medium
CN112489676A (zh) * 2020-12-15 2021-03-12 腾讯音乐娱乐科技(深圳)有限公司 Model training method, apparatus, device, and storage medium
CN115240619A (zh) * 2022-06-23 2022-10-25 深圳市智岩科技有限公司 Audio rhythm detection method, smart lamp, apparatus, electronic device, and medium

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019043797A1 (ja) * 2017-08-29 2019-03-07 Pioneer DJ株式会社 Music analysis device and music analysis program
CN108320730B (zh) 2018-01-09 2020-09-29 广州市百果园信息技术有限公司 Music classification method and beat point detection method, storage device and computer device
KR102637599B1 (ko) * 2018-10-08 2024-02-19 주식회사 에이치엘클레무브 Apparatus and method for controlling lane changes using inter-vehicle communication information, and apparatus for calculating driving-tendency information therefor
CN109584902B (zh) 2018-11-30 2021-07-23 广州市百果园信息技术有限公司 Music rhythm determination method, apparatus, device, and storage medium
CN109670074B (zh) 2018-12-12 2020-05-15 北京字节跳动网络技术有限公司 Rhythm point identification method and apparatus, electronic device, and storage medium
CN109495786B (zh) 2018-12-20 2021-04-27 北京微播视界科技有限公司 Method and apparatus for pre-configuring video processing parameter information, and electronic device
CN110070884B (zh) 2019-02-28 2022-03-15 北京字节跳动网络技术有限公司 Audio onset detection method and apparatus
CN110688518A (zh) * 2019-10-12 2020-01-14 广州酷狗计算机科技有限公司 Method, apparatus, device, and storage medium for determining rhythm points
CN110890083B (zh) 2019-10-31 2022-09-02 北京达佳互联信息技术有限公司 Audio data processing method and apparatus, electronic device, and storage medium
CN110808069A (zh) * 2019-11-11 2020-02-18 上海瑞美锦鑫健康管理有限公司 Singing evaluation system and method
CN110853677B (zh) 2019-11-20 2022-04-26 北京雷石天地电子技术有限公司 Drum beat identification method and apparatus for songs, terminal, and non-transitory computer-readable storage medium
CN111048111B (zh) 2019-12-25 2023-07-04 广州酷狗计算机科技有限公司 Method, apparatus, device, and readable storage medium for detecting rhythm points in audio
CN111128232B (zh) 2019-12-26 2022-11-15 广州酷狗计算机科技有限公司 Method, apparatus, storage medium, and device for determining bar information of music
CN113223487B (zh) 2020-02-05 2023-10-17 字节跳动有限公司 Information identification method and apparatus, electronic device, and storage medium
CN112118482A (zh) * 2020-09-17 2020-12-22 广州酷狗计算机科技有限公司 Audio file playback method and apparatus, terminal, and storage medium
CN112489681A (zh) * 2020-11-23 2021-03-12 瑞声新能源发展(常州)有限公司科教城分公司 Beat identification method and apparatus, and storage medium
CN112435687A (zh) * 2020-11-25 2021-03-02 腾讯科技(深圳)有限公司 Audio detection method and apparatus, computer device, and readable storage medium
CN113223485B (zh) 2021-04-28 2022-12-27 北京达佳互联信息技术有限公司 Training method for a beat detection model, beat detection method, and apparatus
CN113727038B (zh) 2021-07-28 2023-09-05 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104346147A (zh) * 2013-07-29 2015-02-11 人人游戏网络科技发展(上海)有限公司 音乐游戏的节拍点的编辑方法及装置
CN104620313A (zh) * 2012-06-29 2015-05-13 诺基亚公司 音频信号分析
CN105513583A (zh) * 2015-11-25 2016-04-20 福建星网视易信息系统有限公司 一种歌曲节奏的显示方法及其系统
US9454948B2 (en) * 2014-04-30 2016-09-27 Skiptune, LLC Systems and methods for analyzing melodies
CN107545883A (zh) * 2017-10-13 2018-01-05 广州酷狗计算机科技有限公司 确定音乐的节奏快慢等级的方法和装置
CN108320730A (zh) * 2018-01-09 2018-07-24 广州市百果园信息技术有限公司 音乐分类方法及节拍点检测方法、存储设备及计算机设备

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4860624A (en) * 1988-07-25 1989-08-29 Meta-C Corporation Electronic musical instrument employing tru-scale interval system for prevention of overtone collisions
ID29029A (id) 1998-10-29 2001-07-26 Smith Paul Reed Guitars Ltd Method for rapidly finding the fundamental
US20070163425A1 (en) * 2000-03-13 2007-07-19 Tsui Chi-Ying Melody retrieval system
US6542869B1 (en) * 2000-05-11 2003-04-01 Fuji Xerox Co., Ltd. Method for automatic analysis of audio including music and speech
US7026536B2 (en) * 2004-03-25 2006-04-11 Microsoft Corporation Beat analysis of musical signals
US7236226B2 (en) * 2005-01-12 2007-06-26 Ulead Systems, Inc. Method for generating a slide show with audio analysis
WO2007072394A2 (en) * 2005-12-22 2007-06-28 Koninklijke Philips Electronics N.V. Audio structure analysis
TW200727170A (en) * 2006-01-09 2007-07-16 Ulead Systems Inc Method for generating a visualizing map of music
US7612275B2 (en) * 2006-04-18 2009-11-03 Nokia Corporation Method, apparatus and computer program product for providing rhythm information from an audio signal
JP4672613B2 (ja) 2006-08-09 2011-04-20 株式会社河合楽器製作所 Tempo detection device and computer program for tempo detection
JP4823804B2 (ja) 2006-08-09 2011-11-24 株式会社河合楽器製作所 Chord name detection device and chord name detection program
WO2008095190A2 (en) * 2007-02-01 2008-08-07 Museami, Inc. Music transcription
US20090063277A1 (en) * 2007-08-31 2009-03-05 Dolby Laboratiories Licensing Corp. Associating information with a portion of media content
JP5593608B2 (ja) 2008-12-05 2014-09-24 ソニー株式会社 Information processing device, melody line extraction method, bass line extraction method, and program
JP5282548B2 (ja) 2008-12-05 2013-09-04 ソニー株式会社 Information processing device, sound material extraction method, and program
CN101599271B (zh) 2009-07-07 2011-09-14 华中科技大学 Method for recognizing emotion in digital music
TWI484473B (zh) * 2009-10-30 2015-05-11 Dolby Int Ab 用於從編碼位元串流擷取音訊訊號之節奏資訊、及估算音訊訊號之知覺顯著節奏的方法及系統
TWI426501B (zh) * 2010-11-29 2014-02-11 Inst Information Industry 旋律辨識方法與其裝置
KR20130051386A (ko) 2011-11-09 2013-05-20 차희찬 Method for providing an instrument tuner using a smart device
JP5962218B2 (ja) 2012-05-30 2016-08-03 株式会社Jvcケンウッド Song order determination device, song order determination method, and song order determination program
GB201310861D0 (en) * 2013-06-18 2013-07-31 Nokia Corp Audio signal analysis
GB2518663A (en) * 2013-09-27 2015-04-01 Nokia Corp Audio analysis apparatus
CN108335687B (zh) 2017-12-26 2020-08-28 广州市百果园信息技术有限公司 Method and terminal for detecting kick drum beat points in an audio signal
CN109256146B (zh) 2018-10-30 2021-07-06 腾讯音乐娱乐科技(深圳)有限公司 Audio detection method, apparatus, and storage medium
CN110769309B (zh) 2019-11-04 2023-03-31 北京字节跳动网络技术有限公司 Method, apparatus, electronic device, and medium for displaying music points

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104620313A (zh) * 2012-06-29 2015-05-13 诺基亚公司 音频信号分析
CN104346147A (zh) * 2013-07-29 2015-02-11 人人游戏网络科技发展(上海)有限公司 音乐游戏的节拍点的编辑方法及装置
US9454948B2 (en) * 2014-04-30 2016-09-27 Skiptune, LLC Systems and methods for analyzing melodies
CN105513583A (zh) * 2015-11-25 2016-04-20 福建星网视易信息系统有限公司 一种歌曲节奏的显示方法及其系统
CN107545883A (zh) * 2017-10-13 2018-01-05 广州酷狗计算机科技有限公司 确定音乐的节奏快慢等级的方法和装置
CN108320730A (zh) * 2018-01-09 2018-07-24 广州市百果园信息技术有限公司 音乐分类方法及节拍点检测方法、存储设备及计算机设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3723080A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111415644A (zh) * 2020-03-26 2020-07-14 腾讯音乐娱乐科技(深圳)有限公司 Audio soothing-degree prediction method and apparatus, server, and storage medium
CN111415644B (zh) 2020-03-26 2023-06-20 腾讯音乐娱乐科技(深圳)有限公司 Audio soothing-degree prediction method and apparatus, server, and storage medium
CN112489676A (zh) * 2020-12-15 2021-03-12 腾讯音乐娱乐科技(深圳)有限公司 Model training method, apparatus, device, and storage medium
CN115240619A (zh) * 2022-06-23 2022-10-25 深圳市智岩科技有限公司 Audio rhythm detection method, smart lamp, apparatus, electronic device, and medium

Also Published As

Publication number Publication date
EP3723080A1 (en) 2020-10-14
RU2743315C1 (ru) 2021-02-17
EP3723080A4 (en) 2021-02-24
CN108320730A (zh) 2018-07-24
CN108320730B (zh) 2020-09-29
US11715446B2 (en) 2023-08-01
US20200357369A1 (en) 2020-11-12

Similar Documents

Publication Publication Date Title
WO2019137115A1 (zh) Music classification method and beat point detection method, storage device and computer device
EP3198247B1 (en) Device for capturing vibrations produced by an object and system for capturing vibrations produced by a drum.
JP5247855B2 (ja) Method and apparatus for multi-sensory speech enhancement
Goto A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals
JP6027087B2 (ja) Acoustic signal processing system and method for performing spectral behavior transformations
US20120128165A1 (en) Systems, method, apparatus, and computer-readable media for decomposition of a multichannel music signal
Brossier et al. Real-time temporal segmentation of note objects in music signals
EP2962299B1 (en) Audio signal analysis
Turchet et al. Real-time hit classification in a Smart Cajón
CA2999839C (en) Systems and methods for capturing and interpreting audio
JP2015200685A (ja) Attack position detection program and attack position detection device
Yao et al. Efficient vocal melody extraction from polyphonic music signals
Cantri et al. Cumulative Scores Based for Real-Time Music Beat Detection System
JP2019060976A (ja) Audio processing program, audio processing method, and audio processing device
JP4625934B2 (ja) Sound analysis device and program
Ramires Automatic Transcription of Drums and Vocalised percussion
JP4625935B2 (ja) Sound analysis device and program
Lagrange et al. Robust similarity metrics between audio signals based on asymmetrical spectral envelope matching
Fonseca Multi-channel approaches for musical audio content analysis
CN116524953A (zh) 音频检测方法、训练方法、装置、电子设备和存储介质
Grunberg Developing a Noise-Robust Beat Learning Algorithm for Music-Information Retrieval
Dimitrov Framework for Analyzing Sounds of Home Environment for Device Recognition
Rafii Source Separation by Repetition
Irigaray Transient and steady-state component separation for audio signals

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18900195

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018900195

Country of ref document: EP

Effective date: 20200710