CN113823268A - Intelligent music identification method and device - Google Patents


Info

Publication number
CN113823268A
Authority
CN
China
Prior art keywords
sound source
tone
sound
feature set
template library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111008903.0A
Other languages
Chinese (zh)
Inventor
林东姝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yiqi Network Technology Co ltd
Original Assignee
Beijing Yiqi Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yiqi Network Technology Co ltd filed Critical Beijing Yiqi Network Technology Co ltd
Priority to CN202111008903.0A
Publication of CN113823268A
Legal status: Pending


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/056 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/121 Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an intelligent music identification method comprising the following steps. Step 1: perform timbre feature extraction on the music, the extraction involving the timbre information of a sound source and a template library of sound sources. Step 2: extract the timbre information of the sound source, where the sound source is one appearing in the current music lesson. Step 3: build a feature set for the sound source and use it to represent the extracted timbre information. Step 4: calculate the Euclidean distance between the feature set of the sound source and the feature sets of sound sources with different timbres in the template library. Through intelligent timbre recognition, a teacher in the classroom can accurately identify and judge which instrument, or which person, produced a given sound, which greatly improves the quality and appeal of music teaching and the students' interest in learning.

Description

Intelligent music identification method and device
Technical Field
The invention belongs to the technical field of music identification, and particularly relates to a music intelligent identification method and device.
Background
Timbre, also called tone quality, is a characteristic of sound as perceived by hearing. Timbre refers to the distinctive character that different sounds always show in their waveforms; different vibrating objects have different characteristics. Because sounding bodies differ in material and structure, the timbres of the sounds they produce differ. For example, a piano, a violin, and a human voice all sound different, and every person's voice is different.
A complex tone is produced when a sounding body undergoes complex vibration; the resulting sound contains many frequencies. Plotting a spectrogram reveals the complex spectral structure of the tone: the lowest frequency is called the fundamental frequency (first partial), and the corresponding sound component is the fundamental tone. The components corresponding to frequencies above the fundamental are called overtones. When an overtone's frequency is an exact integer multiple of the fundamental frequency, it is called a harmonic. In the complex tones produced by instruments, the vibration energy is usually concentrated at the lowest frequency, so the pitch perceived by humans is generally determined by the fundamental frequency. The sounds produced by natural instruments are complex tones, and every person and every instrument has its own unique sound quality, an important characteristic of an acoustic fingerprint. Recognizing the timbre of a complex tone helps us judge accurately which instrument, or which person, produced a sound.
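As a worked illustration of the harmonic relationship described above (not part of the original disclosure), harmonic frequencies are exact integer multiples of the fundamental frequency; the 440 Hz fundamental below is an assumed example:

```python
# Harmonic frequencies are exact integer multiples of the fundamental frequency.
fundamental = 440.0  # assumed fundamental frequency in Hz (concert A4)
harmonics = [fundamental * k for k in range(1, 5)]
print(harmonics)  # [440.0, 880.0, 1320.0, 1760.0]
```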
In music teaching, a computer is typically used as an aid: audio is sampled by a device, and changes in the sound are recorded at a fixed rate, converting a continuous analog signal into a discrete digital signal. Timbre arises from differences in the signals produced by different sound sources. In a music classroom, it is often impossible to judge accurately from timbre alone which instrument or person a sound came from, which lowers teaching quality during music instruction.
Disclosure of Invention
The invention aims to provide an intelligent music identification method and device that solve the above technical problems.
In order to solve the above technical problems, the following technical solutions are provided:
an intelligent music identification method comprises the following steps: step 1: performing tone characteristic extraction on music, wherein the tone characteristic extraction comprises extracted tone information of a sound source and a template library of the sound source; the sound source template library is represented by TP, wherein TP is { TP1, TP2, TP3, … …, TPK }, and K is the total number of the templates; step 2: extracted timbre of sound sourceInformation, wherein the extracted timbre information of the sound source is derived from the sound source appearing in the current music teaching; and step 3: making a feature set of a sound source, and expressing the extracted tone color information of the sound source by using the feature set of the sound source; the feature set of the sound source is represented by TPj ═ { TPji }; and 4, step 4: performing Euclidean distance calculation, and calculating the Euclidean distance between the feature set of the sound source and the feature set of the sound source with different timbres in the template library of the sound source; calculation method of Euclidean distance
D(Ta, TPj) = sqrt( Σi (Tai - TPji)² )
Step 5: determine the timbre of the sound source by selecting the timbre corresponding to the template with the smallest Euclidean distance; the extracted timbre information of the sound source is computed, compared, and matched against the template library so as to quickly judge which sound source the timbre comes from.
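The five steps above can be sketched in Python as follows. This is a minimal illustration assuming equal-length feature sets and a hypothetical two-entry template library; the names and values are not from the original disclosure:

```python
import math

def euclidean_distance(ta, tb):
    # D(Ta, Tb) = sqrt( sum_i (Ta_i - Tb_i)^2 ) for equal-length feature sets
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ta, tb)))

def identify_timbre(ta, template_library):
    # Step 5: select the template with the smallest Euclidean distance
    return min(template_library,
               key=lambda name: euclidean_distance(ta, template_library[name]))

# Hypothetical template library TP with two timbres
tp = {"piano": [0.9, 0.4, 0.1], "harmonica": [0.5, 0.45, 0.3]}
print(identify_timbre([0.88, 0.42, 0.12], tp))  # piano
```

Zero-padding of unequal-length feature sets, described later in the specification, is omitted here for brevity.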
As a further aspect of the present invention, in the method for intelligently identifying music, the template library of the sound source includes feature sets of sound sources with different timbres.
As a further scheme of the invention, in the intelligent music identification method, the template library further comprises a plurality of sub-templates; the sub-templates are used for storing pre-recorded feature sets of sound sources.
As a further aspect of the present invention, in the method for intelligently identifying music, the feature set of the sound source is used to calculate the Euclidean distance; the Euclidean distance is used to determine which template's timbre in the sound source template library matches the timbre of the sound source.
As a further aspect of the present invention, the method for intelligently identifying music includes a timbre feature extraction method comprising extracting timbre feature values of the sound source, the timbre feature values including the sound intensities in the sound source and the durations of the changes in sound intensity; the duration of a change in sound intensity is the length of time the sound intensity takes to change from one level to another.
In a further aspect of the present invention, in the method for intelligently identifying music, the number of elements of the timbre feature value is the same as the number of elements included in the feature set of the sound source.
As a further aspect of the present invention, in the method for intelligently identifying music, the sound intensities of the sound source are, respectively, the maximum sound intensity, the median sound intensity, and the minimum sound intensity in the sound source.
As a further aspect of the present invention, in the method for intelligently identifying music, the durations of the changes in sound intensity comprise the duration from the maximum sound intensity in the sound source to the median sound intensity, and the duration from the median sound intensity to the minimum sound intensity.
An intelligent music recognition device, comprising:
the extraction module is used for extracting timbre features of the music, wherein the timbre feature extraction involves the extracted timbre information of a sound source and a template library of sound sources; the template library is represented by TP, where TP = {TP1, TP2, TP3, ..., TPK} and K is the total number of templates, and the extracted timbre information is derived from a sound source appearing in the current music teaching;
the collecting module is used for making a feature set of the sound source and expressing the extracted timbre information with it; the feature set is represented by TPj = {TPji};
the calculation module is used for calculating the Euclidean distance between the feature set of the sound source and the feature sets of sound sources with different timbres in the template library, where the Euclidean distance is computed as
D(Ta, TPj) = sqrt( Σi (Tai - TPji)² )
and the determining module is used for determining the timbre of the sound source by selecting the timbre corresponding to the template with the smallest Euclidean distance; the extracted timbre information of the sound source is computed, compared, and matched against the template library so as to quickly judge which sound source the timbre comes from.
Compared with the prior art, the invention has the following beneficial effects:
through intelligent timbre recognition, a teacher in the classroom can accurately identify and judge which instrument, or which person, produced a given sound, which greatly improves the quality and appeal of music teaching and the students' interest in learning.
Drawings
FIG. 1 is a schematic diagram illustrating steps of an intelligent music recognition method;
FIG. 2 is a schematic diagram of the receiver operation module of the music intelligent recognition device;
fig. 3 is a schematic diagram of a processor operating module of the intelligent music recognition device.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The embodiment of the invention provides an intelligent music identification method comprising the following steps. Step 1: perform timbre feature extraction on the music, the extraction involving the extracted timbre information of a sound source and a template library of sound sources; the template library is denoted TP = {TP1, TP2, TP3, ..., TPK}, K being the total number of templates. Step 2: extract the timbre information of the sound source, where the extracted timbre information is derived from a sound source appearing in the current music teaching. Step 3: make a feature set of the sound source and express the extracted timbre information with it; the feature set is denoted TPj = {TPji}. Step 4: perform a Euclidean distance calculation between the feature set of the sound source and the feature sets of sound sources with different timbres in the template library, where the Euclidean distance is computed as
D(Ta, TPj) = sqrt( Σi (Tai - TPji)² )
Step 5: determine the timbre of the sound source by selecting the timbre corresponding to the template with the smallest Euclidean distance; the extracted timbre information of the sound source is computed, compared, and matched against the template library so as to quickly judge which sound source the timbre comes from. Specifically, when timbre features are extracted from music, it is mainly the timbre information of a sound source that is extracted, and the template library stores the timbre information of many different sound sources, so the timbre information of a sound source can be obtained clearly and conveniently. In music teaching, however, the person or other sounding body that a timbre comes from often cannot be identified immediately. Because different sound sources have different, extractable timbre features, computing, comparing, and matching those features helps judge which sound source a timbre comes from, thereby identifying the source. Through intelligent timbre recognition, a teacher in the classroom can accurately identify and judge which instrument, or which person, produced a given sound, which greatly improves the quality and appeal of music teaching and the students' interest in learning.
The embodiment of the invention provides another intelligent music identification method, in which the template library of sound sources contains feature sets of a plurality of sound sources with different timbres. Preferably, the template library further contains a plurality of sub-templates used to store pre-recorded feature sets of sound sources. Specifically, because the template library contains many sound sources with different timbres, when the source of a timbre cannot be judged quickly, the corresponding timbre can be found by computation and comparison against the sound sources in the library, revealing which sound source the timbre belongs to. With sub-templates set in the template library, the timbre information of sound sources in the library can be added or removed at will. For example, TP represents the template library, TP = {TP1, TP2, TP3, ..., TPK}, with K the total number of templates. Each template has a pre-stored feature set; for example, the j-th template is TPj = {TPji}, where i is a natural number. In this way, the timbre information of the corresponding sound source can be found conveniently and quickly from the template library after computation, helping teachers and students absorb new material quickly during music teaching and greatly improving teaching efficiency.
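A minimal sketch of such a template library with sub-templates that can be added or removed at will, assuming a simple name-to-feature-set mapping (class and method names are illustrative, not from the disclosure):

```python
class TemplateLibrary:
    """Sketch of a template library TP whose sub-templates (entries) can be
    freely added or removed; each entry holds a pre-stored feature set."""
    def __init__(self):
        self.templates = {}  # sub-template name -> pre-stored feature set {TPji}

    def add(self, name, features):
        self.templates[name] = list(features)

    def remove(self, name):
        self.templates.pop(name, None)  # no error if the entry is absent

lib = TemplateLibrary()
lib.add("violin", [0.7, 0.5, 0.2])
lib.add("flute", [0.3, 0.6, 0.4])
lib.remove("flute")
print(sorted(lib.templates))  # ['violin']
```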
The embodiment of the invention provides another intelligent music identification method, in which the extracted timbre information is represented by a feature set of the sound source; the feature set is used to calculate the Euclidean distance, and the Euclidean distance is used to determine which template's timbre in the sound source template library matches the timbre of the sound source. Specifically, when timbre features are extracted for each sound source, the extracted feature sets may be represented as, for example, Ta = {Ta1, Ta2, ..., TaN} and Tb = {Tb1, Tb2, ..., TbM}; the Euclidean distance between the feature sets Ta and Tb is
D(Ta, Tb) = sqrt( Σi (Tai - Tbi)² )
If N < M, the entries Ta(N+1) through TaM are all filled with zeros; if N > M, the entries Tb(M+1) through TbN are all filled with zeros.
The sound source a timbre comes from can be judged quickly via the Euclidean distance: if A is the sound source whose timbre is to be identified, it suffices to calculate the Euclidean distance between the feature set Ta of sound source A and the feature sets of the sound sources with different timbres in the template library, and to select the timbre corresponding to the template with the smallest Euclidean distance as the timbre of the sound source.
TP represents the template library, TP = {TP1, TP2, TP3, ..., TPK}, and K is the total number of templates. Each template has a pre-stored feature set; for example, the j-th template is TPj = {TPji}, where i is a natural number. The timbre of sound source A is the one achieving Min(D(Ta, {TP1, TP2, TP3, ..., TPK})).
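The zero-padding rule for unequal-length feature sets can be sketched as follows (illustrative Python, assuming list-valued feature sets; the function name is an assumption):

```python
def pad_to_match(ta, tb):
    # If N < M, fill Ta(N+1)..TaM with zeros; if N > M, fill Tb(M+1)..TbN with zeros.
    n = max(len(ta), len(tb))
    return ta + [0.0] * (n - len(ta)), tb + [0.0] * (n - len(tb))

ta, tb = pad_to_match([1.0, 2.0], [3.0, 4.0, 5.0])
print(ta)  # [1.0, 2.0, 0.0]
print(tb)  # [3.0, 4.0, 5.0]
```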
The embodiment of the invention provides another intelligent music identification method comprising a timbre feature extraction method, in which timbre feature values of the sound source are extracted; the timbre feature values comprise the sound intensities in the sound source and the durations of the changes in sound intensity, where the duration of a change is the length of time the sound intensity takes to change from one level to another. Those skilled in the art will appreciate that sound is typically stored in a computer as a sequence of frames. Sound therefore has two dimensions, time and frequency. The time-domain dimension is not considered in the present invention, because the duration of the sound to be recognized may vary: on a piano, one stroke may sustain a note for a long time, while a short tap produces a short note. Moreover, timbre correlates little with sound duration, so only frequency-domain factors are considered here. A sound source is stored in the computer as a series of frames, denoted f(R) = {f1, f2, ..., fR}, where R is the total number of frames. The frame fr with the highest sound intensity is selected from f(R).
First, Fourier transform is performed on f (r)
F(ω) = Σ_{r=0}^{N-1} fr · e^(-j2πωr/N)
Where N is the window width.
Since fr is a real signal, its imaginary part is 0, and at this time
F(ω) = fr · (cos(2πωr/N) - j·sin(2πωr/N)),
that is, F(ω) can be decomposed into a cosine part and a sine part, denoted real and imag respectively:
real(fr) = fr · cos(2πωr/N)
imag(fr) = -fr · sin(2πωr/N)
That is, real (fr) and imag (fr) can be used to represent the characteristic of fr, i.e. fr has two characteristic values, real (fr) and imag (fr), then real (fr) and imag (fr) can also be used as the characteristic value of the timbre of the sound source a, but the choice of the timbre characteristic value has a certain limitation because the timbre is not necessarily connected with the sound intensity. It is noted that the maximum and minimum intensity and the mid-intensity of each sound source at the time of occurrence will differ, and this is therefore a reasonable way of relating sound intensity to timbre. For example, in the case of a piano and a harmonica, the same note C is played, but the instantaneous maximum intensity of sound produced by the piano and the process of attenuation to the minimum intensity after the sound lasts are different from those of the harmonica although the frequencies of the piano and the harmonica are the same, so that the timbre can be distinguished by using a plurality of intensity characteristics, that is, more generally, a plurality of frames can be selected from f (r) { f1, f2 … … fR } to obtain the timbre characteristic value of the sound source.
The embodiment of the invention provides another intelligent music identification method, in which the number of elements of the timbre feature value is the same as the number of elements contained in the feature set of the sound source. Specifically, each timbre feature set in the sound source template library TP contains the same number of elements, so the same formula
D(Ta, TPj) = sqrt( Σi (Tai - TPji)² )
can be used to calculate the Euclidean distance between the sound source and the template library TP; taking Min(D(Ta, TP)) then yields the timbre closest to that of the sound source.
The embodiment of the invention provides another intelligent music identification method, in which the sound intensities of the sound source are, respectively, the maximum sound intensity, the median sound intensity, and the minimum sound intensity in the sound source. Specifically, in the most preferred embodiment of the present invention, the timbre feature value is a quintuple Fet().
The timbre characteristics of sound source A are:
Fet(A) = (Max{f1, f2, ..., fR}, Mid{f1, f2, ..., fR}, Min{f1, f2, ..., fR}, TMax_Mid, TMid_Min)
where Max{f1, f2, ..., fR} represents the frame with the largest sound intensity, Mid{f1, f2, ..., fR} represents the frame with the median sound intensity, and Min{f1, f2, ..., fR} represents the frame with the smallest sound intensity.
The embodiment of the invention provides another intelligent music identification method, in which the durations of the changes in sound intensity comprise the duration from the maximum sound intensity in the sound source to the median sound intensity, and the duration from the median sound intensity to the minimum sound intensity. Specifically, TMax_Mid represents the time from the frame with the highest sound intensity to the frame with the median sound intensity, and TMid_Min represents the time from the frame with the median sound intensity to the frame with the lowest sound intensity.
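Assuming frame intensities are given as a list, the quintuple Fet(A) can be sketched as follows; the function name and the seconds-per-frame value are illustrative assumptions:

```python
def timbre_quintuple(frames, frame_period=0.01):
    # Fet(A) = (Max, Mid, Min, TMax_Mid, TMid_Min); frame_period is an assumed
    # illustrative value (seconds per frame).
    ordered = sorted(range(len(frames)), key=lambda r: frames[r])
    i_min, i_mid, i_max = ordered[0], ordered[len(ordered) // 2], ordered[-1]
    return (frames[i_max], frames[i_mid], frames[i_min],
            abs(i_mid - i_max) * frame_period,   # TMax_Mid: loudest -> median frame
            abs(i_min - i_mid) * frame_period)   # TMid_Min: median -> quietest frame

fet = timbre_quintuple([0.2, 0.9, 0.6, 0.3, 0.1])
print(fet)
```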
An intelligent music recognition device, comprising:
the extraction module is used for extracting timbre features of the music, wherein the timbre feature extraction involves the extracted timbre information of a sound source and a template library of sound sources; the template library is represented by TP, where TP = {TP1, TP2, TP3, ..., TPK} and K is the total number of templates, and the extracted timbre information is derived from a sound source appearing in the current music teaching;
the collecting module is used for making a feature set of the sound source and expressing the extracted timbre information with it; the feature set is represented by TPj = {TPji};
the calculation module is used for calculating the Euclidean distance between the feature set of the sound source and the feature sets of sound sources with different timbres in the template library, where the Euclidean distance is computed as
D(Ta, TPj) = sqrt( Σi (Tai - TPji)² )
and the determining module is used for determining the timbre of the sound source by selecting the timbre corresponding to the template with the smallest Euclidean distance; the extracted timbre information of the sound source is computed, compared, and matched against the template library so as to quickly judge which sound source the timbre comes from.
Specifically, when a sound source is present, the receiver is switched on, and its capture module rapidly captures key characteristics of the sound source, such as timbre, wavelength, and frequency, which the transmission module then passes on to the processor. The invention can immediately capture a sound source that the user wants to identify but cannot place, improving the intelligence of the system and the interest of learners.
The embodiment of the invention provides another intelligent music identification device further comprising a processor for handling the data transmitted from the receiver. The processor comprises a receiving module, a storage module, an extraction module, a calculation module, an analysis module, and an output module. The receiving module is used for receiving the captured sound-source information transmitted by the receiver; the storage module is used for storing the sound-source information received by the receiving module; the extraction module is used for extracting the sound source currently held in the storage module; the calculation module is used for calculating the Euclidean distance between the extracted sound source and the sound sources in the template library; the analysis module is used for finding the minimum Euclidean distance between the extracted sound source and the sound sources in the template library; and the output module is used for outputting the timbre of the corresponding sound source obtained by the analysis module.
Specifically, the receiving module receives the timbre information of the sound source transmitted from the receiver, and this information is stored in the storage module. The extraction module in the processor then extracts the timbre information of the sound source from the storage module, and the calculation module computes the Euclidean distance between its timbre feature set and each sound-source feature set stored in the template library. Once the Euclidean distances to all sound sources in the template library have been calculated, the analysis module finds the sound source in the template library with the minimum Euclidean distance and judges its timbre to be the timbre of the sound source; finally, the output module outputs the result.
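The processor pipeline just described (receive, store, extract, calculate, analyse, output) can be sketched as follows; the class and module names are illustrative assumptions, not from the disclosure:

```python
import math

class Processor:
    """Sketch of the processor pipeline: receive -> store -> compute Euclidean
    distances against the template library -> analyse (minimum) -> output."""
    def __init__(self, template_library):
        self.templates = template_library   # template library TP
        self.storage = None                 # storage module

    def receive(self, features):
        # receiving module: accept captured sound-source features from the receiver
        self.storage = list(features)

    def output(self):
        # calculation module: Euclidean distance to each template's feature set
        dist = {name: math.sqrt(sum((a - b) ** 2
                                    for a, b in zip(self.storage, feats)))
                for name, feats in self.templates.items()}
        # analysis + output modules: timbre with the minimum distance
        return min(dist, key=dist.get)

p = Processor({"piano": [0.9, 0.1], "drum": [0.2, 0.8]})
p.receive([0.85, 0.15])
print(p.output())  # piano
```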
Although several embodiments and examples of the present invention have been described for those skilled in the art, these embodiments and examples are presented as examples and are not intended to limit the scope of the invention. These new embodiments can be implemented in other various ways, and various omissions, substitutions, and changes can be made without departing from the spirit of the invention. These embodiments and modifications thereof are included in the scope and gist of the invention, and are included in the invention described in the claims and the equivalent scope thereof.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment contains only a single technical solution; the description is presented this way only for clarity, and those skilled in the art should read the description as a whole, combining the embodiments as appropriate to form further embodiments.

Claims (9)

1. An intelligent music identification method is characterized by comprising the following steps:
Step 1: extracting the timbre characteristics of the music, wherein the timbre characteristic extraction comprises the extracted timbre information of a sound source and a template library of sound sources; the sound source template library is represented by TP, where TP = {TP1, TP2, TP3, ..., TPK} and K is the total number of templates;
Step 2: extracting the timbre information of the sound source, wherein the extracted timbre information is derived from the sound sources appearing in the current music teaching;
Step 3: preparing a feature set of the sound source, and expressing the extracted timbre information of the sound source by the feature set, wherein the feature set of the sound source is expressed as TPj = {TPji};
Step 4: performing Euclidean distance calculation, namely calculating the Euclidean distance between the feature set of the sound source and the feature sets of the sound sources with different timbres in the template library of the sound source; the Euclidean distance is calculated as
d(X, TPj) = √( Σi (xi − TPji)² ),
where X = {xi} is the feature set of the sound source to be identified and TPj = {TPji} is the j-th template in the library;
Step 5: determining the timbre of the sound source by selecting the timbre corresponding to the template with the minimum Euclidean distance as the timbre of the sound source, so that the extracted timbre information of the sound source is calculated, compared, and matched against the template library of the sound source to quickly judge which sound source the timbre comes from.
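The five steps of claim 1 can be sketched end to end as follows. The template contents, the number of feature elements, and the variable names are illustrative assumptions; the claim specifies only the TP = {TP1, ..., TPK} notation and the minimum-distance rule.

```python
import math

# Step 1: template library TP = {TP1, TP2, ..., TPK}, here with K = 3
# illustrative templates of equal dimensionality.
TP = [
    [1.0, 0.1, 0.30],  # TP1
    [0.4, 0.6, 0.10],  # TP2
    [0.7, 0.3, 0.50],  # TP3
]

# Steps 2-3: the timbre information of the current sound source, expressed
# as a feature set (values chosen arbitrarily for the sketch).
source = [0.42, 0.58, 0.12]

# Step 4: Euclidean distance between the source feature set and each template.
distances = [
    math.sqrt(sum((s - t) ** 2 for s, t in zip(source, TPj)))
    for TPj in TP
]

# Step 5: the timbre of the template with the minimum distance is selected.
best_index = distances.index(min(distances))
print(best_index)  # index of the best-matching template
```

With these values the source lies closest to TP2, so `best_index` is 1; in a real system that index would be mapped back to a timbre label.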
2. An intelligent music recognition method according to claim 1, wherein the sound source template library comprises feature sets of sound sources with different timbres.
3. An intelligent music recognition method according to claim 2, wherein the template library further comprises a plurality of sub-templates; the sub-templates are used to store the pre-stored feature sets of the sound sources.
4. The intelligent music recognition method according to claim 3, wherein the feature set of the sound source is used to calculate the Euclidean distance, and the Euclidean distance is used to determine whether the timbre corresponding to a template in the template library matches the timbre of the sound source.
5. The intelligent music identification method according to claim 4, further comprising a timbre feature extraction method, wherein the timbre feature extraction comprises extracting timbre feature values of the sound source, the timbre feature values comprising the sound intensity of the sound source and the duration of the change in that sound intensity; the duration of the change in sound intensity is the length of time taken for the sound intensity of the sound source to change.
6. The intelligent music recognition method according to claim 5, wherein the number of timbre feature values is the same as the number of elements in the feature set of the sound source.
7. The intelligent music recognition method according to claim 6, wherein the sound intensity of the sound source comprises the maximum sound intensity of the sound source and the minimum sound intensity of the sound source.
8. The intelligent music recognition method according to claim 7, wherein the duration of the change in the sound intensity of the sound source includes the duration from the minimum sound intensity in the sound source to the maximum sound intensity, and the duration from the maximum sound intensity to the minimum sound intensity.
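The timbre feature values described in claims 5-8 (maximum and minimum sound intensity, plus the durations of the intensity changes between them) can be sketched from an intensity envelope. The envelope values, the frame rate, and the function name are all illustrative assumptions; the claims do not specify how the envelope is obtained.

```python
def timbre_feature_values(envelope, frame_rate):
    """Build a feature set [max, min, rise_time, fall_time] from an intensity envelope.

    envelope   -- list of per-frame sound intensity values (an assumption)
    frame_rate -- frames per second
    """
    i_max = envelope.index(max(envelope))
    # Search for the minimum after the peak, so "maximum -> minimum"
    # is a forward duration, as claim 8 describes.
    tail = envelope[i_max:]
    i_min_after = i_max + tail.index(min(tail))
    rise = i_max / frame_rate                  # onset up to the maximum intensity
    fall = (i_min_after - i_max) / frame_rate  # maximum down to the minimum intensity
    return [max(envelope), min(tail), rise, fall]

# A short synthetic envelope: attack, peak, decay.
env = [0.05, 0.3, 0.8, 1.0, 0.6, 0.2, 0.02]
print(timbre_feature_values(env, frame_rate=100))
```

Per claim 6, the four values returned here would form the four elements of the sound source's feature set that the Euclidean distance is computed over.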
9. An intelligent music recognition device, comprising:
the extraction module is used for extracting the timbre characteristics of the music, wherein the timbre characteristic extraction comprises the extracted timbre information of a sound source and a template library of sound sources; the template library of the sound source is represented by TP, where TP = {TP1, TP2, TP3, ..., TPK} and K is the total number of templates, and the extracted timbre information of the sound source is derived from the sound sources appearing in the current music teaching;
the collecting module is used for preparing a feature set of the sound source and expressing the extracted timbre information of the sound source by the feature set, wherein the feature set of the sound source is expressed as TPj = {TPji};
the computing module is used for calculating Euclidean distances, namely the Euclidean distances between the feature set of the sound source and the feature sets of the sound sources with different timbres in the template library of the sound source; the Euclidean distance is calculated as
d(X, TPj) = √( Σi (xi − TPji)² ),
where X = {xi} is the feature set of the sound source to be identified and TPj = {TPji} is the j-th template in the library;
the determining module is used for determining the timbre of the sound source by selecting the timbre corresponding to the template with the minimum Euclidean distance as the timbre of the sound source, so that the extracted timbre information of the sound source is calculated, compared, and matched against the template library of the sound source to quickly judge which sound source the timbre comes from.
CN202111008903.0A 2021-08-31 2021-08-31 Intelligent music identification method and device Pending CN113823268A (en)

Publications (1)

Publication Number Publication Date
CN113823268A 2021-12-21

Family

ID=78923549


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05181464A (en) * 1991-12-27 1993-07-23 Sony Corp Musical sound recognition device
JPH10319948A (en) * 1997-05-15 1998-12-04 Nippon Telegr & Teleph Corp <Ntt> Sound source kind discriminating method of musical instrument included in musical playing
KR20060102757A (en) * 2005-03-24 2006-09-28 김재천 Method for classifying music genre using a classification algorithm
KR20110071665A (en) * 2009-12-21 2011-06-29 세종대학교산학협력단 System and method for recognizing instrument to classify signal source
JP2013015601A (en) * 2011-07-01 2013-01-24 Dainippon Printing Co Ltd Sound source identification apparatus and information processing apparatus interlocked with sound source
CN111883106A (en) * 2020-07-27 2020-11-03 腾讯音乐娱乐科技(深圳)有限公司 Audio processing method and device
KR20200137329A (en) * 2019-05-29 2020-12-09 주식회사 시스원 Learning method and testing method for figuring out and classifying musical instrument used in certain audio, and learning device and testing device using the same
CN113096620A (en) * 2021-03-24 2021-07-09 妙音音乐科技(武汉)有限公司 Musical instrument tone color identification method, system, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination