CN107481582A

CN107481582A - A kind of vocality study electron assistant articulatory system

Info

Publication number: CN107481582A
Application number: CN201710768301.2A
Authority: CN
Inventors: 赵毅
Original assignee: Xuchang University
Current assignee: Xuchang University
Priority date: 2017-08-31
Filing date: 2017-08-31
Publication date: 2017-12-15

Abstract

The invention discloses a kind of vocality study electron assistant articulatory system, it is related to musicology teaching field.The vocality study electron assistant articulatory system includes：Audio collection module, first buffer module, reference audio memory module, second buffer module, first time-frequency convert module, second time-frequency convert module, first beat extraction module, second beat extraction module, audio similarity detection module, tone color contrast module, beat contrast module, comprehensive assessment module, input module, control module, display module and module for reading and writing, the collection of the data in whole pronunciation situation detection process can be completed by the present invention, conversion and assessment, and assess from accuracy in pitch, tone color and the aspect of beat three are assessed, so that student grasps oneself when practising vocal music skill in time, the quality of oneself tone color quality, whether beat aligns, whether accuracy in pitch is accurate, and then student is caused pointedly to carry out vocal music exercise, so as to improve the effect of exercise.

Description

A kind of vocality study electron assistant articulatory system

Technical field

The present invention relates to musicology teaching field, more particularly relates to a kind of vocality study electron assistant articulatory system.

Background technology

Vocal music includes：Bel canto, the folk styled singing and popular singing style, it is now Chinese aboriginal music vocals occur again.Vocal music The musical form performed by one or more singers, using musical accompaniment whether all can, and voice is the emphasis of works.For For vocal music, it may is that a kind of most ancient musical form, because not needing any musical instrument, only voice.Various All there is some form of vocal music in music culture.Though language and music purpose are different, all have common pitch and rhythm because Element.

Vocal music exercise in, breathing be vocal music exercise important component, be vocal music basis and power, musical respiration The needs of vocals are submitted to, are separated according to the requirement of melody strong, weak, anxious, slow, make breath during performance fuller, more It is happy, song is had more expressive force.

But in existing vocal music exercise equipment, although can be that practitioner beat beat, appearance etc., nothing are sung in recording, check and correction Method allows student to understand oneself when practising vocal music skill, and whether the quality of oneself tone color quality, beat align, and whether accuracy in pitch is accurate Really, and then cause student can not pointedly carry out vocal music exercise, cause the effect of exercise poor.

The content of the invention

The embodiment of the present invention provides a kind of vocality study electron assistant articulatory system, to solve not allowing in the prior art Student understands oneself when practising vocal music skill, and whether the quality of oneself tone color quality, beat align, and whether accuracy in pitch is accurate, enters And cause student can not pointedly carry out vocal music exercise, cause exercise effect it is poor the problem of.

The embodiment of the present invention provides a kind of vocality study electron assistant articulatory system, including：Audio collection module, first are delayed Die block, reference audio memory module, the second buffer module, the first time-frequency convert module, the second time-frequency convert module, first segment Clap extraction module, the second beat extraction module, audio similarity detection module, tone color contrast module, beat contrast module, synthesis Evaluation module, input module, control module, display module and module for reading and writing；

The input module, for inputting read write command, acquisition instructions, the first conversion instruction, the first extraction instruction, second Conversion instruction, the second extraction instruct, tone color contrast instructs, beat contrasts instruction, detect instruction, assessment instructs and idsplay order；

The control module, for sending read write command to the module for reading and writing；Send and adopt to the audio collection module Collection instruction；The first conversion instruction is sent to the first time-frequency convert module；First is sent to the first beat extraction module Extraction instruction；The second conversion instruction is sent to the second time-frequency convert module；The is sent to the second beat extraction module Two extraction instructions；Detection instruction is sent to the audio similarity detection module；Beat pair is sent to the beat contrast module Than instruction；Tone color contrast instruction is sent to the tone color contrast module；Sent to the comprehensive assessment module and assess instruction；To institute State display module and send idsplay order；

The audio collection module, after receiving acquisition instructions, gather the sounding audio-frequency signal of practitioner, and by sound Frequency signal quantifies to be converted into audio time domain data by noise reduction post-sampling, and by the buffer module of audio time domain data storage first；

The reference audio memory module, the reference audio time domain data practised for storing practitioner；

The module for reading and writing, after receiving read write command, when reading the reference audio that current exercise person is practised Numeric field data, and the reference audio time domain data is stored in the second buffer module；

The first time-frequency convert module, after receiving the first conversion instruction, transfer the sound in the first buffer module Frequency time domain data, and the audio time domain data are changed into audio frequency domain data by Fast Fourier Transform (FFT)；

The second time-frequency convert module, after receiving the second conversion instruction, transfer the ginseng in the second buffer module Audio time domain data are examined, and the reference audio time domain data is changed into reference audio frequency domain number by Fast Fourier Transform (FFT) According to；

The first beat extraction module, after receiving the first extraction instruction, transfer the sound in the first buffer module Frequency time domain data, and audio time domain data are subjected to Gassian low-pass filter and obtain the envelope data of audio time domain data, and pass through The peak value of detected envelope data determines beat sequence；

The second beat extraction module, after receiving the second extraction instruction, transfer the ginseng in the second buffer module Audio time domain data are examined, and reference audio time domain data is carried out to the reference of Gassian low-pass filter acquisition reference audio time domain data Envelope data, and determine to refer to beat sequence by detecting the peak value of reference envelope data；

The audio similarity detection module, after receiving detection instruction, by audio time domain data and reference audio Time domain data is normalized respectively, obtains normalizing audio time domain data and normalizes reference audio time domain data, and Normalization audio time domain data are passed through into MFCC algorithms with normalization reference audio time domain data respectively and obtain normalization characteristic sequence Row and normalization fixed reference feature sequence, and calculate normalization characteristic sequence and normalize the likelihood ratio of fixed reference feature sequence；

The tone color contrast module, after receiving tone color contrast instruction, by audio frequency domain data and reference audio frequency Numeric field data is normalized respectively, obtains normalizing audio frequency domain data and normalizes reference audio frequency domain data；For Will normalization audio frequency domain data carry out respectively low frequency filtering, intermediate frequency filtering and High frequency filter obtain the frequent numeric field data of bass, Middle frequency domain audio frequency domain data and high-frequency audio frequency domain data；Will normalization reference audio frequency domain data carry out respectively low frequency filtering, Intermediate frequency filtering and High frequency filter obtain with reference to the frequent numeric field data of bass, the frequency domain audio frequency domain data and with reference to high-frequency audio with reference in Frequency domain data；The frequent numeric field data of bass, middle frequency domain audio frequency domain data and high-frequency audio frequency domain data are carried out integrating respectively To low frequency audio power, intermediate frequency audio power and high-frequency audio energy；The frequent numeric field data of bass, the frequency domain audio with reference in will be referred to Frequency domain data and reference high-frequency audio frequency domain data are integrated to obtain with reference to low frequency audio power, the frequency domain audio energy with reference in respectively Measure and refer to high-frequency audio energy；For calculating bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value；

The beat contrast module, after receiving beat contrast instruction, calculate beat sequence and with reference to beat sequence Likelihood ratio；

The comprehensive assessment module, after receiving assessment instruction, referred to according to normalization characteristic sequence and normalization The likelihood ratio of characteristic sequence determines accuracy in pitch similarity；Determine that beat is accurate according to beat sequence and with reference to the likelihood ratio of beat sequence Degree；Bass tones quality, middle pitch are determined according to bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value respectively Tone color quality and high pitch tone color quality.

Preferably, the display module, after receiving control instruction, display accuracy in pitch similarity, the beat degree of accuracy and Bass tones quality, middle pitch tone color quality and high pitch tone color quality.

Preferably, the comprehensive assessment module is according to normalization characteristic sequence and the likelihood ratio of normalization fixed reference feature sequence Determine that accuracy in pitch similarity includes：

The comprehensive assessment module is specified in data area from multiple first, obtains normalization characteristic sequence and normalization is joined First where examining the likelihood ratio of characteristic sequence specifies data area, and first based on acquisition specifies data area, from storage In corresponding relation between data area and accuracy in pitch similarity, accuracy in pitch similarity corresponding to acquisition.

Preferably, the comprehensive assessment module determines that beat is accurate according to beat sequence and with reference to the likelihood ratio of beat sequence Degree includes：

The comprehensive assessment module is specified in data area from multiple second, obtains beat sequence and with reference to beat sequence Second where likelihood ratio specifies data area, and second based on acquisition specifies data area, from the data area and section of storage Clap in the corresponding relation between the degree of accuracy, the beat degree of accuracy corresponding to acquisition.

Preferably, the comprehensive assessment module determines that bass tones quality includes according to bass frequency difference value：

The comprehensive assessment module is specified in data area from multiple three, obtains the 3rd where bass frequency difference value Data area is specified, the 3rd based on acquisition specifies data area, between the data area and bass tones quality of storage In corresponding relation, bass tones quality corresponding to acquisition.

The embodiment of the present invention, there is provided a kind of vocality study electron assistant articulatory system includes：Audio collection module, first Buffer module, reference audio memory module, the second buffer module, the first time-frequency convert module, the second time-frequency convert module, first It is beat extraction module, the second beat extraction module, audio similarity detection module, tone color contrast module, beat contrast module, comprehensive Evaluation module, input module, control module, display module and module for reading and writing are closed, can complete entirely to send out by the system of the present invention Assessment in sound situation detection process, and assess and assessed in terms of accuracy in pitch, tone color and beat three so that student grasps in time , when practising vocal music skill, whether the quality of oneself tone color quality, beat align, and whether accuracy in pitch is accurate for oneself, and then to learn Life can pointedly carry out vocal music exercise, so as to improve the effect of exercise.And invention is by calculating beat sequence and reference node The likelihood ratio of sequence is clapped, bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value is calculated and calculates and return The mode of the likelihood ratio of one change characteristic sequence and normalization fixed reference feature sequence carries out the assessment of beat, tone color and accuracy in pitch, assesses Accuracy rate is high.

Brief description of the drawings

In order to illustrate more clearly about the embodiment of the present invention or technical scheme of the prior art, below will be to embodiment or existing There is the required accompanying drawing used in technology description to be briefly described, it should be apparent that, drawings in the following description are only this Some embodiments of invention, for those of ordinary skill in the art, on the premise of not paying creative work, can be with Other accompanying drawings are obtained according to these accompanying drawings.

Fig. 1 is a kind of block diagram of vocality study electron assistant articulatory system provided in an embodiment of the present invention.

Description of reference numerals：

1st, audio collection module；2nd, the first buffer module；3rd, reference audio memory module；4th, the second buffer module；5th, One time-frequency modular converter；6th, the second time-frequency convert module；7th, the first beat extraction module；8th, the second beat extraction module；9th, sound Frequency similarity detection module；10th, tone color contrast module；11st, beat contrast module；12nd, comprehensive assessment module；13rd, input module； 14th, control module；15th, module for reading and writing；16th, display module.

Embodiment

Below in conjunction with the accompanying drawing in the embodiment of the present invention, the technical scheme in the embodiment of the present invention is carried out clear, complete Site preparation describes, it is clear that described embodiment is only part of the embodiment of the present invention, rather than whole embodiments.It is based on Embodiment in the present invention, those of ordinary skill in the art are obtained every other under the premise of creative work is not made Embodiment, belong to the scope of protection of the invention.

The frame for showing a kind of vocality study electron assistant articulatory system provided in an embodiment of the present invention exemplary Fig. 1 Figure, the vocality study electron assistant articulatory system include audio collection module 1, the first buffer module 2, reference audio memory module 3rd, the second buffer module 4, the first time-frequency convert module 5, the second time-frequency convert module 6, the first beat extraction module 7, second section Clap extraction module 8, audio similarity detection module 9, tone color contrast module 10, beat contrast module 11, comprehensive assessment module 12, Input module 13, display module 16, control module 14 and module for reading and writing 15.

Specifically, the input module 13, for input read write command, acquisition instructions, the first conversion instruction, first extraction refer to Make, the second conversion instruction, second extraction instruction, tone color contrast instruction, beat contrast instruction, detection instruction and assess instruction；Should Control module 14, for sending read write command to the module for reading and writing；Acquisition instructions are sent to the audio collection module；To this first Time-frequency convert module sends the first conversion instruction；The first extraction instruction is sent to the first beat extraction module；To this second when Frequency modular converter sends the second conversion instruction；The second extraction instruction is sent to the second beat extraction module；It is similar to the audio Spend detection module and send detection instruction；Beat contrast instruction is sent to the beat contrast module；Sent to the tone color contrast module Tone color contrast instruction；Sent to the comprehensive assessment module and assess instruction.

Specifically, the audio collection module 1, after receiving acquisition instructions, the sounding audio-frequency signal of practitioner is gathered, And audio signal is converted into audio time domain data by the quantization of noise reduction post-sampling, and audio time domain data storage first is buffered Module 2.

The reference audio memory module 3, the reference audio time domain data practised for storing practitioner.

The module for reading and writing 15, after receiving read write command, when reading the reference audio that current exercise person is practised Numeric field data, and the reference audio time domain data is stored in the second buffer module 4.

Specifically, the first time-frequency convert module 5, after receiving the first conversion instruction, the first buffer module 2 is transferred In audio time domain data, and the audio time domain data are changed into audio frequency domain data by Fast Fourier Transform (FFT)；This Two time-frequency convert modules 6, after receiving the second conversion instruction, transfer the reference audio time domain number in the second buffer module 4 According to, and the reference audio time domain data is changed into reference audio frequency domain data by Fast Fourier Transform (FFT).

Specifically, the first beat extraction module 7, after receiving the first extraction instruction, the first buffer module 2 is transferred In audio time domain data, and by audio time domain data carry out Gassian low-pass filter obtain audio time domain data envelope data, And beat sequence is determined by the peak value of detected envelope data.

Specifically, the second beat extraction module 8, after receiving the second extraction instruction, the second buffer module 4 is transferred In reference audio time domain data, and by reference audio time domain data carry out Gassian low-pass filter obtain reference audio time domain data Reference envelope data, and by detect reference envelope data peak value determine refer to beat sequence.

Wherein, the first beat extraction module 7 determines that beat sequence is specially by the peak value of detected envelope data：First Beat extraction module 7 determining that the numerical value in beat sequence is 1, when can't detect peak value, determining beat when detecting peak value Numerical value in sequence is -1, that is, beat sequence is by a string 1 and -1 sequence formed.

Further, control module 14 sends detection instruction, audio similarity inspection to the audio similarity detection module 9 Module is surveyed, after receiving detection instruction, first transfers audio time domain data from the first buffer module 2, from the second buffering mould Reference audio time domain data is transferred in block 4, then audio time domain data and reference audio time domain data are normalized respectively Processing, obtain normalize audio time domain data and normalize reference audio time domain data, and will normalization audio time domain data and Normalize reference audio time domain data and normalization characteristic sequence and normalization fixed reference feature sequence obtained by MFCC algorithms respectively, And calculate normalization characteristic sequence and normalize the likelihood ratio of fixed reference feature sequence.

Specifically, tone color contrast module 10, after receiving tone color contrast instruction, by audio frequency domain data and reference Audio frequency domain data is normalized respectively, obtains normalizing audio frequency domain data and normalizes reference audio frequency domain number According to；For carrying out low frequency filtering, intermediate frequency filtering and High frequency filter respectively to obtain bass frequent normalization audio frequency domain data Numeric field data, middle frequency domain audio frequency domain data and high-frequency audio frequency domain data；Normalization reference audio frequency domain data is carried out respectively low Frequency filtering, intermediate frequency filtering and High frequency filter are obtained with reference to the frequent numeric field data of bass, frequency domain audio frequency domain data and reference with reference in High-frequency audio frequency domain data；The frequent numeric field data of bass, middle frequency domain audio frequency domain data and high-frequency audio frequency domain data are entered respectively Row integration obtains low frequency audio power, intermediate frequency audio power and high-frequency audio energy；The frequent numeric field data of bass, reference will be referred to Middle frequency domain audio frequency domain data and reference high-frequency audio frequency domain data are integrated to obtain with reference to low frequency audio power, with reference in respectively Frequency domain audio energy and reference high-frequency audio energy；For calculating bass frequency difference value, middle frequency domain audio difference value and high frequency audio Frequency difference value.

Wherein, tone color contrast module 10 calculates bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value Including：Bass frequency difference value is determined by low frequency audio power and the ratio with reference to low frequency audio power；Pass through middle frequency domain audio Frequency domain audio difference value during energy determines with the ratio with reference to intermediate frequency audio power；By high-frequency audio energy and refer to high-frequency audio The ratio of energy determines high-frequency audio difference value.

In addition, will normalization audio frequency domain data it is used when carrying out low frequency filtering, intermediate frequency filtering and High frequency filter respectively Wave filter is respectively：Gaussian filter, Butterworth filter and Chebyshev filter.

Furthermore the scope in the frequent domain of bass is 40Hz~160Hz, the scope of middle frequency domain audio frequency domain for 160Hz~ 1280Hz；The scope of high-frequency audio frequency domain is 1280Hz~5120Hz.

Specifically, the beat contrast module 11, after receiving beat contrast instruction, beat sequence and reference node are calculated Clap the likelihood ratio of sequence.

Specifically, comprehensive assessment module 12, after receiving assessment instruction, according to normalization characteristic sequence and normalizing The likelihood ratio for changing fixed reference feature sequence determines accuracy in pitch similarity；Section is determined according to beat sequence and with reference to the likelihood ratio of beat sequence Clap the degree of accuracy；Bass tones matter is determined according to bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value respectively Amount, middle pitch tone color quality and high pitch tone color quality.

Wherein, the comprehensive assessment module 12 is true according to the likelihood ratio of normalization characteristic sequence and normalization fixed reference feature sequence The quasi- similarity of accordatura includes：

The comprehensive assessment module 12 is specified in data area from multiple first, obtains normalization characteristic sequence and normalization is joined First where examining the likelihood ratio of characteristic sequence specifies data area, and first based on acquisition specifies data area, from storage In corresponding relation between data area and accuracy in pitch similarity, accuracy in pitch similarity corresponding to acquisition.

In addition, the plurality of first specifies data area to set in advance, the data model stored in comprehensive assessment module 12 Enclose the corresponding relation between accuracy in pitch similarity to set in advance, and during setting, this can be passed through by user The control interface of input module 13 is configured, and is commented it is of course also possible to be sent to the synthesis after being configured by user terminal Estimate module 12, the embodiment of the present invention is not specifically limited to this.

Such as the likelihood of the normalization characteristic sequence that gets of the comprehensive assessment module 12 and normalization fixed reference feature sequence Than for 0.7, the plurality of first to specify data area be the comprehensive assessment module 12 between the data area and accuracy in pitch similarity Corresponding relation in acquire, and the corresponding relation between the data area and accuracy in pitch similarity can with as shown in table 1 below, Be, the plurality of first specify data area can be 0~0.6,0.6~0.8, more than 0.8.Obtained from such as table 1 below corresponding Accuracy in pitch similarity to be substantially similar.

Table 1

Accuracy in pitch similarity	It is closely similar	Substantially it is similar	It is dissimilar
				Data area	More than 0.8	0.6~0.8	0~0.6

It should be noted that in embodiments of the present invention, only with the data area shown in above-mentioned table 1 and accuracy in pitch similarity it Between corresponding relation exemplified by illustrate, above-mentioned table 1 not to the embodiment of the present invention form limit.

Further, the comprehensive assessment module determines that beat is accurate according to beat sequence and with reference to the likelihood ratio of beat sequence Degree includes：

The comprehensive assessment module is specified in data area from multiple second, obtains beat sequence and reference beat sequence seemingly So second where ratio specifies data area, and second based on acquisition specifies data area, from the data area and beat of storage In corresponding relation between the degree of accuracy, the beat degree of accuracy corresponding to acquisition.

Wherein, the plurality of second data area is specified to set in advance, the data model stored in comprehensive assessment module 12 Enclose the corresponding relation between the beat degree of accuracy to set in advance, and during setting, this can be passed through by user The control interface of input module 13 is configured, and is commented it is of course also possible to be sent to the synthesis after being configured by user terminal Estimate module 12, the embodiment of the present invention is not specifically limited to this.

Such as the beat sequence that gets of the comprehensive assessment module 12 and be 0.5 with reference to the likelihood ratio of beat sequence, this is more Individual second specifies data area as in corresponding relation of the comprehensive assessment module 12 between the data area and the beat degree of accuracy Acquire, and the corresponding relation between the data area and the beat degree of accuracy can be that is to say with as shown in table 1 below, it is the plurality of Second specify data area can be 0~0.6,0.6~0.8, more than 0.8.The beat degree of accuracy corresponding to being obtained from such as table 2 below For inaccuracy.

Table 2

Accuracy in pitch similarity	It is very accurate	Substantially it is accurate	It is inaccurate
				Data area	More than 0.8	0.6~0.8	0~0.6

It should be noted that in embodiments of the present invention, only with the data area shown in above-mentioned table 2 and the beat degree of accuracy it Between corresponding relation exemplified by illustrate, above-mentioned table 1 not to the embodiment of the present invention form limit.

Specifically, the comprehensive assessment module determines that bass tones quality includes according to bass frequency difference value：

The comprehensive assessment module is specified in data area from multiple three, and the 3rd where obtaining bass frequency difference value refers to Determine data area, the 3rd based on acquisition specifies data area, pair between the data area and bass tones quality of storage In should being related to, bass tones quality corresponding to acquisition.

Wherein, the plurality of 3rd data area is specified to set in advance, the data model stored in comprehensive assessment module 12 Enclose the corresponding relation between bass tones quality to set in advance, and during setting, can be passed through by user The control interface of the input module 13 is configured, it is of course also possible to be sent to the synthesis after being configured by user terminal Evaluation module 12, the embodiment of the present invention is not specifically limited to this.

For example the bass frequency difference value that the comprehensive assessment module 12 is got is 1, the plurality of 3rd specifies data area To be acquired in corresponding relation of the comprehensive assessment module 12 between the data area and bass tones quality, and the data Corresponding relation between scope and bass tones quality can be that is to say, the plurality of 3rd specifies data area with as shown in table 3 below Can be 0~0.8,0.8~1.5, more than 1.5.Bass tones quality corresponding to being obtained from such as table 3 below is that bass is balanced.

Table 3

It should be noted that in embodiments of the present invention, only with the data area shown in above-mentioned table 3 and bass tones quality Between corresponding relation exemplified by illustrate, above-mentioned table 1 not to the embodiment of the present invention form limit.

In addition, comprehensive assessment module frequency domain audio difference value in determines middle pitch tone color quality, according to high frequency audio frequency difference Different value determines that the method for high pitch tone color quality determines that the method for bass tones quality is similar with according to bass frequency difference value, this hair This is no longer going to repeat them for bright embodiment.

Specifically, display module 16, after the control instruction for receiving control module transmission, display accuracy in pitch similarity, The beat degree of accuracy and bass tones quality, middle pitch tone color quality and high pitch tone color quality.

The embodiment of the present invention, there is provided a kind of vocality study electron assistant articulatory system includes：Audio collection module, first Buffer module, reference audio memory module, the second buffer module, the first time-frequency convert module, the second time-frequency convert module, first It is beat extraction module, the second beat extraction module, audio similarity detection module, tone color contrast module, beat contrast module, comprehensive Evaluation module, input module, control module, display module and module for reading and writing are closed, can be completed in the system by the present invention whole Assessment in pronunciation situation detection process, and assess and assessed in terms of accuracy in pitch, tone color and beat three so that student slaps in time Oneself is held when practising vocal music skill, whether the quality of oneself tone color quality, beat align, and whether accuracy in pitch is accurate, and then cause Student can pointedly carry out vocal music exercise, so as to improve the effect of exercise.And invention is by calculating beat sequence and reference The likelihood ratio of beat sequence, calculate bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value and calculating The mode of the likelihood ratio of normalization characteristic sequence and normalization fixed reference feature sequence carries out the assessment of beat, tone color and accuracy in pitch, comments Estimate accuracy rate height.

Although preferred embodiments of the present invention have been described, but those skilled in the art once know basic creation Property concept, then can make other change and modification to these embodiments.So appended claims be intended to be construed to include it is excellent Select embodiment and fall into having altered and changing for the scope of the invention.

Obviously, those skilled in the art can carry out the essence of various changes and modification without departing from the present invention to the present invention God and scope.So, if these modifications and variations of the present invention belong to the scope of the claims in the present invention and its equivalent technologies Within, then the present invention is also intended to comprising including these changes and modification.

Claims

A kind of 1. vocality study electron assistant articulatory system, it is characterised in that including：Audio collection module, the first buffer module, Reference audio memory module, the second buffer module, the first time-frequency convert module, the second time-frequency convert module, the extraction of the first beat Module, the second beat extraction module, audio similarity detection module, tone color contrast module, beat contrast module, comprehensive assessment mould Block, input module, control module, display module and module for reading and writing；

The input module, for inputting read write command, acquisition instructions, the first conversion instruction, the first extraction instruction, the second conversion Instruct, the second extraction instructs, tone color contrast instructs, beat contrasts instruction, detect instruction, assessment instructs and idsplay order；

The control module, for sending read write command to the module for reading and writing；Collection is sent to the audio collection module to refer to Order；The first conversion instruction is sent to the first time-frequency convert module；The first extraction is sent to the first beat extraction module Instruction；The second conversion instruction is sent to the second time-frequency convert module；Second is sent to the second beat extraction module to carry Instruction fetch；Detection instruction is sent to the audio similarity detection module；Beat contrast is sent to the beat contrast module to refer to Order；Tone color contrast instruction is sent to the tone color contrast module；Sent to the comprehensive assessment module and assess instruction；Show to described Show that module sends idsplay order；

The audio collection module, after receiving acquisition instructions, the sounding audio-frequency signal of practitioner is gathered, and audio is believed Number quantify to be converted into audio time domain data by noise reduction post-sampling, and by the buffer module of audio time domain data storage first；

The reference audio memory module, the reference audio time domain data practised for storing practitioner；

The module for reading and writing, after receiving read write command, read the reference audio time domain number that current exercise person is practised According to, and the reference audio time domain data is stored in the second buffer module；

The first time-frequency convert module, after receiving the first conversion instruction, when transferring the audio in the first buffer module Numeric field data, and the audio time domain data are changed into audio frequency domain data by Fast Fourier Transform (FFT)；

The second time-frequency convert module, after receiving the second conversion instruction, transfer the reference sound in the second buffer module Frequency time domain data, and the reference audio time domain data is changed into reference audio frequency domain data by Fast Fourier Transform (FFT)；

The first beat extraction module, after receiving the first extraction instruction, when transferring the audio in the first buffer module Numeric field data, and audio time domain data are subjected to Gassian low-pass filter and obtain the envelope data of audio time domain data, and pass through detection The peak value of envelope data determines beat sequence；

The second beat extraction module, after receiving the second extraction instruction, transfer the reference sound in the second buffer module Frequency time domain data, and by reference audio time domain data carry out Gassian low-pass filter obtain reference audio time domain data reference envelope Data, and determine to refer to beat sequence by detecting the peak value of reference envelope data；

The audio similarity detection module, after receiving detection instruction, by audio time domain data and reference audio time domain Data are normalized respectively, obtain normalizing audio time domain data and normalize reference audio time domain data, and will return One change audio time domain data and normalization reference audio time domain data respectively by MFCC algorithms obtain normalization characteristic sequence and Fixed reference feature sequence is normalized, and calculates normalization characteristic sequence and normalizes the likelihood ratio of fixed reference feature sequence；

The tone color contrast module, after receiving tone color contrast instruction, by audio frequency domain data and reference audio frequency domain number According to being normalized respectively, obtain normalizing audio frequency domain data and normalize reference audio frequency domain data；For that will return One change audio frequency domain data carries out low frequency filtering, intermediate frequency filtering and High frequency filter and obtains the frequent numeric field data of bass, intermediate frequency respectively Audio frequency domain data and high-frequency audio frequency domain data；Normalization reference audio frequency domain data is subjected to low frequency filtering, intermediate frequency respectively Filtering and High frequency filter obtain with reference to the frequent numeric field data of bass, with reference in frequency domain audio frequency domain data and refer to high-frequency audio frequency domain Data；The frequent numeric field data of bass, middle frequency domain audio frequency domain data and high-frequency audio frequency domain data are integrated to obtain respectively low Frequency domain audio energy, intermediate frequency audio power and high-frequency audio energy；The frequent numeric field data of bass, the frequency domain audio frequency domain with reference in will be referred to Data and integrated to obtain respectively with reference to low frequency audio power with reference to high-frequency audio frequency domain data, with reference to intermediate frequency audio power and With reference to high-frequency audio energy；For calculating bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value；

The beat contrast module, after receiving beat contrast instruction, calculate beat sequence and reference beat sequence seemingly So ratio；

The comprehensive assessment module, after receiving assessment instruction, according to normalization characteristic sequence and normalization fixed reference feature The likelihood ratio of sequence determines accuracy in pitch similarity；The beat degree of accuracy is determined according to beat sequence and with reference to the likelihood ratio of beat sequence； Bass tones quality, middle pitch sound are determined according to bass frequency difference value, middle frequency domain audio difference value and high-frequency audio difference value respectively Chromaticness amount and high pitch tone color quality.
2. vocality study electron assistant articulatory system as claimed in claim 1, it is characterised in that the comprehensive assessment module root Determine that accuracy in pitch similarity includes according to the likelihood ratio of normalization characteristic sequence and normalization fixed reference feature sequence：

The comprehensive assessment module is specified in data area from multiple first, obtains normalization characteristic sequence and normalization with reference to special First where levying the likelihood ratio of sequence specifies data area, and first based on acquisition specifies data area, from the data of storage In corresponding relation between scope and accuracy in pitch similarity, accuracy in pitch similarity corresponding to acquisition.
3. vocality study electron assistant articulatory system as claimed in claim 1, it is characterised in that the comprehensive assessment module root Determine that the beat degree of accuracy includes according to beat sequence and with reference to the likelihood ratio of beat sequence：

The comprehensive assessment module is specified in data area from multiple second, obtains beat sequence and the likelihood with reference to beat sequence Second than place specifies data area, and second based on acquisition specifies data area, accurate from data area and the beat of storage In corresponding relation between exactness, the beat degree of accuracy corresponding to acquisition.
4. vocality study electron assistant articulatory system as claimed in claim 1, it is characterised in that the comprehensive assessment module root Determine that bass tones quality includes according to bass frequency difference value：

The comprehensive assessment module is specified in data area from multiple three, and the 3rd where obtaining bass frequency difference value specifies Data area, the based on acquisition the 3rd specifies data area, from corresponding between the data area and bass tones quality of storage In relation, bass tones quality corresponding to acquisition.
5. vocality study electron assistant articulatory system as claimed in claim 1, it is characterised in that the display module, be used for After receiving control instruction, display accuracy in pitch similarity, the beat degree of accuracy and bass tones quality, middle pitch tone color quality and high pitch sound Chromaticness amount.