CN109346058A

CN109346058A - Speech acoustic feature expansion system

Info

Publication number: CN109346058A
Application number: CN201811443497.9A
Authority: CN
Inventors: 程冰
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2018-11-29
Filing date: 2018-11-29
Publication date: 2019-02-15

Abstract

The application belongs to the field of sound processing, and in particular relates to a speech acoustic feature expansion system. In language learning, the acoustic features of speech need to be expanded to produce a corpus suited to the learner's brain perception, so as to stimulate the brain. The application provides a speech acoustic feature expansion system comprising a voice acquisition unit connected to an audio processing unit, which is in turn connected to a video editing unit; wherein the voice acquisition unit is used to acquire natural speech; the audio processing unit is used to expand the spectral features of the natural speech to different degrees in order to produce a corpus; and the video editing unit is used to edit the voice video together with the processed speech and synthesize video clips. The system can produce a corpus better suited to brain perception, thereby helping learners form phonetic categories in the brain that are closer to those of native speakers.

Description

A speech acoustic feature expansion system

Technical field

The application belongs to the field of sound processing, and in particular relates to a speech acoustic feature expansion system.

Background

With the rapid development of related fields such as bioengineering, computer science, statistical data processing and brain imaging, brain science research has drawn on the strengths of these disciplines to explore in new ways how brain development interacts with the language learning environment. Studies have shown that after 12 months of age infants gradually lose their sensitivity to non-native speech sounds, which creates obstacles to later foreign language pronunciation learning. A person learning a foreign language tends to perceive the new language through his or her existing speech perception, so foreign sounds similar to those of the mother tongue are picked up relatively quickly, whereas sounds that do not exist in the mother tongue are harder to acquire. However, when learning sounds similar to those of the mother tongue, learners are more easily influenced by the mother tongue and thus develop an accent. For example, the brains of Americans and of Chinese speakers perceive the same English speech sound differently.

Because they are insensitive to non-native speech sounds, learners cannot fully receive the acoustic information of the language in the first place, so correct pronunciation is difficult. Moreover, for every phoneme learned, the learner needs to establish a phonetic category for that sound in the brain. Such a category is not a single point but a set. Because the language environments encountered by foreign language learners and native learners are very different, the phonetic categories established in their brains also differ greatly.

In language learning, expanding the acoustic features of natural speech produces a corpus suited to the learner's brain perception. This stimulates the neural systems that have lost sensitivity to non-native speech sounds to reopen and receive the speech information fully, thereby helping learners form phonetic categories in the brain that are closer to those of native speakers.

Summary of the invention

1. Technical problem to be solved

In language learning, expanding the acoustic features of natural speech produces a corpus suited to the learner's brain perception, which stimulates the neural systems that have lost sensitivity to non-native speech sounds to reopen and receive the speech information fully, thereby helping learners form phonetic categories closer to those of native speakers. On this basis, the application provides a speech acoustic feature expansion system.

2. Technical solution

To achieve the above object, the application provides a speech acoustic feature expansion system comprising a voice acquisition unit, wherein the voice acquisition unit is connected to an audio processing unit, and the audio processing unit is connected to a video editing unit;

the voice acquisition unit is used to acquire natural speech;

the audio processing unit is used to expand the spectral features of the natural speech to different degrees in order to produce a corpus;

the video editing unit is used to edit the voice video together with the processed speech and synthesize different video clips.

Optionally, the audio processing unit comprises a MATLAB-based sound processing module.

Optionally, the MATLAB-based sound processing module comprises a formant frequency difference expansion submodule, a pitch-synchronous splicing submodule, a frequency separation submodule, a bandwidth separation submodule and a gap separation submodule.

Optionally, the MATLAB-based sound processing module comprises a sound analysis submodule and a sound synthesis submodule.

Optionally, the video editing unit comprises a format processing module and a frame rate processing module.

Optionally, the audio processing unit is used to expand the spectral features of the speech to three different degrees, namely 300%, 208% and 144%, in order to produce a corpus.

3. Beneficial effects

Compared with the prior art, the speech acoustic feature expansion system provided by the application has the following beneficial effects:

In the speech acoustic feature expansion system provided by the application, the voice acquisition unit, the audio processing unit and the video editing unit are connected; the spectral features of natural speech are expanded and the result is made into video. By simulating the acoustic features of the speech that children are exposed to when learning language, the system produces a corpus suited to the learner's brain perception in order to stimulate the brain, so that a brain whose sensitivity to foreign speech sounds has declined can clearly perceive the physical acoustic features of the speech, establish native-like phonetic categories, and thereby pronounce more accurately.

Description of the drawings

Fig. 1 is a schematic diagram of the principle of the speech acoustic feature expansion system of the application;

In the figure: 1 - voice acquisition unit; 2 - audio processing unit; 3 - video editing unit; 4 - MATLAB-based sound processing module; 5 - formant frequency difference expansion submodule; 6 - pitch-synchronous splicing submodule; 7 - frequency separation submodule; 8 - bandwidth separation submodule; 9 - gap separation submodule; 10 - sound analysis submodule; 11 - sound synthesis submodule; 12 - format processing module; 13 - frame rate processing module.

Detailed description of the embodiments

Hereinafter, specific embodiments of the application are described in detail with reference to the accompanying drawing. From this detailed description, a person of ordinary skill in the art can clearly understand and implement the application. Without departing from the principle of the application, features of different embodiments may be combined to obtain new embodiments, or certain features of some embodiments may be substituted to obtain other preferred embodiments.

In the phonetic units of child-directed speech, the vibration frequency of the vocal cords and the resonant frequencies of the oral, laryngeal and nasal cavities are exaggerated, and the gaps between the formants that characterize vowels are also artificially widened. This exaggeration not only makes it easier for infants to discriminate phonetic units, but also lets them experience the key phonetic features that distinguish word meanings in the mother tongue. A mother's voice when speaking to her child has great elasticity and variability, and this variation helps the infant build effective acoustic patterns for classifying speech sounds, that is, establish in the brain a native phonetic category for each phoneme. Brain science has found that infants' acquisition of native speech sounds has the following characteristics: 1) infants have the opportunity to hear the voices of many different people; 2) they have the opportunity to see the mouth shapes of different people as they pronounce; 3) when a mother speaks to her infant, the vibration frequency of the vocal cords and the resonant frequencies of the oral, laryngeal and nasal cavities are exaggerated. These three factors greatly help infants improve their ability to discriminate differences between phonemes and establish comprehensive native phonetic categories.

Corpus material is language material. It is the subject matter of linguistic research and the basic unit that makes up a corpus.

Child-directed speech (Motherese, or "mother's language") is the language that adults, especially mothers, use when speaking to infants and young children. Its content and form (wording, intonation, speech rate and so on) are adapted to the child's linguistic and cognitive abilities and take into account what the infant can understand and accept. Studies have shown that, in terms of speech sounds, child-directed speech has physical acoustic features that are expanded relative to normal speech.

Referring to Fig. 1, the application provides a speech acoustic feature expansion system comprising a voice acquisition unit 1, wherein the voice acquisition unit 1 is connected to an audio processing unit 2, and the audio processing unit 2 is connected to a video editing unit 3;

the voice acquisition unit 1 is used to acquire natural speech;

the audio processing unit 2 is used to expand the spectral features of the natural speech to different degrees in order to produce a corpus;

the video editing unit 3 is used to edit the voice video together with the processed speech and synthesize different video clips.

Optionally, the audio processing unit 2 comprises a MATLAB-based sound processing module 4.

Optionally, the MATLAB-based sound processing module 4 comprises a formant frequency difference expansion submodule 5, a pitch-synchronous splicing submodule 6, a frequency separation submodule 7, a bandwidth separation submodule 8 and a gap separation submodule 9.

Optionally, the MATLAB-based sound processing module 4 comprises a sound analysis submodule 10 and a sound synthesis submodule 11. Here, the sound analysis submodule 10 analyzes the acquired sound, after which the sound synthesis submodule 11 synthesizes a new sound.

Optionally, the video editing unit 3 comprises a format processing module 12 and a frame rate processing module 13.

Optionally, the audio processing unit 2 is used to expand the spectral features of the speech to three different degrees, namely 300%, 208% and 144%, in order to produce a corpus.
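The connection of the three units can be pictured as a simple chain: acquire once, process once per expansion degree, then edit each processed version into a clip. The following is only a wiring sketch under stated assumptions; `CorpusItem`, `run_pipeline` and the three callables are hypothetical names, and the stand-in functions in the usage example do not perform any real spectral processing or video editing.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Expansion degrees named in the text (300%, 208%, 144%) as multipliers.
EXPANSION_LEVELS = (3.00, 2.08, 1.44)


@dataclass
class CorpusItem:
    level: float        # expansion degree used for this item
    audio: List[float]  # processed samples (placeholder representation)
    clip: str           # label of the synthesized video clip (placeholder)


def run_pipeline(
    acquire: Callable[[], Sequence[float]],
    expand: Callable[[Sequence[float], float], List[float]],
    edit: Callable[[List[float], float], str],
    levels: Sequence[float] = EXPANSION_LEVELS,
) -> List[CorpusItem]:
    """Chain the units: voice acquisition -> audio processing -> video editing."""
    speech = list(acquire())                 # unit 1: obtain natural speech
    items = []
    for level in levels:
        processed = expand(speech, level)    # unit 2: expand spectral features
        clip = edit(processed, level)        # unit 3: synthesize a video clip
        items.append(CorpusItem(level, processed, clip))
    return items


if __name__ == "__main__":
    # Stand-ins only: scaling samples is NOT spectral expansion.
    demo = run_pipeline(
        acquire=lambda: [0.0, 1.0],
        expand=lambda samples, k: [k * s for s in samples],
        edit=lambda audio, k: f"clip@{k}",
    )
    for item in demo:
        print(item)
```

Passing the units in as callables keeps the sketch independent of any particular audio or video tool.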

Embodiment

The key distinguishing acoustic elements of the target speech contrast are amplified. For each pair of speech sounds to be trained, the physical parameters used to process the natural sound are determined from the acoustic features that distinguish the two sounds.
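To make the idea of expanding a distinguishing parameter concrete, the sketch below scales the distance between two values of one acoustic parameter (for instance, a formant frequency of each sound in a pair) about their midpoint. Several things here are assumptions rather than statements from the application: that a degree such as 144% acts as a multiplier on the distance between the two sounds, that the scaling is symmetric about the midpoint, and the function names `expand_separation` and `training_levels`. The frequencies in the usage example are illustrative numbers only.

```python
# Expansion degrees from the text (144%, 208%, 300%), plus 1.00 for the
# unprocessed original. Treating a percentage as a multiplier on the
# distance between two sounds is an assumption of this sketch.
LEVELS = (1.00, 1.44, 2.08, 3.00)


def expand_separation(value_a, value_b, factor):
    """Scale the distance between two parameter values about their midpoint."""
    midpoint = (value_a + value_b) / 2.0
    half_gap = (value_b - value_a) / 2.0
    return midpoint - factor * half_gap, midpoint + factor * half_gap


def training_levels(value_a, value_b, levels=LEVELS):
    """Return one (a, b) pair per level, from original to most expanded."""
    return [expand_separation(value_a, value_b, k) for k in levels]


if __name__ == "__main__":
    # Illustrative formant frequencies in Hz, not values from the application.
    for k, (a, b) in zip(LEVELS, training_levels(1600.0, 2600.0)):
        print(f"{k:.2f}x: {a:.0f} Hz vs {b:.0f} Hz, gap {b - a:.0f} Hz")
```

In a real system the expanded targets would still have to be imposed on the recording by resynthesis; this sketch only computes the target values.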

After the voice acquisition unit 1 obtains a natural recording, the recording is sent to the audio processing unit 2, where the MATLAB sound processing module 4 amplifies the spectral features of the sound to three different degrees, namely 300%, 208% and 144%. Together with the original sound, these form a training corpus with four levels. For the English /r-l/ pair, for example, the three parameters are F3 frequency separation, F3 bandwidth and F3 transition time. During synthesis, the formant frequency difference expansion submodule 5 amplifies the formant frequency difference of /r-l/ and reduces the F3 bandwidth. The temporal characteristics of /r-l/ are amplified with a time warping technique, which is applied by the pitch-synchronous splicing submodule 6. As another example, for the English vowel pair /i-I/, the frequency separation submodule 7, the bandwidth separation submodule 8 and the gap separation submodule 9 set the frequency separation and bandwidths of F1 and F2 and adjust the gap between F1 and F2.

During production, the "LPC Analysis and Synthesis of Speech" submodule in the MATLAB sound processing module 4 is used. LPC stands for linear predictive coding. The submodule comprises the sound analysis submodule 10 and the sound synthesis submodule 11 and can analyze sounds and synthesize new ones. (For operation, see: DSP System Toolbox™ functionality available at the command line.)

After sound processing, Final Cut Pro 7 is used. It comprises the format processing module 12 and the frame rate processing module 13 and can mix and match different formats and frame rates on the timeline. The voice video is turned into synchronized slow-motion versions with time-stretched tracks, then combined and edited with the processed sound and synthesized into different video clips, which serve as the corpus for subsequently producing training software.
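The LPC analysis and synthesis step named in this embodiment for the audio stage can be illustrated in a few lines. The sketch below uses the autocorrelation method with the Levinson-Durbin recursion for analysis and an all-pole filter for synthesis. It is a generic textbook formulation written in plain Python, not the code of the MATLAB example, and the function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err) where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k                  # remaining prediction error
    return a, err


def synthesize(a, excitation):
    """All-pole synthesis: y[n] = e[n] - sum_j a[j] * y[n - j]."""
    y = []
    for n, e in enumerate(excitation):
        past = sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y.append(e - past)
    return y


if __name__ == "__main__":
    # A decaying exponential is a one-pole signal, so a[1] should be near -0.9.
    signal = [0.9 ** n for n in range(200)]
    coeffs, error = levinson_durbin(autocorrelation(signal, 2), 2)
    print(coeffs, error)
```

For speech, the analysis would normally be run frame by frame on windowed samples. Because the roots of A(z) correspond to the resonances of the modelled vocal tract, the coefficients give a handle on formant frequencies and bandwidths, which is presumably why an analysis and synthesis pair suits the kind of spectral modification described here.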

In the speech acoustic feature expansion system provided by the application, the voice acquisition unit, the audio processing unit and the video editing unit are connected; the spectral features of the speech are expanded and the result is made into video. By simulating the acoustic features of the speech that children are exposed to when learning language, the system produces a corpus suited to the learner's brain perception in order to stimulate the brain, so that a brain whose sensitivity to foreign speech sounds has declined can clearly hear and perceive the physical acoustic features of the speech, establish native-like phonetic categories, and thereby pronounce more accurately.

Although the application has been described above with reference to specific embodiments, a person of ordinary skill in the art should understand that many modifications may be made to the configurations and details disclosed herein within the principle and scope of the application. The scope of protection of the application is determined by the appended claims, which are intended to cover all modifications falling within the literal meaning or range of equivalents of the technical features recited in the claims.

Claims

1. A speech acoustic feature expansion system, characterized by comprising a voice acquisition unit, wherein the voice acquisition unit is connected to an audio processing unit, and the audio processing unit is connected to a video editing unit;

the voice acquisition unit is used to acquire natural speech;

the audio processing unit is used to expand the spectral features of the natural speech to different degrees in order to produce a corpus;

2. The speech acoustic feature expansion system according to claim 1, characterized in that the audio processing unit comprises a MATLAB-based sound processing module.

3. The speech acoustic feature expansion system according to claim 2, characterized in that the MATLAB-based sound processing module comprises a formant frequency difference expansion submodule, a pitch-synchronous splicing submodule, a frequency separation submodule, a bandwidth separation submodule and a gap separation submodule.

4. The speech acoustic feature expansion system according to claim 2, characterized in that the MATLAB-based sound processing module comprises a sound analysis submodule and a sound synthesis submodule.

5. The speech acoustic feature expansion system according to any one of claims 1 to 4, characterized in that the video editing unit comprises a format processing module and a frame rate processing module.

6. The speech acoustic feature expansion system according to claim 5, characterized in that the audio processing unit is used to expand the spectral features of the speech to three different degrees, namely 300%, 208% and 144%, in order to produce a corpus.