CN102799759A - Vocal tract morphological standardization method during large-scale physiological pronunciation data processing - Google Patents
- Publication number
- CN102799759A (application CN201210196547A)
- Authority
- CN
- China
- Prior art keywords
- thin plate
- point
- data
- physiology
- coordinate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Processing Or Creating Images (AREA)
Abstract
The invention discloses a method for morphological standardization (normalization) of the vocal tract in large-scale physiological pronunciation (articulatory) data processing. The method comprises the following steps: first establishing a template, then calibrating landmark points, and finally determining the parameters of each speaker's thin-plate spline function from the correspondence between the landmark points of the template and those of the speaker model. Compared with the traditional linearization-based normalization of the prior art, the method normalizes the articulatory data by morphologically normalizing the vocal tracts of different speakers, while preserving the kinematic characteristics and spatial relationships of articulation.
Description
Technical field
The present invention relates to the field of morphological analysis of speech articulation, and in particular to a method for morphological normalization of the vocal tract in large-scale articulatory data processing.
In research on speech articulation, differences in vocal tract shape between subjects make it very difficult to study and model the essential movement characteristics of articulation. When large-scale data are processed in particular, normalizing the data of different speakers by hand is impractical. A method for normalizing vocal tract shape based on the thin-plate spline function is therefore proposed. Compared with the widely used linearization-based normalization, the method effectively reduces morphological differences while preserving each subject's individual characteristics. The method plays an important role in processing large-scale articulatory data.
Background technology
Phonetics is the discipline that studies the sounds of human language. It has two main branches: articulatory phonetics, which studies how the vocal organs act during articulation, and acoustic phonetics, which studies the acoustic properties of speech. Early phonetics concentrated on the acoustic properties of speech; today a growing number of researchers study the mechanisms of the vocal organs during articulation. However, articulatory speech databases are still not exploited as fully as acoustic data are in acoustic speech research. Besides the relative difficulty of acquiring articulatory data, vocal tract shape differs from subject to subject. Eliminating these differences requires morphological normalization of vocal tract shape, yet normalization techniques remain a bottleneck in articulatory research. Normalizing the articulatory data of different speakers to reduce their morphological differences is therefore an indispensable step in uncovering the essence of articulation and the kinematic characteristics hidden behind individual speakers. After normalization, the articulatory model should not only show reduced morphological differences between individuals but also retain the kinematic characteristics of the vocal organs during articulation, which facilitates the simulation of articulation.
Because the vocal tract changes shape considerably during articulation, it is very difficult to normalize it using only a simple affine transformation of a rigid object. Researchers in articulation have proposed several vocal tract normalization methods, but all of them are based on linearization. Beckman et al. [1] linearize the vocal tract wall to transform the coordinates of the recorded MRI data and thereby normalize the data. Hashi et al. [2] built an X-ray database after normalizing the vocal tract posture during vowel production. Both methods normalize the vocal tract along its length by linearizing the palate wall contour. Although linearization reduces the differences between speakers, the data in [6] show that morphological differences between speakers are related not only to vocal tract length but also closely to the sizes of the front and back cavities of the vocal tract. After normalization, vocal tract linearization loses both the relative spatial position of the palate and tongue surface contours and the nonlinear relationships in the morphological differences between speakers. For data collected where the vocal tract deforms strongly in particular, important nonlinear relationships are lost, and the differences between individuals along the x axis may even increase.
In image registration and shape matching, a non-rigid normalization method based on the thin-plate spline (TPS) mapping function is widely used. It can effectively solve the problems that arise in the linearization-based normalization described above.
Earlier normalization methods all rely on linearization to normalize the vocal tract, and they therefore lose the relative positions within the articulatory space and the nonlinear kinematic characteristics. To avoid losing vocal tract shape information, this work proposes a thin-plate-spline-based method for normalizing EMMA articulatory data across speakers. EMMA articulatory data from three speakers are used; the palate and tongue contours of the three speakers are averaged to obtain a normalized vocal tract template. An existing grid system is then used to calibrate landmark points in each of the three speakers' articulatory spaces and in the averaged template space. The thin-plate spline transformation function is determined from the correspondence between the landmark points in each speaker's articulatory space and those in the template space. This function is then used to transform the coordinates of the articulatory data, thereby achieving normalization.
Summary of the invention
In view of the problems of the prior art described above, the present invention proposes a method for morphological normalization of the vocal tract in large-scale articulatory data processing. The method determines, for each speaker, the thin-plate spline function that transforms coordinates from that speaker's articulatory space to the template articulatory space. This preserves the relative position of the palate and tongue in the articulatory space as well as the nonlinear kinematic characteristics of the organs during articulation. Through normalization, the morphological differences in the vocal tract between individuals are reduced.
A non-rigid normalization method based on the thin-plate spline mapping function is used to overcome the shortcomings of the linearization-based normalization described above.
The method for morphological normalization of the vocal tract in large-scale articulatory data processing proposed by the present invention is characterized in that it comprises the following steps: establishing a template, calibrating landmark points, and determining the thin-plate spline function parameters.
Compared with the prior art, that is, with traditional linearization-based normalization, the present invention morphologically normalizes the vocal tracts of different speakers. It normalizes the articulatory data while preserving the kinematic characteristics and spatial relationships of articulation, so that the essential movements of the organs during articulation can be analyzed without having to account for differences between individuals.
Description of drawings
Fig. 1 shows the landmark points of the three subjects and of the template;
Fig. 2 shows the raw data before normalization, each subplot showing the data for one vowel;
Fig. 3 shows the data after normalization;
Fig. 4 shows the vocal tract data after normalization with the linearization method;
Fig. 5 compares the standard deviations of the subjects' raw data and normalized data;
Fig. 6 shows each subject's vowel articulation before and after normalization.
Embodiment
The present invention proposes a method based on the thin-plate spline mapping function for normalizing EMMA data across subjects. EMMA data from three speakers are used; the palate and tongue contours of the three speakers are averaged to obtain a normalized vocal tract template. An existing grid system is then used to calibrate landmark points in each of the three speakers' articulatory spaces and in the averaged template space. The thin-plate spline transformation function is determined from the correspondence between the landmark points in each speaker's articulatory space and those in the template space, and this function is then used to transform the coordinates, thereby achieving normalization.
The embodiments, structure, features and effects of the invention are described in detail below with reference to the accompanying drawings and preferred embodiments.
Normalization based on the thin-plate spline function requires three steps: first, establishing the template; second, calibrating the landmark points; and third, determining the parameters of each speaker's thin-plate spline function from the correspondence between the landmark points in the template and in the speaker model.
Establishing the template
The present invention uses the EMMA database from NTT, which contains articulatory and acoustic speech data from three speakers. The database holds the vocal tract contour data captured by an electromagnetic articulograph. The palate and tongue surface contour curves of the three speakers are averaged to remove the morphological differences between individual vocal tracts, yielding a mean vocal tract contour that serves as the reference template for normalization.
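The averaging step might be sketched as follows. The source does not say how the contours are sampled before averaging, so this sketch assumes each contour is a 2D polyline that is first resampled to the same number of points, equally spaced along its length; the function names `resample_by_arclength` and `average_contour` are hypothetical.

```python
import numpy as np

def resample_by_arclength(curve, n_points):
    """Resample a 2D polyline of shape (k, 2) to n_points equally spaced along its length."""
    curve = np.asarray(curve, dtype=float)
    seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative arc length
    t = np.linspace(0.0, s[-1], n_points)
    x = np.interp(t, s, curve[:, 0])
    y = np.interp(t, s, curve[:, 1])
    return np.column_stack([x, y])

def average_contour(curves, n_points=50):
    """Point-wise mean of several contours (assumed sampling scheme, not from the source)."""
    resampled = [resample_by_arclength(c, n_points) for c in curves]
    return np.mean(resampled, axis=0)
```

Calling `average_contour` once on the three palate curves and once on the three tongue surface curves would give the two template contours under these assumptions.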
Calibrating the landmark points
The data recorded by EMMA are two-dimensional. Unlike imaging systems such as MRI and X-ray, EMMA does not record three-dimensional articulatory data and cannot clearly capture the spatial deformation of the vocal tract organs during articulation. Beautemps et al. proposed in 1995 a model for vowels and fricatives that derives the vocal tract area function from midsagittal profiles and formant frequencies [9]. To address this problem, we use a grid system obtained by modifying that model to mark landmark points in the vocal tract spaces of the three subjects and of the template, so that the spatial movement at different positions during articulation can be determined accurately. First, each speaker's tongue surface contour curve and palate contour curve are determined. For each speaker, the movement regions of sensors T1 to T4 over all vowel productions in the articulatory database (more than 1,000 in total) are averaged to find the center of each sensor region; the line connecting the four centers is that speaker's tongue surface contour curve. The palate curve is likewise obtained by averaging. The position of the central point is then determined from the spatial position of the tongue surface contour curve. Once the tongue surface contour curve and the central point have been determined, the whole articulatory space is divided from the central point into ten sectors of equal angle, bounded by 11 rays. Each ray intersects the previously determined palate curve and tongue surface curve at two points. Taking the midpoint between the two intersection points on each ray gives 11 points, and the line connecting them is the median curve. The points on each ray 1 cm below its intersection with the tongue surface give another 11 points, and the line connecting them is the sub-tongue-surface curve. In total 44 points are obtained, and these serve as the landmark points of the articulatory vocal tract space. Because the landmark points are defined in exactly the same way for the three speakers and for the template, the landmark points marked at a given position by the same procedure in the four spaces shown in Fig. 1 are considered to correspond one to one and to carry the same label.
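A rough sketch of this grid construction is given below. Several details are assumptions rather than part of the source: the angular range of the fan (`theta_start`, `theta_end`) is supplied by the caller, both curves are assumed to be single-valued in polar angle around the central point and not to cross the ±π branch cut, coordinates are in centimetres, and "1 cm below the tongue surface" is taken to mean 1 cm toward the central point along the ray. The function names are hypothetical.

```python
import numpy as np

def radius_along_rays(curve, center, angles):
    """Radius at which each ray from `center` meets a polyline (polar interpolation)."""
    d = np.asarray(curve, dtype=float) - center
    theta = np.arctan2(d[:, 1], d[:, 0])
    r = np.hypot(d[:, 0], d[:, 1])
    order = np.argsort(theta)                 # np.interp needs increasing angles
    return np.interp(angles, theta[order], r[order])

def grid_landmarks(palate, tongue, center, theta_start, theta_end,
                   n_sectors=10, depth=1.0):
    """Return 4 * (n_sectors + 1) landmarks: palate, median, tongue, sub-tongue."""
    center = np.asarray(center, dtype=float)
    angles = np.linspace(theta_start, theta_end, n_sectors + 1)   # 11 rays
    unit = np.column_stack([np.cos(angles), np.sin(angles)])
    r_pal = radius_along_rays(palate, center, angles)
    r_ton = radius_along_rays(tongue, center, angles)
    # Midpoint of the two intersections on a ray lies at the mean radius.
    radii = [r_pal, 0.5 * (r_pal + r_ton), r_ton, r_ton - depth]
    return np.vstack([center + r[:, None] * unit for r in radii])
```

With ten sectors the function returns 44 points, ordered curve by curve, so the same index refers to the same landmark in every speaker space and in the template space.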
Determining the thin-plate spline function parameters
The whole of articulation is produced by the elastic contraction of the tongue and the movement of the jaw. Consequently, after normalization by a rigid affine transformation, the vocal tract shapes of different speakers do not match well. For example, normalizing vocal tract shape with the linearization method not only loses much useful information but actually increases the morphological differences between individuals in the x-axis data. The thin-plate spline transformation function used here is a non-rigid (elastic) transformation, and it guarantees that the transformation is smooth globally.
The thin-plate warping function is determined from the one-to-one correspondence between the landmark points given by the grid system above. Suppose there are n coordinate points in the two-dimensional vocal tract coordinate system. The thin-plate spline warp is described by 2(n+3) parameters: 6 global linear parameters and 2n nonlinear parameters associated with the n landmark points. Half of the parameters describe the x direction and the other half the y direction. These 2(n+3) parameters can be determined by the linear system given in [7]. Let $(x_i, y_i)$, $i = 1, \dots, n$, denote the n landmark points in the plane; in this experiment n = 44. Let $v_i$, $i = 1, 2, \dots, n$, denote the corresponding function values obtained by substituting the coordinates of the landmark points into the thin-plate spline function. The thin-plate spline interpolation function $f(x, y)$ thus represents the mapping $(x_i, y_i) \mapsto v_i$. It is defined as follows:

$$f(x, y) = a_1 + a_2 x + a_3 y + \sum_{i=1}^{n} w_i \, r_i^2 \ln r_i^2 \tag{1}$$
In equation (1), $a_1 + a_2 x + a_3 y$ is the linear transformation and $\sum_{i=1}^{n} w_i \, r_i^2 \ln r_i^2$ is the nonlinear transformation. Here $r_i^2 = (x - x_i)^2 + (y - y_i)^2$ is the squared distance from the point to be transformed in each speaker's space to the i-th landmark point, and x and y are the arguments of $f(x, y)$, that is, the coordinates of the point to be transformed. Equation (1) describes the deformation of a thin plate of infinite extent under loads centered at the landmark coordinates $(x_i, y_i)$. The weight of the load centered at $(x_i, y_i)$ is $w_i$ [7]; this is the parameter described by Zagorchev in the 2006 paper "A comparative study of transformation functions for nonrigid image registration". The thin-plate spline interpolation function thus has two parts: a linear part described by the first three terms, and a remaining part that describes the nonlinear warp of the spline. The values of $a_1$, $a_2$, $a_3$ and $w_i$, and hence the thin-plate spline function, are determined by requiring that the bending energy $E_f$ of the interpolation function f be minimal, together with the one-to-one correspondence of the landmark coordinates. $E_f$ is defined as follows:

$$E_f = \iint_{\mathbb{R}^2} \left[ \left( \frac{\partial^2 f}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 f}{\partial x \, \partial y} \right)^2 + \left( \frac{\partial^2 f}{\partial y^2} \right)^2 \right] dx \, dy \tag{2}$$

Equation (2) is the bending energy. When $E_f$ is minimal, the transformation performed by $f(x, y)$ has the least bending and is as close as possible to a transformation of a flat plate.
The three constraints are as follows:

$$\sum_{i=1}^{n} w_i = 0 \tag{3}$$

$$\sum_{i=1}^{n} w_i x_i = 0 \tag{4}$$

$$\sum_{i=1}^{n} w_i y_i = 0 \tag{5}$$

Constraint (3) states that the sum of all loads applied to the plate must be zero, which keeps the plate stationary rather than moving under the applied loads. Constraints (4) and (5) require that the moments about the x and y axes be zero, so that the plate does not rotate under the applied loads.
The TPS parameter vector a contains the three components $a_1$, $a_2$ and $a_3$, and the vector w contains the components $w_i$. The two vectors are computed from the following linear equation:

$$\begin{bmatrix} A & P \\ P^{T} & O \end{bmatrix} \begin{bmatrix} w \\ a \end{bmatrix} = \begin{bmatrix} v \\ 0 \end{bmatrix} \tag{6}$$

where the entries of A are the kernel values $r_{ij}^2 \ln r_{ij}^2$ and $r_{ij}$ is the distance between point i and point j. In equation (6), A is built from the n landmark points and is therefore n × n. When the spline is evaluated, $i = 1, \dots, n$ runs over the landmark points and $j = 1, \dots, m$ runs over the raw data points to be transformed, where n is the number of landmark points and m is the number of raw data points; because every point to be transformed is evaluated with each landmark point as load center, there are n × m values $r_{ij}$. The i-th row of matrix P is the three-element vector $(1, x_i, y_i)$. O is the 3 × 3 zero matrix, and the 0 on the right-hand side of equation (6) is a three-element zero vector. w, a and v are the vectors formed from the components $w_i$; $a_1$, $a_2$, $a_3$; and $v_i$, respectively. In what follows, the (n+3) × (n+3) matrix on the left-hand side of the equation is denoted K.
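The block system $[[A, P], [P^T, O]]\,[w; a] = [v; 0]$ can be assembled and solved numerically as in the following sketch. It assumes the $r^2 \ln r^2$ kernel with value 0 on the diagonal, landmark points that are distinct and not all collinear (so that K is invertible), and a dense solver; `fit_tps` and `tps_kernel` are hypothetical names, not part of the source.

```python
import numpy as np

def tps_kernel(r2):
    """Element-wise r^2 ln r^2 on an array of squared distances, with 0 at r2 = 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(landmarks, v):
    """Solve for the weights w and linear coefficients a of one coordinate of the warp."""
    landmarks = np.asarray(landmarks, dtype=float)
    n = len(landmarks)
    diff = landmarks[:, None, :] - landmarks[None, :, :]
    A = tps_kernel(np.sum(diff ** 2, axis=-1))          # n x n kernel matrix
    P = np.column_stack([np.ones(n), landmarks])        # rows (1, x_i, y_i)
    K = np.block([[A, P], [P.T, np.zeros((3, 3))]])     # (n+3) x (n+3)
    rhs = np.concatenate([np.asarray(v, dtype=float), np.zeros(3)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:], K
```

The last three rows of the system are exactly the three constraints on the weights, so a solution of the system satisfies them automatically.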
In this study, the correspondence between the landmark points $(x_i, y_i)$ of each subject's EMMA data and the landmark points $(x'_i, y'_i)$ defined in the template is given, and the focus is the mapping from the coordinates $(x_j, y_j)$ of the EMMA data to the corresponding coordinates $(x'_j, y'_j)$ in the template coordinate system. What matters, therefore, is the set of 2D points obtained by warping with the thin-plate spline function defined by the many control points. For this purpose, the x and y coordinates of the data are mapped by separate TPS functions. The thin-plate spline warp that maps $(x_i, y_i)$ to $(x'_i, y'_i)$ follows from equation (6) and is recovered by:

$$\begin{bmatrix} w_x & w_y \\ a_x & a_y \end{bmatrix} = K^{-1} \begin{bmatrix} x' & y' \\ 0 & 0 \end{bmatrix} \tag{7}$$

where $x'$ and $y'$ are the vectors formed from the components $x'_i$ and $y'_i$, respectively; $w_x$ and $a_x$ are the parameters for the x axis, and $w_y$ and $a_y$ are the parameters for the y axis. The transformation of a point $(x_j, y_j)$ to the coordinates $(x'_j, y'_j)$ is given by:

$$\begin{bmatrix} x' & y' \end{bmatrix} = \begin{bmatrix} B & Q \end{bmatrix} \begin{bmatrix} w_x & w_y \\ a_x & a_y \end{bmatrix} \tag{8}$$

where $B_{ji} = r_{ij}^2 \ln r_{ij}^2$, with $r_{ij}$ the distance between landmark point $(x_i, y_i)$ and data point $(x_j, y_j)$, $i = 1, \dots, n$, $j = 1, \dots, m$. The j-th row of matrix Q is the vector $(1, x_j, y_j)$, and the j-th rows of the resulting vectors $x'$ and $y'$ (here of length m) are the values $x'_j$ and $y'_j$ obtained by substituting $x_j$ and $y_j$ into the formula.
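Fitting the two coordinate functions and applying them to the raw data points can be combined into a single routine, sketched below under the same assumptions as before ($r^2 \ln r^2$ kernel, distinct non-collinear landmark points, dense solver). The name `tps_warp` and its helper functions are hypothetical; `src_landmarks` stands for a speaker's landmark points and `dst_landmarks` for the corresponding template landmark points.

```python
import numpy as np

def _kernel(r2):
    """Element-wise r^2 ln r^2 on squared distances, with 0 at r2 = 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def _sqdist(p, q):
    """Matrix of squared distances between the rows of p and the rows of q."""
    d = p[:, None, :] - q[None, :, :]
    return np.sum(d ** 2, axis=-1)

def tps_warp(src_landmarks, dst_landmarks, points):
    """Map `points` from the source space to the destination space with a 2D TPS."""
    src = np.asarray(src_landmarks, dtype=float)
    dst = np.asarray(dst_landmarks, dtype=float)
    pts = np.asarray(points, dtype=float)
    n = len(src)
    P = np.column_stack([np.ones(n), src])
    K = np.block([[_kernel(_sqdist(src, src)), P], [P.T, np.zeros((3, 3))]])
    # Columns of `params` hold [w_x; a_x] and [w_y; a_y].
    params = np.linalg.solve(K, np.vstack([dst, np.zeros((3, 2))]))
    B = _kernel(_sqdist(pts, src))                      # m x n kernel values
    Q = np.column_stack([np.ones(len(pts)), pts])       # rows (1, x_j, y_j)
    return np.hstack([B, Q]) @ params
```

A useful sanity check is that when the destination landmarks are an affine image of the source landmarks, the warp reproduces that affine map for every point, because the nonlinear weights vanish.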
The following example illustrates how the thin-plate-spline-based method described above is applied in articulatory research, and evaluates the effect of normalization in both the articulatory and the acoustic domain.
Normalization example for articulatory data
To study the roles of the different vocal organs in articulation, feature points that represent their state of motion must be selected. Although there is no explicit rule for this, on the basis of experience we selected four positions on the tongue surface, from the tongue tip to the tongue root, as the feature points of movement during articulation. In the experiment, four sensors T1 to T4 were placed at these feature points as measurement points, and their trajectories were captured by EMMA. Articulation is studied through the trajectories of these feature points. Differences between individuals, however, show up not only as differences in vocal tract contour but also as scattered, non-overlapping movement regions of the feature points. To study the movement characteristics of the vocal organs while eliminating the differences between individuals, we apply our nonlinear normalization method. The procedure is as follows:
First, the vocal tract contour curves of the three subjects are averaged to obtain the template vocal tract contour for normalization. The known grid technique is then used to mark landmark points in the vocal tract spaces of the three subjects and of the template. Four curves are selected: the palate line, the median line, the tongue surface line and the sub-tongue-surface line. With the tongue surface central point as center, the articulatory movement space is divided into ten sectors, and the intersections of the sector edges with the four curves are the landmark points. From the correspondence between landmark points, the thin-plate spline warp parameters from each subject to the template are then determined, which gives the normalization. The data in Fig. 2, which differ considerably between subjects, are coordinate-transformed by the normalization. In the result, shown in Fig. 3, the subjects' vocal tract shapes and movement trajectories almost coincide, demonstrating that individual differences are markedly reduced.
Evaluation of the acoustic and articulatory effects of normalization
To evaluate the method, we chose as a baseline the palate-wall linearization method, which is the dominant method in this field. Fig. 4 shows the result of applying palate-wall linearization to the same articulatory data. Compared with the TPS-based result in Fig. 3, the result obtained by palate-wall linearization shows larger differences in spatial distribution. Fig. 5 shows the standard deviations of the tongue surface sensor data for the raw data, for the data normalized by the linearization method and for the data normalized by the TPS-based method. After TPS normalization, the standard deviation of the data over the subjects' sensors and the five vowels decreased by 0.8 mm in the x direction and by 2.4 mm in the y direction. Fig. 5 also shows that, compared with the raw data, the data obtained by palate-wall linearization have a larger standard deviation in the x direction. This is because the linearization method straightens the palate wall and keeps the observation points perpendicular to it, which increases the dispersion of the data in the x direction.
To evaluate the acoustic effect of the TPS-based normalization, we examined the acoustic features of the raw data and of the normalized data. The normalized data were used as input to an articulatory model to generate complete vocal tract shapes. The data for each vowel in 320 different contexts were then synthesized, and the first three formants were computed and compared with the original sounds in the EMMA database. Table 1 shows the mean first three formants of the EMMA data and the mean formants of the synthesized vowels. The results show that the TPS-based normalization preserves the acoustic properties of the vowels.
To assess whether the dynamic characteristics of speech are preserved after normalization, Fig. 6 plots the vowels in each subject's raw data and normalized data. The vowel structure obtained after normalizing each subject's T1 to T4 data is clearly very similar in shape to the original vowel structure. This result shows that the normalization described here preserves each speaker's individual characteristics while reducing the differences between speakers caused by vocal organ morphology.
In the standard deviation comparison of Fig. 5, the left bar represents the standard deviation of the original EMMA data, the middle bar that of the data normalized by the linearization method, and the right bar that of the data normalized by the TPS method.
Table 1. Means and standard deviations of the formants
Compared with traditional linearization-based normalization, the normalization of the present invention not only eliminates the morphological differences between individual vocal organs but also preserves the individual kinematic characteristics of each speaker's articulation. Using the method described above, we normalized the trajectories of the feature points at the four sensor positions from tongue tip to tongue root on each subject's tongue surface during articulation. Before normalization, as Fig. 2 shows, the subjects differ markedly in the position of the vocal tract contour curves, and the movement regions of the feature points during articulation are widely scattered. After normalization with the nonlinear thin-plate-spline-based method, as Fig. 3 shows, the subjects' palate curves coincide and the movement regions of the feature points lie in almost the same region of space. The relative position of the palate and tongue surface in the vocal tract model is thus preserved, and the individual kinematic characteristics of each vocal organ during articulation are not lost.
References:
[1] M. E. Beckman, T.-P. Jung, S.-H. Lee, K. de Jong, A. K. Krishnamurthy, S. C. Ahalt, K. B. Cohen, and M. J. Collins, "Variability in the production of quantal vowels revisited," J. Acoust. Soc. Am., vol. 97, pp. 471-490, 1995.
[2] M. Hashi, J. R. Westbury, and K. Honda, "Vowel posture normalization," J. Acoust. Soc. Am., vol. 104, pp. 2426-2437, 1998.
[3] F. L. Bookstein, "Principal warps: Thin-plate splines and the decomposition of deformations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, pp. 567-585, 1989.
[4] T. Okadome and M. Honda, "Generation of articulatory movements by using a kinematic triphone model," J. Acoust. Soc. Am., pp. 453-463, 2001.
[5] J. Dang and K. Honda, "Construction and control of a physiological articulatory model," J. Acoust. Soc. Am., vol. 115, pp. 853-870, 2004.
[6] C.-S. Yang and H. Kasuya, "Uniform and non-uniform normalization of vocal tracts measured by MRI across male, female and child," IEICE Trans. Inf. & Syst., vol. E78-D, no. 6, pp. 732-737, 1995.
[7] L. Zagorchev and A. Goshtasby, "A comparative study of transformation functions for nonrigid image registration," IEEE Trans. Image Processing, vol. 15, pp. 529-538, 2006.
[8] J. Lim and M. H. Yang, "A direct method for modeling non-rigid motion with thin plate spline," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[9] D. Beautemps, P. Badin, and R. Laboissière, "Deriving vocal-tract area function from midsagittal profiles and formants frequencies: A new model for vowels and fricative consonants based on experimental data," Speech Communication, vol. 16, pp. 27-47, 1995.
Claims (1)
1. A method for morphological normalization of the vocal tract in large-scale articulatory data processing, characterized in that the method comprises the following steps:
Step 1: obtain multiple sets of palate and tongue surface contour data from the articulatory database, and from these data compute the average shape of the vocal tract during articulation to establish the template of the method;
Step 2: use a grid system to mark landmark points on the vocal tract shape of the template from the previous step. Specifically: first, compute the average shape of the tongue surface contour curve from the data of all vowel articulations in the articulatory database, and then compute the mean position of the tongue surface central point from all tongue surface data in the database; determine the grid system from the average shape of the tongue surface contour curve and the mean position of the tongue surface central point; divide the whole grid system into ten sectors of equal size, such that the ten sectors cover the entire space of vocal tract movement during articulation and the edges of each sector intersect the palate curve, the median curve, the tongue surface curve and the sub-tongue-surface curve, yielding 44 intersection points; these 44 points serve as the landmark points of the vocal tract;
Step 3: use the one-to-one correspondence between the above landmark points and the original points in the articulatory database to determine the thin-plate spline function parameters, thereby achieving morphological normalization of the vocal tract in articulatory data processing. The specific algorithm comprises:
Suppose there are n coordinate points in the two-dimensional vocal tract coordinate system. The thin-plate spline warp is described by 2(n+3) parameters: 6 global linear parameters and 2n nonlinear parameters associated with the n landmark points. Half of the parameters describe the x direction and the other half the y direction. These 2(n+3) parameters can be determined by the linear system given in [7]. Let $(x_i, y_i)$, $i = 1, \dots, n$, denote the n landmark points in the plane; in this experiment n = 44. Let $v_i$ denote the corresponding function values obtained by substituting the coordinates of the landmark points into the thin-plate spline function. The thin-plate spline interpolation function $f(x, y)$ thus represents the mapping $(x_i, y_i) \mapsto v_i$. It is defined as follows:

$$f(x, y) = a_1 + a_2 x + a_3 y + \sum_{i=1}^{n} w_i \, r_i^2 \ln r_i^2 \tag{1}$$
In equation (1), $a_1 + a_2 x + a_3 y$ is the linear transformation and $\sum_{i=1}^{n} w_i \, r_i^2 \ln r_i^2$ is the nonlinear transformation. Here $r_i^2 = (x - x_i)^2 + (y - y_i)^2$ is the squared distance from the point to be transformed in each speaker's space to the i-th landmark point, and x and y are the arguments of $f(x, y)$, that is, the coordinates of the point to be transformed. Equation (1) describes the deformation of a thin plate of infinite extent under loads centered at the landmark coordinates $(x_i, y_i)$. The weight of the load centered at $(x_i, y_i)$ is $w_i$. The thin-plate spline interpolation function has two parts: a linear part described by the first three terms, and a remaining part that describes the nonlinear warp of the spline. The values of $a_1$, $a_2$, $a_3$ and $w_i$, and hence the thin-plate spline function, are determined by requiring that the bending energy $E_f$ of the interpolation function f be minimal, together with the one-to-one correspondence of the landmark coordinates. $E_f$ is defined as follows:

$$E_f = \iint_{\mathbb{R}^2} \left[ \left( \frac{\partial^2 f}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 f}{\partial x \, \partial y} \right)^2 + \left( \frac{\partial^2 f}{\partial y^2} \right)^2 \right] dx \, dy \tag{2}$$

Equation (2) is the bending energy. When $E_f$ is minimal, the transformation performed by $f(x, y)$ has the least bending and is as close as possible to a transformation of a flat plate.
The following three constraints apply, where $w_i$ is the weight of the load at landmark $(x_i, y_i)$:

$$\sum_{i=1}^{n} w_i = 0 \tag{3}$$

$$\sum_{i=1}^{n} w_i x_i = 0 \tag{4}$$

$$\sum_{i=1}^{n} w_i y_i = 0 \tag{5}$$

Constraint (3) states that the sum of all loads applied to the plate must be zero, so that the loaded plate remains stationary rather than moving. Constraints (4) and (5) require that the moments of the loads about the x axis and the y axis be zero, so that the plate does not rotate under the imposed loads.
The TPS parameter vector a contains the three components $a_1$, $a_2$ and $a_3$, and the vector w contains the n components $w_i$. Both vectors are obtained from the following linear equation:

$$\begin{bmatrix} A & P \\ P^{T} & O \end{bmatrix} \begin{bmatrix} w \\ a \end{bmatrix} = \begin{bmatrix} v \\ 0 \end{bmatrix} \tag{6}$$

where

$$A_{ij} = r_{ij}^2 \ln r_{ij}^2, \qquad r_{ij}^2 = (x_i - x_j)^2 + (y_i - y_j)^2, \qquad A_{ii} = 0 .$$

Here n is the number of landmarks and m is the number of raw data points to be transformed. Because every point to be transformed is evaluated with each landmark as a load center, transforming the data involves n × m values $r_{ij}$. Within the system (6) itself, A is evaluated between the n landmarks and is therefore n × n. Row i of the matrix P is the three-component vector $(1, x_i, y_i)$. O is the 3 × 3 zero matrix, and the 0 on the right-hand side of equation (6) is the three-component zero vector. w, a and v are the vectors formed from the components $w_i$; $a_1$, $a_2$, $a_3$; and $v_i$, respectively. The (n+3) × (n+3) matrix on the left-hand side of equation (6) is denoted K in the next equation.
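As a hedged illustration of how the linear system (6) might be assembled and solved, the sketch below uses NumPy. The kernel $r^2 \ln r^2$, the block layout of K, the function names `tps_kernel` and `fit_tps`, and the five toy landmarks are assumptions for illustration, not values or code from the patent.

```python
import numpy as np

def tps_kernel(r2):
    """Thin-plate kernel r^2 ln r^2 on squared distances, with value 0 at r = 0."""
    out = np.zeros_like(r2, dtype=float)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(landmarks, values):
    """Solve K [w; a] = [v; 0] for one scalar-valued thin-plate spline."""
    n = landmarks.shape[0]
    diff = landmarks[:, None, :] - landmarks[None, :, :]
    A = tps_kernel((diff ** 2).sum(axis=-1))        # n x n kernel block
    P = np.hstack([np.ones((n, 1)), landmarks])     # rows are (1, x_i, y_i)
    K = np.zeros((n + 3, n + 3))
    K[:n, :n] = A
    K[:n, n:] = P
    K[n:, :n] = P.T                                 # lower-right 3 x 3 block stays zero
    rhs = np.concatenate([values, np.zeros(3)])
    solution = np.linalg.solve(K, rhs)
    return solution[:n], solution[n:], A, P

# Toy landmarks and target values (hypothetical numbers).
landmarks = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.3]])
values = np.array([0.0, 1.0, 0.5, 2.0, 0.7])
w, a, A, P = fit_tps(landmarks, values)

print(np.allclose(A @ w + P @ a, values))   # the spline interpolates the landmarks
print(np.allclose(P.T @ w, 0.0))            # constraints (3)-(5) hold
```

The last three rows of K are what enforce constraints (3) to (5), so they do not need to be imposed separately once the system is solved.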
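Given the correspondence between the landmarks $(x_i, y_i)$ of each subject's EMMA data and the landmarks $(x_i', y_i')$ of the template, the goal is to map each coordinate $(x_j, y_j)$ of the EMMA data to the corresponding coordinate $(\hat{x}_j, \hat{y}_j)$ in the template coordinate system. What matters is therefore the 2D point obtained by deforming the data with the thin-plate spline function defined by the many control points. To this end, the x coordinate and the y coordinate of the data are each mapped with a TPS function. The thin-plate spline warp from $(x_j, y_j)$ to $(\hat{x}_j, \hat{y}_j)$ can be derived from equation (6) and recovered through the following formula:

$$\begin{bmatrix} w_x & w_y \\ a_x & a_y \end{bmatrix} = K^{-1} \begin{bmatrix} v_x & v_y \\ 0 & 0 \end{bmatrix}$$

where K is the (n+3) × (n+3) matrix on the left-hand side of equation (6), and $v_x$ and $v_y$ are the vectors formed from the template landmark components $x_i'$ and $y_i'$, respectively. $w_x$ and $a_x$ are the parameters for the x axis, and $w_y$ and $a_y$ are the parameters for the y axis. The point $(x_j, y_j)$ is transformed to the coordinate $(\hat{x}_j, \hat{y}_j)$ by the following formula:

$$\hat{x}_j = a_{x,1} + a_{x,2}\, x_j + a_{x,3}\, y_j + \sum_{i=1}^{n} w_{x,i}\, r_{ij}^2 \ln r_{ij}^2, \qquad \hat{y}_j = a_{y,1} + a_{y,2}\, x_j + a_{y,3}\, y_j + \sum_{i=1}^{n} w_{y,i}\, r_{ij}^2 \ln r_{ij}^2,$$

with $r_{ij}^2 = (x_j - x_i)^2 + (y_j - y_i)^2$.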
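The two-axis mapping described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patent's code: the names `kernel`, `fit_warp` and `apply_warp`, the kernel form $r^2 \ln r^2$, and all coordinates are hypothetical. Both axes are solved in one call by giving the linear system a two-column right-hand side.

```python
import numpy as np

def kernel(src, dst):
    """r_ij^2 ln r_ij^2 between every row of `src` and every row of `dst`."""
    r2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    safe = np.where(r2 > 0, r2, 1.0)                 # avoids log(0); ln 1 = 0
    return r2 * np.log(safe)

def fit_warp(speaker_lm, template_lm):
    """Return (w, a): w is n x 2 and a is 3 x 2; column 0 is x, column 1 is y."""
    n = speaker_lm.shape[0]
    P = np.hstack([np.ones((n, 1)), speaker_lm])
    K = np.block([[kernel(speaker_lm, speaker_lm), P],
                  [P.T, np.zeros((3, 3))]])
    rhs = np.vstack([template_lm, np.zeros((3, 2))])
    params = np.linalg.solve(K, rhs)
    return params[:n], params[n:]

def apply_warp(points, speaker_lm, w, a):
    """Map m speaker-space points into the template coordinate system."""
    U = kernel(points, speaker_lm)                   # m x n kernel values
    P = np.hstack([np.ones((len(points), 1)), points])
    return P @ a + U @ w

# Hypothetical landmarks for one speaker and for the template.
speaker_lm = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 0.8]])
template_lm = np.array([[0.1, 0.0], [2.2, 0.1], [0.0, 1.9], [2.1, 2.2], [1.2, 1.0]])
w, a = fit_warp(speaker_lm, template_lm)

# Hypothetical articulatory samples to normalize.
trajectory = np.array([[0.5, 0.5], [1.0, 1.5], [1.7, 0.3]])
normalized = apply_warp(trajectory, speaker_lm, w, a)
print(normalized)
```

One sanity check on such a sketch: if the template landmarks are an exact affine image of the speaker landmarks, the solved weights `w` should vanish and the warp should reduce to that affine map.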
Priority Applications (1)
Application Number | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN2012101965474A | CN102799759A (en) | 2012-06-14 | 2012-06-14 | Vocal tract morphological standardization method during large-scale physiological pronunciation data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102799759A true CN102799759A (en) | 2012-11-28 |
Family
ID=47198868
Family Applications (1)
Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN2012101965474A | Pending | CN102799759A (en) | 2012-06-14 | 2012-06-14 | Vocal tract morphological standardization method during large-scale physiological pronunciation data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102799759A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102034272A (en) * | 2010-09-29 | 2011-04-27 | 浙江大学 | Generating method of individualized maxillofacial soft tissue hexahedral mesh |
Non-Patent Citations (2)
Title |
---|
JIANGUO WEI ET AL.: "《Acoustics Speech and Signal Processing(ICASSP),2010 IEEE International Conference on》", 19 March 2010 * |
LYUBOMIR ZAGORCHEV ET AL.: "A Comparative Study of Transformation Functions for Nonrigid Image Registration", 《IMAGE PROCESSING,IEEE TRANSACTIONS ON》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019034183A1 (en) * | 2017-08-17 | 2019-02-21 | 厦门快商通科技股份有限公司 | Utterance testing method and device, and speech category learning method and system |
CN108133713A (en) * | 2017-11-27 | 2018-06-08 | 苏州大学 | A kind of method that sound channel area is estimated in the case where glottis closes phase |
CN108133713B (en) * | 2017-11-27 | 2020-10-02 | 苏州大学 | Method for estimating sound channel area under glottic closed phase |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20121128 |