CN102789594B - Voice generation method based on DIVA neural network model - Google Patents
- Publication number: CN102789594B (application CN201210219670.3A)
- Authority: CN (China)
- Prior art keywords: hidden layer, neuron, candidate, speech
- Legal status: Active
Abstract
The invention discloses a voice generation method based on the DIVA neural network model, comprising voice sample extraction, voice sample classification and learning, voice output, and correction of the output voice. The classification and learning of voice samples is performed by an adaptive growing neural network (AGNN): the extracted speech formant frequencies are used to compute the number of candidate neurons in the input layer, the hidden-layer neurons are then determined from these input-layer candidates, and finally the AGNN output value is obtained and used to determine the phoneme. A neural network with this structure trains to high accuracy and learns quickly.
Description
Technical Field
The invention relates to a speech generation method, and in particular to a speech generation method based on the DIVA neural network model.
Background
With the development of artificial intelligence, research in this field continues to deepen. Controlling the generation and acquisition of speech that resembles human pronunciation is an urgent problem for robot speech systems. Speech generation and acquisition is a complex cognitive process involving many parts of the brain; it spans a hierarchy extending from the representation of sentences or phrases organized by syntax and grammar down to the production of phonemes, and a corresponding neural network model must be built from the interaction of the brain's sensory and motor areas during vocalization. The DIVA (Directions Into Velocities of Articulators) model is a mathematical model of the processing involved in speech generation and acquisition; it is mainly used to simulate and describe the functions of the brain regions involved in speech production and speech understanding. It can also be described as an adaptive neural network model that controls the movements of a simulated vocal tract in order to produce words, syllables, or phonemes. Among today's biologically grounded neural network models of speech generation and acquisition, the DIVA model is the most thoroughly defined and tested, and it is the only one that applies a pseudo-inverse control scheme.
The demand for a unified computational model of human language ability drives the development of the DIVA model. Since it was first proposed in 1994 by Guenther of the MIT speech laboratory, the model has been continuously updated, refined, and improved. The DIVA system consists of a speech channel module, a cochlea module, an auditory cortex module, an auditory cortex category perception module, a speech cell set module, a motor cortex module, a vocal tract module, a somatosensory cortex module, a sensory module, and a sensory channel module.
Analysis of the DIVA model shows that the classification method used in its auditory cortex category perception module is an RBF network. An RBF neural network depends heavily on its samples, and for a given research problem there is no general, effective algorithm or theorem for choosing a suitable number of hidden-layer nodes. In practice, the network size is determined by experience and trial and error, a tedious process that rarely finds a suitable structure. The number of hidden-layer nodes strongly affects the network's convergence speed, accuracy, and generalization ability: with too many nodes, training can still be completed but convergence slows and overfitting may occur; with too few, the network cannot learn sufficiently and fails to reach the required training accuracy. In addition, RBF neural network training is not fast enough.
Summary of the Invention
The object of the present invention is to provide a speech generation method based on the DIVA neural network model with high pronunciation accuracy and a fast learning speed.
The technical solution that achieves this object is a speech generation method based on the DIVA neural network model, comprising voice sample extraction, voice sample classification and learning, voice output, and correction of the output voice, wherein the classification and learning of voice samples is performed by an adaptive growing neural network (AGNN) as follows:
Step 1: Convert the extracted speech formant frequencies into matrix form via the Jacobian determinant; the dimension of this matrix's eigenvector is the number m of input-layer candidate neurons. Compute the fitness function value of each input-layer candidate neuron and arrange the candidates in order of increasing fitness value, so that the list of input-layer candidate fitness values is S = {Si1 ≤ Si2 ≤ … ≤ Sim}; place the candidate neurons in the corresponding order in a list X = (x1, …, xm). The fitness function is computed from yi, the actual output value; the target value; and n, the number of samples in the data set (n a natural number).
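The fitness computation of Step 1 can be sketched as a per-sample error measure. Only the variable definitions survive in this text, so the mean-squared-error form below is an assumption, not the patent's exact formula:

```python
def fitness(y_actual, y_target):
    # Fitness of a candidate neuron over n samples (lower is better).
    # y_actual holds the actual outputs y_i, y_target the target values;
    # the mean-squared-error form is an assumption based on the variable
    # definitions given in Step 1, since the formula image is not shown.
    n = len(y_actual)
    return sum((a - t) ** 2 for a, t in zip(y_actual, y_target)) / n
```

With this convention, sorting candidates by increasing fitness places the best-fitting input-layer candidate first, as Step 1 requires.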
Step 2: Initialize the number of hidden-layer neurons r = 0 and set C0 = Si1, where C0 is the fitness function value when r = 0.
Step 3: Set r = r + 1 and p = r + 1, where r indexes the r-th hidden-layer candidate neuron; generate a hidden-layer candidate neuron with p inputs.
Step 4: If r > 1, connect this hidden-layer candidate neuron to all preceding hidden-layer neurons and to input node x1; otherwise, connect it only to input node x1.
Step 5: Set the initial value of h, the position in list X of the next element to be connected to the newly added hidden-layer candidate neuron, to 2, where 2 ≤ h ≤ m and m, h are positive integers; connect the p-th input of this candidate neuron to the input node at position h in list X.
Step 6: Train this hidden-layer candidate neuron and compute its fitness function value Cr. If Cr ≥ Cr−1, go to Step 7; if Cr < Cr−1, connect the candidate into the network as the r-th hidden-layer neuron and repeat Steps 3 through 6 until adding the m-th input-layer neuron to the network no longer satisfies this condition.
Step 7: Set h = h + 1 and retrain this hidden-layer candidate neuron. If, when h = m, the condition Cr < Cr−1 is still not satisfied, end training: this candidate is irrelevant to the classification, so discard it and take the hidden-layer neuron preceding it as the output layer.
Step 8: Determine the phoneme from the output value of the output layer.
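Steps 2 through 7 amount to a cascade-style growth loop: each new hidden candidate keeps its connections to all accepted hidden neurons and x1, and tries successive input nodes from the ordered list until one lowers the fitness. A minimal control-flow sketch follows, in which `evaluate` is a hypothetical helper standing in for "train the candidate and compute Cr on the validation set" (not part of the patent text):

```python
def grow_agnn(m, evaluate):
    """Control-flow sketch of AGNN growth (Steps 2-7).

    m        : number of input-layer candidate neurons, already ordered
               by increasing fitness (so C_0 = S_i1 belongs to x_1)
    evaluate : evaluate(r, h) -> fitness C_r of the r-th hidden-layer
               candidate when its extra input is wired to the node at
               position h in list X (lower is better); a stand-in for
               training plus validation, not from the patent itself
    Returns the accepted hidden neurons as (r, h) pairs.
    """
    hidden = []
    c_prev = evaluate(0, 1)      # C_0 = S_i1: fitness with no hidden neurons
    r, h = 0, 2                  # Step 5: h starts at position 2 in X
    while True:
        r += 1                   # Step 3: new candidate with p = r + 1 inputs
        accepted = False
        while h <= m:            # Steps 6-7: try input nodes until one helps
            c_r = evaluate(r, h)
            if c_r < c_prev:     # improvement: keep this hidden neuron
                hidden.append((r, h))
                c_prev = c_r
                accepted = True
                break
            h += 1               # Step 7: retry with the next input node
        if not accepted:         # no remaining input node helps: stop growing;
            return hidden        # the previous hidden neuron becomes the output
```

Note that h persists across candidates, matching the embodiment's worked example, where z3 fails on x5 and succeeds on x12, and later candidates continue from that point in the list.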
Further, in the speech generation method of the present invention, the training of the hidden-layer candidate neuron and the computation of its fitness value Cr in Step 6 proceed as follows:
(1) Divide the data set formed by normalizing the speech formant frequencies into a training set, a validation set, and a test set, where the numbers of samples in the training and validation sets are nA and nB respectively and the division satisfies nA = nB.
(2) From the three resulting sets, compute the fitness function value Cr of the hidden-layer candidate neuron for i = 1, …, nB, where nB is the number of samples in the validation set; yB ∈ YB, with YB the target vector of the validation set; UB is the validation set's input to the hidden-layer neuron, a matrix of p × 1 vectors; and Wk−1 is the weight vector, with k the iteration count, k = 0, 1, 2, 3, …, n for a positive integer n.
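The split in (1) can be sketched as follows; the test-set share is an assumed parameter, since the patent fixes only the condition nA = nB:

```python
def split_dataset(samples, test_fraction=0.2):
    # Split normalized formant-frequency samples into training,
    # validation and test sets with equal-sized training and
    # validation sets (n_A = n_B, per step (1) of the patent);
    # test_fraction is an assumption, not stated in the patent.
    n_test = int(len(samples) * test_fraction)
    n_a = (len(samples) - n_test) // 2
    train = samples[:n_a]
    validation = samples[n_a:2 * n_a]
    test = samples[2 * n_a:]
    return train, validation, test
```

The validation set produced here is the one on which each candidate's fitness Cr is evaluated in Step 6.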
Further, in Step 8 the phoneme is determined from the output value of the output layer as follows: the output value lies in the interval from 0 to 1, and the phoneme corresponding to the AGNN output value is identified from the range assigned to each phoneme in the DIVA neural network model.
Compared with the prior art, the invention has significant advantages. The adaptive growing neural network starts learning from a single input node, adjusts neuron weights according to external rules, and gradually adds new input nodes and new hidden neurons. The constructed AGNN is therefore a narrow, deep network with close to the minimum number of input neurons, hidden neurons, and network connections, which effectively prevents overfitting while keeping the computational cost low and the learning speed high. The RBF network used in the original DIVA model classifies samples with an average accuracy of 80%, whereas the AGNN reaches over 90%. For learning samples of ordinary difficulty, the original model takes 10–13 s to classify, learn, and generate speech, while under the same conditions the system improved with the AGNN model takes only 8–10 s, i.e. 2–3 s faster. For learning samples of medium difficulty and above, the improved system performs even better under the same conditions, 4–5 s faster than the model before the improvement; the original system's classification accuracy drops to 70–75%, while the AGNN-improved system maintains a high accuracy of 90%. Applying the adaptive growing neural network to the DIVA model thus yields higher pronunciation accuracy and faster learning.
Brief Description of the Drawings
Figure 1 is a flowchart of the present invention;
Figure 2 is a structural block diagram of the DIVA neural network model;
Figure 3 is a schematic diagram of the AGNN structure used for classification in the embodiment.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings.
As shown in Figure 1, the speech generation method of the present invention based on the DIVA neural network model comprises voice sample extraction, voice sample classification and learning, voice output, and correction of the output voice, and is characterized in that the classification and learning of voice samples is performed by an adaptive growing neural network (AGNN) as follows:
Step 1: Convert the extracted speech formant frequencies into matrix form via the Jacobian determinant; the dimension of this matrix's eigenvector is the number m of input-layer candidate neurons. Compute the fitness function value of each input-layer candidate neuron and arrange the candidates in order of increasing fitness value, so that the list of input-layer candidate fitness values is S = {Si1 ≤ Si2 ≤ … ≤ Sim}; place the candidate neurons in the corresponding order in a list X = (x1, …, xm). The fitness function is computed from yi, the actual output value; the target value; and n, the number of samples in the data set (n a natural number).
Step 2: Initialize the number of hidden-layer neurons r = 0 and set C0 = Si1, where C0 is the fitness function value when r = 0.
Step 3: Set r = r + 1 and p = r + 1, where r indexes the r-th hidden-layer candidate neuron; generate a hidden-layer candidate neuron with p inputs.
Step 4: If r > 1, connect this hidden-layer candidate neuron to all preceding hidden-layer neurons and to input node x1; otherwise, connect it only to input node x1.
Step 5: Set the initial value of h, the position in list X of the next element to be connected to the newly added hidden-layer candidate neuron, to 2, where 2 ≤ h ≤ m and m, h are positive integers; connect the p-th input of this candidate neuron to the input node at position h in list X.
Step 6: Train this hidden-layer candidate neuron and compute its fitness function value Cr. If Cr ≥ Cr−1, go to Step 7; if Cr < Cr−1, connect the candidate into the network as the r-th hidden-layer neuron and repeat Steps 3 through 6 until adding the m-th input-layer neuron to the network no longer satisfies this condition. The fitness value Cr is computed as follows:
(1) Divide the data set formed by normalizing the speech formant frequencies into a training set, a validation set, and a test set, where the numbers of samples in the training and validation sets are nA and nB respectively and the division satisfies nA = nB.
(2) From the three resulting sets, compute the fitness function value Cr of the hidden-layer candidate neuron for i = 1, …, nB, where nB is the number of samples in the validation set; yB ∈ YB, with YB the target vector of the validation set; UB is the validation set's input to the hidden-layer neuron, a matrix of p × 1 vectors; and Wk−1 is the weight vector, with k the iteration count, k = 0, 1, 2, 3, …, n for a positive integer n. The higher the required training accuracy, the larger the iteration count k.
Step 7: Set h = h + 1 and retrain this hidden-layer candidate neuron. If, when h = m, the condition Cr < Cr−1 is still not satisfied, end training: this candidate is irrelevant to the classification, so discard it and take the hidden-layer neuron preceding it as the output layer.
Step 8: Determine the phoneme from the output value of the output layer. The output value lies in the interval from 0 to 1, and the phoneme corresponding to the AGNN output value is identified from the range assigned to each phoneme in the DIVA neural network model.
Embodiment
As shown in Figure 2, in this embodiment the speech captured by a pronunciation device such as a microphone first passes through the speech channel module, which, with a given delay, sends the formant frequencies of the speech to the cochlea module in vector form. The cochlea module computes the cochlear representation (spectrum) of the speech and sends the formant frequencies to the auditory cortex module. The auditory cortex module passes the speech, represented by the formant frequencies from the cochlea module, to the auditory cortex category perception module. On receiving the speech, the category perception module divides it into phonemes, the basic units of speech; the initialized output phoneme targets travel through the speech cell set module to the auditory cortex and somatosensory cortex modules, forming the auditory and somatosensory targets respectively. The category perception module recognizes speech fragments by comparing them with the stored phoneme representations, each of which is a numeric range between 0 and 1 stored in the speech cell set module. Specifically, the recognition proceeds as follows: the category perception module matches each divided phoneme (i.e., the AGNN output value) one by one against the phoneme representations in the speech cell set; if no matching representation is found, the phoneme has not yet been trained and learned, and the speech cell set module creates a new phoneme representation in a specific area to represent the current phoneme. The phoneme targets output by the category perception module and the speech cell sets are in one-to-one correspondence.
Afterwards, the speech cell set module initiates the production of the phoneme fragment, sending the index of the phoneme target to be produced to the motor cortex, auditory cortex, and somatosensory cortex modules. On receiving the phoneme target index from the speech cell set module, the motor cortex sends control commands to the vocal tract module; the vocal tract module computes the vocal tract parameters for the received commands and sends them to a loudspeaker device to produce the corresponding speech, while also transmitting the computed auditory effect and parameter configuration through the speech channel and the sensory channel to the cochlea module and the sensory module, respectively, to form feedback. After receiving the vocal tract configuration, sent in vector form over the sensory channel, the sensory module computes the somatosensory results associated with that configuration and sends them to the somatosensory cortex module. The somatosensory cortex module then computes the difference between the cortical representations of the incoming somatosensory input and the somatosensory target, and passes the somatosensory error to the motor cortex module to correct the generated speech.
After the cochlea module receives, over the speech channel, the formant frequencies (in vector form) of the speech produced by the vocal tract module, it passes them to the auditory cortex module, which computes the difference between the cortical representation of that speech and its target and passes the error to the motor cortex module to correct the generated speech.
The existing DIVA neural network model stores 29 phoneme representations in the speech cell set module, each with a corresponding value range (tabulated in the original specification). The AGNN classification result is a single numeric value, and different values represent different phonemes: a value falling within a given numeric interval denotes a specific phoneme.
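The interval lookup can be sketched as follows. The phoneme labels and the uniform interval widths are placeholders, since the patent's table assigns each of the 29 phonemes its own range:

```python
# Placeholder labels for the 29 stored phoneme representations;
# the actual labels and per-phoneme ranges come from the patent's table.
PHONEMES = ["phoneme_%d" % i for i in range(29)]

def phoneme_for(output_value, n_phonemes=29):
    # Map an AGNN output in [0, 1] to a phoneme by interval membership.
    # Uniform intervals are an assumption; the DIVA model stores each
    # phoneme's own numeric range in the speech cell set module.
    idx = min(int(output_value * n_phonemes), n_phonemes - 1)
    return PHONEMES[idx]
```

A value of exactly 1.0 is clamped into the last interval so every output in [0, 1] maps to some phoneme.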
As shown in Figure 3, the learning rate is taken as η = 1.9 with Δ = 0.0015, and the initial weights are drawn from a normal distribution.
From the input data set X, the dimension of its eigenvector, i.e. the number of input-layer candidate neurons, is m = 8. Using the fitness formula, where yi is the actual output value, the target value is as defined above, and n is the number of samples in the data set, compute the fitness value of every element of X, arrange them in order of increasing fitness, and select the first eight as candidate neurons: x8, x5, x12, x16, x24, x27, x19, x23. The first input neuron, x8, has the smallest fitness value, which is recorded as C0.
增加一个隐层候选神经元z1,则其有2个输入。把它的两个输入与输入层候选神经元x8和x5相连,训练此隐层候选神经元然后计算z1的适应度函数C1,此时将C1与C0相比,有C1<C0,这时把z1加入到网络中作为第1个隐层神经元。再增加一个隐层候选神经元z2,则其有3个输入。将其前2个输入与前面的隐层神经元z1及x8相连,第3个输入连接到x5,训练此候选隐层神经元并计算它的适应度函数C2,将C2与C1比较有C2<C1,把z2加入到网络中作为第2个隐层神经元。加入隐层候选神经元z3,则其有4个输入。把前3个输入连接到隐层神经元z1和z2及输入层神经元x8上,第4个输入连接到x5上,训练此候选神经元并计算其适应度函数但此时计算的适应度函数值小于C2,把第4个输入连接到x12上,训练此候选神经元并计算此时的适应度函数值C3,有C3<C2,把z3加入到网络中作为第3个隐层神经元。把z4加入到网络中作为隐层候选神经元,有5个输入,把前4个输入连接到z1~z3和x8上,将第5个输入连接到x12上,训练此候选神经元并计算其适应度函数但小于C3,将第5个输入连接到x16上,训练此候选神经元并计算其适应度函数C4,因为C4<C3,把z4加入到网络中作为第4个隐层神经元。接着再把z5加入到网络中作为隐层候选神经元,它有6个输入,把前5个输入连接到z1~z4和x8上,将第6个输入连接到x16上,训练该候选神经元并计算其适应度函数C5,由于C5<C4将z5加入到网络中作为第5个隐层神经元。继续添加z6作为隐层候选神经元,其有7个输入,将前6个输入连接到z1~z5和x8上,将第7个输入连接到x16上,训练此候选神经元并计算其适应度函数但小于C5,将第7个输入连接到x24上,训练该候选神经元并计算其适应度函数C6,因为C6<C5,把z6加入到网络中作为第6个隐层神经元。再把z7连接到网络中作为隐层候选神经元,则有8个输入,将前7个输入连接到z1~z6和x8上,将第8个输入连接到x24上,训练此候选神经元并计算其适应度函数C7,C7<C6,把z7加入到网络中作为第7个隐层神经元。再把z8加入到网络中作为隐层候选神经元,其有9个输入,将前8个输入连接到z1~z7和x8上,并将第9个输入连接到x24上,训练此隐层候选神经元并计算其适应度函数但小于C7,将第9个输入连接到x27上,训练该隐层候选神经元并计算其适应度函数C8,因为C8<C7,将z8加入到网络中作为第8个隐层神经元。下面再将z9加入网络中作为隐层候选神经元,其有10个输入,把前9个输入连接到z1~z8和x8上,将第10输入连接到x27,训练此隐层候选神经元并计算其适应度函数C9,由于C9<C8,将z9加入到网络中作为第9个隐层神经元。继续添加z10,作为隐层候选神经元,它有11个输入,把前10个输入连接到z1~z9和x8上,将第11个输入连接到x27,训练此隐层候选神经元并计算其适应度函数但小于C9,则将第11个输入连接到x19上,训练此隐层候选神经元并计算其适应度函数C10,因为C10<C9,把z10加入到网络中作为第10个隐层神经元。接着添加z11作为隐层候选神经元,有12个输入,将前11个输入连接到z1~z10和上x8上,将第12个输入连接到x19上,训练此隐层候选神经元并计算其适应度函数C11,因为C11<C10,把z11加入到网络中。添加z12作为隐层候选神经元,有13个输入,将其前12个输入连接到z1~z11和上x8上,将第13个输入连接到x19上,训练此隐层候选神经元并计算其适应度函数但小于C11,则将第13个输入连接到x23上,训练此隐层候选神经元并计算其适应度函数C12,因为C12<C11,故把z12加入到网络中作为隐层神经元。添加z13作为隐层候选神经元,有14个输入,将其前13个输入连接到z1~z12和上x8上,将第14个输入连接到x23上,训练此隐层候选神经元并计算其适应度函数C13<C12,把z13加入到网络中作为隐层神经元。添加z14作为隐层候选神经元,有15个输入,把前14个输入连接到z1~z13和上x8上,将第15个输入连接到x23上,训练此隐层候选神经元并计算其适应度函数但小于C13,此时下面已没有候选神经元可连,舍弃z14,把z13作为输出神经元。Add a hidden layer candidate neuron z 1 , which has 2 inputs. 
Connect its two inputs to the input layer candidate neurons x 8 and x 5 , train the hidden layer candidate neurons and then calculate the fitness function C 1 of z 1 , and compare C 1 with C 0 at this time, there is C 1 <C 0 , then add z 1 to the network as the first hidden layer neuron. Add another hidden layer candidate neuron z 2 , which has 3 inputs. Connect its first 2 inputs to the previous hidden layer neurons z 1 and x 8 , connect the third input to x 5 , train this candidate hidden layer neuron and calculate its fitness function C 2 , connect C 2 with C 1 compares with C 2 <C 1 , and z 2 is added to the network as the second hidden layer neuron. Add candidate neuron z 3 of the hidden layer, and it has 4 inputs. Connect the first 3 inputs to hidden layer neurons z 1 and z 2 and input layer neuron x 8 , and connect the fourth input to x 5 , train this candidate neuron and calculate its fitness function but at this time calculate The fitness function value is less than C 2 , connect the fourth input to x 12 , train the candidate neuron and calculate the fitness function value C 3 at this time, if C 3 <C 2 , add z 3 to the network as the third hidden layer neuron. Add z 4 to the network as a hidden layer candidate neuron with 5 inputs, connect the first 4 inputs to z 1 ~ z 3 and x 8 , connect the fifth input to x 12 , and train this candidate neuron and calculate its fitness function but less than C 3 , connect the 5th input to x 16 , train this candidate neuron and calculate its fitness function C 4 , since C 4 < C 3 , add z 4 to In the network as the fourth hidden layer neuron. Then add z 5 to the network as a hidden layer candidate neuron, which has 6 inputs, connect the first 5 inputs to z 1 ~ z 4 and x 8 , connect the sixth input to x 16 , Train the candidate neuron and calculate its fitness function C 5 , since C 5 < C 4 , add z 5 into the network as the fifth hidden layer neuron. 
Continue to add z 6 as a hidden layer candidate neuron, which has 7 inputs, connect the first 6 inputs to z 1 ~ z 5 and x 8 , connect the 7th input to x 16 , and train this candidate neuron And calculate its fitness function but less than C 5 , connect the 7th input to x 24 , train the candidate neuron and calculate its fitness function C 6 , since C 6 < C 5 , add z 6 to the network as the sixth hidden layer neuron. Then connect z 7 to the network as a hidden layer candidate neuron, then there are 8 inputs, connect the first 7 inputs to z 1 ~ z 6 and x 8 , connect the 8th input to x 24 , and train This candidate neuron calculates its fitness function C 7 , C 7 <C 6 , and adds z7 into the network as the seventh hidden layer neuron. Then add z 8 to the network as a hidden layer candidate neuron, which has 9 inputs, connect the first 8 inputs to z 1 ~ z 7 and x 8 , and connect the ninth input to x 24 , Train this hidden layer candidate neuron and calculate its fitness function but less than C 7 , connect the 9th input to x 27 , train this hidden layer candidate neuron and calculate its fitness function C 8 , because C 8 <C 7 , add z 8 to the network as the eighth hidden layer neuron. Next, add z 9 to the network as a hidden layer candidate neuron, which has 10 inputs, connect the first 9 inputs to z 1 ~ z 8 and x 8 , connect the tenth input to x 27 , and train the hidden layer layer candidate neuron and calculate its fitness function C 9 , since C 9 <C 8 , add z 9 to the network as the ninth hidden layer neuron. 
Continue by adding z10 as a hidden layer candidate neuron with 11 inputs: connect the first 10 inputs to z1~z9 and x8, and connect the 11th input to x27. Train this hidden layer candidate neuron and calculate its fitness function, but the value is not less than C9, so reconnect the 11th input to x19, retrain, and calculate the fitness function C10; since C10 < C9, z10 is added to the network as the 10th hidden layer neuron. Then add z11 as a hidden layer candidate neuron with 12 inputs: connect the first 11 inputs to z1~z10 and x8, and connect the 12th input to x19. Train this hidden layer candidate neuron and calculate its fitness function C11; since C11 < C10, z11 is added to the network as the 11th hidden layer neuron. Add z12 as a hidden layer candidate neuron with 13 inputs: connect its first 12 inputs to z1~z11 and x8, and connect the 13th input to x19. Train this hidden layer candidate neuron and calculate its fitness function, but the value is not less than C11, so reconnect the 13th input to x23, retrain, and calculate the fitness function C12; since C12 < C11, z12 is added to the network as the 12th hidden layer neuron. Add z13 as a hidden layer candidate neuron with 14 inputs: connect its first 13 inputs to z1~z12 and x8, and connect the 14th input to x23. Train this hidden layer candidate neuron and calculate its fitness function C13; since C13 < C12, z13 is added to the network. Add z14 as a hidden layer candidate neuron with 15 inputs: connect the first 14 inputs to z1~z13 and x8, and connect the 15th input to x23. Train this hidden layer candidate neuron and calculate its fitness function, but the value is not less than C13, and at this point there are no remaining candidate inputs to connect, so z14 is discarded and z13 is used as the output neuron.
The network thus selects 8 input features, 12 hidden layer neurons and 1 output neuron. The first hidden layer neuron is connected to input nodes x8 and x5. The inputs of the output neuron are connected to the outputs z1~z12 of the hidden layer neurons and to the input nodes x8 and x23.
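The growth procedure above (train one candidate neuron at a time, keep it only if the fitness cost strictly improves, and move its trailing input to the next feature once the current one stops helping) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the training of each candidate is abstracted behind a hypothetical `train_and_score` callback that returns a scalar fitness cost, assumed here to be lower-is-better, consistent with the acceptance rule C_new < C_prev used in the walkthrough.

```python
def grow_network(base_inputs, feature_pool, train_and_score, c0):
    """Greedy constructive growth, as in the walkthrough above (sketch).

    base_inputs     -- features every candidate sees (here ['x8'])
    feature_pool    -- ordered trial features for the candidate's last input
                       (here ['x5', 'x12', 'x16', 'x24', 'x27', 'x19', 'x23'])
    train_and_score -- hypothetical callable(hidden, inputs) -> fitness cost;
                       stands in for training one candidate neuron
    c0              -- fitness of the network before any hidden neuron is added
    """
    hidden = []        # accepted neurons, recorded as (name, trailing feature)
    best = c0
    feat_idx = 0       # index of the trailing feature currently being tried
    while True:
        accepted = False
        while feat_idx < len(feature_pool):
            trailing = feature_pool[feat_idx]
            cost = train_and_score(hidden, base_inputs + [trailing])
            if cost < best:                 # strict improvement: keep candidate
                hidden.append((f"z{len(hidden) + 1}", trailing))
                best = cost
                accepted = True
                break
            feat_idx += 1                   # feature no longer helps: try next
        if not accepted:
            break      # feature pool exhausted: discard candidate, stop growing
    return hidden, best
```

In the walkthrough, this loop terminates when the z14 candidate fails on x23 with no features left, leaving z1~z13 accepted; the last accepted neuron is then repurposed as the output neuron.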
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210219670.3A CN102789594B (en) | 2012-06-28 | 2012-06-28 | Voice generation method based on DIVA neural network model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210219670.3A CN102789594B (en) | 2012-06-28 | 2012-06-28 | Voice generation method based on DIVA neural network model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102789594A CN102789594A (en) | 2012-11-21 |
CN102789594B true CN102789594B (en) | 2014-08-13 |
Family
ID=47154995
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210219670.3A Active CN102789594B (en) | 2012-06-28 | 2012-06-28 | Voice generation method based on DIVA neural network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102789594B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110119810B (en) * | 2019-03-29 | 2023-05-16 | 华东师范大学 | A Neural Network Based Dependency Analysis Method of Human Behavior |
CN112861988B (en) * | 2021-03-04 | 2022-03-11 | 西南科技大学 | A Feature Matching Method Based on Attention Graph Neural Network |
CN115565540B (en) * | 2022-12-05 | 2023-04-07 | 浙江大学 | Invasive brain-computer interface Chinese pronunciation decoding method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101650945A (en) * | 2009-09-17 | 2010-02-17 | 浙江工业大学 | Method for recognizing speaker based on multivariate core logistic regression model |
CN102201236A (en) * | 2011-04-06 | 2011-09-28 | 中国人民解放军理工大学 | Speaker recognition method combining Gaussian mixture model and quantum neural network |
CN102222501A (en) * | 2011-06-15 | 2011-10-19 | 中国科学院自动化研究所 | Method for generating duration parameter in speech synthesis |
- 2012-06-28 CN CN201210219670.3A patent/CN102789594B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101650945A (en) * | 2009-09-17 | 2010-02-17 | 浙江工业大学 | Method for recognizing speaker based on multivariate core logistic regression model |
CN102201236A (en) * | 2011-04-06 | 2011-09-28 | 中国人民解放军理工大学 | Speaker recognition method combining Gaussian mixture model and quantum neural network |
CN102222501A (en) * | 2011-06-15 | 2011-10-19 | 中国科学院自动化研究所 | Method for generating duration parameter in speech synthesis |
Non-Patent Citations (5)
Title |
---|
Frank H. Guenther; "A neural network model of speech acquisition and motor equivalent speech production"; Biological Cybernetics; 1994; vol. 72, no. 1; pp. 43-53 * |
J.S. Brumberg et al.; "Brain-computer interfaces for speech communication"; Speech Communication; 2010; vol. 52, no. 2; pp. 367-379 * |
Zhang Shaobai et al.; "A new cerebellar model construction method for the DIVA model"; Proceedings of the 2009 Chinese Control and Decision Conference; 2009; pp. 954-959 * |
Liu Yanyan et al.; "Research on the influence of speech rate on speech generation in the DIVA model"; Computer Technology and Development; 2011; vol. 21, no. 12; pp. 33-35, 40 * |
Zhang Xin et al.; "Application of an improved pseudo-inverse control scheme in the DIVA model"; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition); 2012; vol. 32, no. 3; pp. 81-85 * |
Also Published As
Publication number | Publication date |
---|---|
CN102789594A (en) | 2012-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105741832B (en) | A method and system for spoken language evaluation based on deep learning | |
Garcia et al. | Mel-frequency cepstrum coefficients extraction from infant cry for classification of normal and pathological cry with feed-forward neural networks | |
Al Smadi et al. | Artificial intelligence for speech recognition based on neural networks | |
Xie et al. | Sequence error (SE) minimization training of neural network for voice conversion. | |
CN105139864A (en) | Voice recognition method and voice recognition device | |
CN109671423B (en) | Non-parallel text-to-speech conversion method under limited training data | |
Liu et al. | A fault diagnosis intelligent algorithm based on improved BP neural network | |
CN105787557A (en) | Design method of deep nerve network structure for computer intelligent identification | |
CN105023570A (en) | method and system of transforming speech | |
JP2018511870A (en) | Big data processing method for segment-based two-stage deep learning model | |
CN104091602A (en) | Speech emotion recognition method based on fuzzy support vector machine | |
CN108109615A (en) | A kind of construction and application method of the Mongol acoustic model based on DNN | |
CN102789594B (en) | Voice generation method based on DIVA neural network model | |
CN107293290A (en) | The method and apparatus for setting up Speech acoustics model | |
CN104077598A (en) | Emotion recognition method based on speech fuzzy clustering | |
Sharan et al. | Voice command recognition using biologically inspired time-frequency representation and convolutional neural networks | |
CN114299925A (en) | Method and system for obtaining importance measurement index of dysphagia symptom of Parkinson disease patient based on voice | |
Le et al. | Data selection for acoustic emotion recognition: Analyzing and comparing utterance and sub-utterance selection strategies | |
CN108538301A (en) | A kind of intelligent digital musical instrument based on neural network Audiotechnica | |
CN103310273A (en) | Method for articulating Chinese vowels with tones and based on DIVA model | |
Chen et al. | Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion | |
Wingfield et al. | On the similarities of representations in artificial and brain neural networks for speech recognition | |
CN103310272B (en) | Based on the DIVA neural network model manner of articulation that sound channel action knowledge base is improved | |
Zhang et al. | Baby cry recognition by BCRNet using transfer learning and deep feature fusion | |
Han | An Improved Classification Model for English Syntax Error Correction Design of DL Algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 20121121 Assignee: Jiangsu Nanyou IOT Technology Park Ltd. Assignor: NANJING University OF POSTS AND TELECOMMUNICATIONS Contract record no.: 2016320000207 Denomination of invention: Voice generation method based on DIVA neural network model Granted publication date: 20140813 License type: Common License Record date: 20161109 |
|
LICC | Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model | ||
EC01 | Cancellation of recordation of patent licensing contract | ||
EC01 | Cancellation of recordation of patent licensing contract |
Assignee: Jiangsu Nanyou IOT Technology Park Ltd. Assignor: NANJING University OF POSTS AND TELECOMMUNICATIONS Contract record no.: 2016320000207 Date of cancellation: 20180116 |
|
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 20180517 Address after: 510030 Guangzhou, Guangdong, Yuexiu District Beijing Road No. 374, two 1101, 1102 rooms (for office use only). Patentee after: GUANGZHOU ZIB ARTIFICIAL INTELLIGENCE TECHNOLOGY CO.,LTD. Address before: 510000 B1B2, one, two, three and four floors of the podium building 231 and 233, science Avenue, Guangzhou, Guangdong. Patentee before: BOAO ZONGHENG NETWORK TECHNOLOGY Co.,Ltd. Effective date of registration: 20180517 Address after: 510000 B1B2, one, two, three and four floors of the podium building 231 and 233, science Avenue, Guangzhou, Guangdong. Patentee after: BOAO ZONGHENG NETWORK TECHNOLOGY Co.,Ltd. Address before: 210003 new model road 66, Gulou District, Nanjing, Jiangsu Patentee before: NANJING University OF POSTS AND TELECOMMUNICATIONS |