CN105895105A - Speech processing method and device - Google Patents
Speech processing method and device
- Publication number
- CN105895105A CN105895105A CN201610394300.1A CN201610394300A CN105895105A CN 105895105 A CN105895105 A CN 105895105A CN 201610394300 A CN201610394300 A CN 201610394300A CN 105895105 A CN105895105 A CN 105895105A
- Authority
- CN
- China
- Prior art keywords
- age
- model
- range
- speech
- speech processes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/14—Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a speech processing method and a speech processing device. The method comprises the following steps: receiving speech information input by a user; performing voiceprint recognition on the speech information and determining the user's age according to the recognition result; determining the target age range to which the user's age belongs; determining the target speech processing model corresponding to the target age range; and processing the speech information using the target speech processing model. With this technical scheme, the user's age is determined from the input speech information, and the corresponding target speech processing model is then selected according to that age, so that different speech processing models can be set for different age groups and the speech information of each age group receives targeted processing. The processing effect is therefore better, the accuracy of speech processing is enhanced, and the user experience is improved.
Description
Technical field
The present invention relates to the field of speech processing technology, and in particular to a speech processing method and device.
Background art
Speech recognition is a cross-disciplinary field. Over the past two decades, speech recognition technology has made marked progress and has begun to move from the laboratory to the market. It is expected that within the next ten years speech recognition technology will enter industry, household appliances, communications, automotive electronics, medical care, home services, consumer electronics and many other fields. The application of speech-recognition dictation machines in certain fields was named by the US press as one of the ten major events in computer development in 1997, and many experts regarded speech recognition as one of the ten most important technological developments in information technology between 2000 and 2010. The fields involved in speech recognition technology include signal processing, pattern recognition, probability and information theory, the mechanisms of sound production and hearing, and artificial intelligence.
Summary of the invention
The embodiments of the present invention provide a speech processing method and device, so as to improve the success rate and accuracy of semantic analysis while guaranteeing the accuracy of speech processing, thereby enhancing the user experience.
According to a first aspect of the embodiments of the present invention, a speech processing method is provided, comprising:
receiving speech information input by a user;
performing voiceprint recognition on the speech information, and determining the age of the user according to the recognition result;
determining the target age range to which the age of the user belongs;
determining the target speech processing model corresponding to the target age range;
processing the speech information using the target speech processing model.
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, determining the target speech processing model corresponding to the target age range comprises:
determining the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
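The preset correspondence between age ranges and speech processing models can be sketched as a simple lookup table. The range bounds below follow the example brackets given later in the description (adults 11 and above, children 3-10, infants 1-3); the model names, the upper cap on the adult bracket and the half-open boundary handling are illustrative assumptions, not specified by the patent.

```python
# Illustrative sketch of the preset age-range -> speech-processing-model
# correspondence. Model names and boundary handling are assumptions.
AGE_RANGE_MODELS = [
    (11, 200, "first_model"),   # first age range: adults, 11 and above
    (3, 11, "second_model"),    # second age range: children, 3-10
    (1, 3, "third_model"),      # third age range: infants, 1-3
]

def select_model(age: int) -> str:
    """Return the target speech processing model for the given user age."""
    for low, high, model in AGE_RANGE_MODELS:
        if low <= age < high:
            return model
    raise ValueError(f"no preset age range covers age {age}")
```

A table-driven lookup like this keeps the "preset correspondence" in one place, so adding or re-tuning an age bracket does not touch the selection logic.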
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
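The three-tier composition described above can be summarized in a small table. The component names and matching-degree labels are assumptions for illustration; the patent states only that older age groups use higher-matching-degree components and that the infant model is acoustic-only.

```python
# Sketch of the per-age-group model composition described above.
# None means the component is absent (the infant model recognizes sound only,
# not words or meaning).
MODEL_COMPOSITION = {
    "adult":  {"acoustic": "high", "language": "high", "semantic": "high"},
    "child":  {"acoustic": "high", "language": "high", "semantic": "medium"},
    "infant": {"acoustic": "low",  "language": None,   "semantic": None},
}

def active_components(group: str) -> list:
    """List the sub-models that a given age group's processing model uses."""
    return [name for name, degree in MODEL_COMPOSITION[group].items() if degree]
```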
According to a second aspect of the embodiments of the present invention, a speech processing device is provided, comprising:
a receiving module, configured to receive speech information input by a user;
a first determining module, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module, configured to determine the target age range to which the age of the user belongs;
a second determining module, configured to determine the target speech processing model corresponding to the target age range;
a processing module, configured to process the speech information using the target speech processing model.
In one embodiment, the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Other features and advantages of the present invention will be set forth in the following description, will in part be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention can be realized and attained by the structure particularly pointed out in the written description, the claims and the accompanying drawings.
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart of a speech processing method according to an exemplary embodiment.
Fig. 2 is a flow chart of step S104 of a speech processing method according to an exemplary embodiment.
Fig. 3 is a block diagram of a speech processing device according to an exemplary embodiment.
Detailed description of the embodiments
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. In the following description, when the drawings are referred to, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with some aspects of the invention as detailed in the appended claims.
Fig. 1 is a flow chart of a speech processing method according to an exemplary embodiment. The speech processing method is applied in a terminal device, which may be any device with speech processing capability, such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, medical equipment, fitness equipment or a personal digital assistant. As shown in Fig. 1, the method comprises steps S101-S105:
In step S101, speech information input by a user is received;
In step S102, voiceprint recognition is performed on the speech information, and the age of the user is determined according to the recognition result;
A so-called voiceprint is the spectrum of sound waves carrying verbal information, as displayed by an electro-acoustic instrument. The production of human speech is a complex physiological and physical process between the body's language centers and the vocal organs. The vocal organs used in speech — the tongue, teeth, larynx, lungs and nasal cavity — differ widely from person to person in size and form, so the voiceprint spectra of any two people always differ. Each person's acoustic characteristics have both relative stability and variability; they are not absolute and unchangeable. This variation may come from physiology, pathology, psychology, imitation or disguise, and is also related to environmental interference. Nevertheless, since no two people have exactly the same vocal organs, under ordinary circumstances it is still possible to distinguish the voices of different people, or to judge whether two utterances come from the same person.
By performing voiceprint recognition on the speech information, specific characteristics of the user, such as age and gender, can therefore be identified.
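The patent does not specify how voiceprint recognition infers age, so the sketch below only illustrates the common shape of such a pipeline: derive a fixed-length feature vector from the waveform, then apply a pre-trained classifier. The `extract_features` statistics and the classifier interface are assumptions, not the patent's method.

```python
# Hedged sketch of step S102: voiceprint features -> estimated age.
# The feature choice and classifier are placeholders for illustration only.
import numpy as np

def extract_features(samples: np.ndarray) -> np.ndarray:
    """Placeholder feature extractor: crude spectral statistics of the signal."""
    spectrum = np.abs(np.fft.rfft(samples))  # magnitude spectrum of the waveform
    return np.array([spectrum.mean(), spectrum.std(), samples.std()])

def estimate_age(samples: np.ndarray, classifier) -> int:
    """Map the feature vector to an age using any object with a .predict()."""
    features = extract_features(samples).reshape(1, -1)
    return int(classifier.predict(features)[0])
```

In practice the classifier would be trained offline on labeled speech; here it is injected so the selection logic stays independent of the recognition backend.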
In step S103, the target age range to which the age of the user belongs is determined;
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be an adult bracket of 11 years and above, the second age range may be a child bracket of 3-10 years, and the third age range may be an infant bracket of 1-3 years. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better.
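Steps S101-S105 can be chained into a single sketch. The bracket boundaries follow the example brackets above (adults 11 and above, children 3-10, infants 1-3); the age recognizer and the per-bracket models are hypothetical placeholders supplied by the caller.

```python
# End-to-end sketch of steps S101-S105 with the example age brackets above.
# `recognize_age` and `models_by_range` are hypothetical placeholders.
def process_speech(voice_info, recognize_age, models_by_range):
    age = recognize_age(voice_info)       # S102: voiceprint recognition -> age
    if age >= 11:                         # S103: judge the target age range
        bracket = "adult"
    elif age >= 3:
        bracket = "child"
    else:
        bracket = "infant"
    model = models_by_range[bracket]      # S104: target speech processing model
    return model(voice_info)              # S105: process the speech with it
```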
In step S104, the target speech processing model corresponding to the target age range is determined;
In step S105, the speech information is processed using the target speech processing model.
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
Fig. 2 is a flow chart of step S104 of a speech processing method according to an exemplary embodiment. As shown in Fig. 2, in one embodiment, the above step S104 includes step S201:
In step S201, the target speech processing model corresponding to the target age range is determined according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
The following are device embodiments of the present invention, which may be used to carry out the method embodiments of the invention.
Fig. 3 is a block diagram of a speech processing device according to an exemplary embodiment. The device may be implemented as part or all of a terminal device by software, hardware, or a combination of both. As shown in Fig. 3, the speech processing device includes:
a receiving module 31, configured to receive speech information input by a user;
a first determining module 32, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module 33, configured to determine the target age range to which the age of the user belongs;
a second determining module 34, configured to determine the target speech processing model corresponding to the target age range;
a processing module 35, configured to process the speech information using the target speech processing model.
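The module structure of Fig. 3 maps naturally onto a small class. The rendering below is one hypothetical software implementation: the age recognizer, the range judgment and the model table are injected placeholders, and the numbered comments mirror modules 31-35.

```python
# Minimal sketch of the apparatus of Fig. 3. All internals are placeholder
# callables supplied by the caller; comments mirror modules 31-35.
class SpeechProcessingDevice:
    def __init__(self, recognize_age, judge_range, models):
        self.recognize_age = recognize_age   # backs first determining module 32
        self.judge_range = judge_range       # backs judging module 33
        self.models = models                 # backs second determining module 34

    def handle(self, voice_info):
        """Receiving module 31 delivers voice_info; returns module 35's output."""
        age = self.recognize_age(voice_info)   # module 32: voiceprint -> age
        target_range = self.judge_range(age)   # module 33: target age range
        model = self.models[target_range]      # module 34: target model
        return model(voice_info)               # module 35: process the speech
```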
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be an adult bracket of 11 years and above, the second age range may be a child bracket of 3-10 years, and the third age range may be an infant bracket of 1-3 years. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flow charts and/or block diagrams of methods, devices (systems) and computer program products according to the embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific way, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that realize the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.
Claims (10)
1. A speech processing method, characterized by comprising:
receiving speech information input by a user;
performing voiceprint recognition on the speech information, and determining the age of the user according to the recognition result;
determining the target age range to which the age of the user belongs;
determining the target speech processing model corresponding to the target age range;
processing the speech information using the target speech processing model.
2. The method according to claim 1, characterized in that determining the target speech processing model corresponding to the target age range comprises:
determining the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
3. The method according to claim 1, characterized in that the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
4. The method according to claim 3, characterized in that the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
5. The method according to any one of claims 2 to 4, characterized in that the age range is positively correlated with the matching degree of the corresponding speech processing model.
6. A speech processing device, characterized by comprising:
a receiving module, configured to receive speech information input by a user;
a first determining module, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module, configured to determine the target age range to which the age of the user belongs;
a second determining module, configured to determine the target speech processing model corresponding to the target age range;
a processing module, configured to process the speech information using the target speech processing model.
7. The device according to claim 6, characterized in that the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
8. The device according to claim 6, characterized in that the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
9. The apparatus according to claim 8, characterized in that the first speech processing model comprises a first speech model and a first semantic model, the second speech processing model comprises a second speech model and a second semantic model, and the third speech processing model comprises a third speech model.
10. The apparatus according to any one of claims 7 to 9, characterized in that the age range is positively correlated with the matching degree of the corresponding speech processing model.
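The selection logic of claims 6 to 9 can be sketched in code: estimate the user's age, find the preset age range it falls in, and pick the speech processing model associated with that range. This is an illustrative sketch only, not code from the patent; all names and the age boundaries below are invented for illustration (the patent does not fix concrete boundary values).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SpeechProcessingModel:
    name: str
    has_semantic_model: bool  # per claim 9, the third model carries no semantic model

# Preset correspondence between age ranges and speech processing models (claim 7).
# Boundary values are hypothetical; ranges are ordered oldest-first as in claim 8.
AGE_RANGE_MODELS: List[Tuple[Tuple[int, int], SpeechProcessingModel]] = [
    ((19, 120), SpeechProcessingModel("first", True)),   # first (oldest) age range
    ((13, 18),  SpeechProcessingModel("second", True)),  # second age range
    ((0, 12),   SpeechProcessingModel("third", False)),  # third (youngest) age range
]

def select_target_model(age: int) -> Optional[SpeechProcessingModel]:
    """Return the model whose preset age range contains the estimated age."""
    for (low, high), model in AGE_RANGE_MODELS:
        if low <= age <= high:
            return model
    return None  # no preset range matched
```

In this sketch the age passed to `select_target_model` would come from the voiceprint-recognition step of claim 6; the returned model is then used to process the voice information.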
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610394300.1A CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610394300.1A CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105895105A true CN105895105A (en) | 2016-08-24 |
CN105895105B CN105895105B (en) | 2020-05-05 |
Family
ID=56710682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610394300.1A Active CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105895105B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101390155A (en) * | 2006-02-21 | 2009-03-18 | 索尼电脑娱乐公司 | Voice recognition with speaker adaptation and registration with pitch |
KR20100003672A (en) * | 2008-07-01 | 2010-01-11 | (주)디유넷 | Speech recognition apparatus and method using visual information |
CN101944359A (en) * | 2010-07-23 | 2011-01-12 | 杭州网豆数字技术有限公司 | Speech recognition method for specific groups of people |
CN103024530A (en) * | 2012-12-18 | 2013-04-03 | 天津三星电子有限公司 | Intelligent television voice response system and method |
CN103236259A (en) * | 2013-03-22 | 2013-08-07 | 乐金电子研发中心(上海)有限公司 | Voice recognition processing and feedback system, voice response method |
CN105306815A (en) * | 2015-09-30 | 2016-02-03 | 努比亚技术有限公司 | Shooting mode switching device, method and mobile terminal |
CN105489221A (en) * | 2015-12-02 | 2016-04-13 | 北京云知声信息技术有限公司 | Voice recognition method and device |
Non-Patent Citations (4)
Title |
---|
MING LI et al.: "Automatic speaker age and gender recognition using acoustic and prosodic level information fusion", COMPUTER SPEECH & LANGUAGE * |
YU, SHUBEN: "Research on a speaker recognition system based on MFCC", SCIENCE AND TECHNOLOGY INNOVATION * |
ZHOU, YANPING et al.: "Electronic Surveillance and Control Technology", 30 June 1998, Shanghai Scientific and Technological Literature Press * |
WANG, HONG: "SIS95: an automatic speaker identification system", INSTITUTIONAL REPOSITORY OF THE CHINESE ACADEMY OF SCIENCES * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193972A (en) * | 2017-05-25 | 2017-09-22 | 山东浪潮云服务信息科技有限公司 | User classification method and device based on big data |
TWI638352B (en) * | 2017-06-02 | 2018-10-11 | 元鼎音訊股份有限公司 | Electronic device capable of adjusting output sound and method of adjusting output sound |
CN107170456A (en) * | 2017-06-28 | 2017-09-15 | 北京云知声信息技术有限公司 | Method of speech processing and device |
CN108281138B (en) * | 2017-12-18 | 2020-03-31 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent voice interaction method, equipment and storage medium |
CN108281138A (en) * | 2017-12-18 | 2018-07-13 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent voice interaction method, device and storage medium |
CN108364526A (en) * | 2018-02-28 | 2018-08-03 | 上海乐愚智能科技有限公司 | Music teaching method and device, robot and storage medium |
CN109171644A (en) * | 2018-06-22 | 2019-01-11 | 平安科技(深圳)有限公司 | Health control method, device, computer equipment and storage medium based on voice recognition |
CN109859764A (en) * | 2019-01-04 | 2019-06-07 | 四川虹美智能科技有限公司 | Voice control method and intelligent household appliance |
CN110265040A (en) * | 2019-06-20 | 2019-09-20 | Oppo广东移动通信有限公司 | Voiceprint model training method and device, storage medium and electronic equipment |
CN110265040B (en) * | 2019-06-20 | 2022-05-17 | Oppo广东移动通信有限公司 | Voiceprint model training method and device, storage medium and electronic equipment |
CN110798318A (en) * | 2019-09-18 | 2020-02-14 | 云知声智能科技股份有限公司 | Equipment management method and device |
CN110808052A (en) * | 2019-11-12 | 2020-02-18 | 深圳市瑞讯云技术有限公司 | Voice recognition method and device and electronic equipment |
CN110853642A (en) * | 2019-11-14 | 2020-02-28 | 广东美的制冷设备有限公司 | Voice control method and device, household appliance and storage medium |
CN110853642B (en) * | 2019-11-14 | 2022-03-25 | 广东美的制冷设备有限公司 | Voice control method and device, household appliance and storage medium |
CN112908312A (en) * | 2021-01-30 | 2021-06-04 | 云知声智能科技股份有限公司 | Method and equipment for improving awakening performance |
CN112908312B (en) * | 2021-01-30 | 2022-06-24 | 云知声智能科技股份有限公司 | Method and equipment for improving awakening performance |
CN113539274A (en) * | 2021-06-15 | 2021-10-22 | 复旦大学附属肿瘤医院 | Voice processing method and device |
CN113707154A (en) * | 2021-09-03 | 2021-11-26 | 上海瑾盛通信科技有限公司 | Model training method and device, electronic equipment and readable storage medium |
CN113707154B (en) * | 2021-09-03 | 2023-11-10 | 上海瑾盛通信科技有限公司 | Model training method, device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105895105B (en) | 2020-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105895105A (en) | Speech processing method and device | |
CN110288077B (en) | Method and related device for synthesizing speaking expression based on artificial intelligence | |
CN106782536B (en) | Voice awakening method and device | |
CN108899037B (en) | Animal voiceprint feature extraction method and device and electronic equipment | |
CN107170456A (en) | Method of speech processing and device | |
CN106575500B (en) | Method and apparatus for synthesizing speech based on facial structure | |
EP3824462B1 (en) | Electronic apparatus for processing user utterance and controlling method thereof | |
CN110838289A (en) | Awakening word detection method, device, equipment and medium based on artificial intelligence | |
CN110838286A (en) | Model training method, language identification method, device and equipment | |
CN110853617B (en) | Model training method, language identification method, device and equipment | |
CN110534099A (en) | Voice wake-up processing method and device, storage medium and electronic equipment | |
CN104700843A (en) | Method and device for identifying ages | |
CN110148399A (en) | Control method, apparatus, device and medium for a smart device | |
CN111508511A (en) | Real-time sound changing method and device | |
CN108198569A (en) | Audio processing method, apparatus, device and readable storage medium | |
CN102404278A (en) | Song request system based on voiceprint recognition and application method thereof | |
CN104795065A (en) | Method for increasing speech recognition rate and electronic device | |
CN111414506B (en) | Emotion processing method and device based on artificial intelligence, electronic equipment and storage medium | |
Qian et al. | Computer audition for fighting the SARS-CoV-2 corona crisis—Introducing the multitask speech corpus for COVID-19 | |
KR102499299B1 (en) | Voice recognition device and its learning control method | |
CN107274903A (en) | Text handling method and device, the device for text-processing | |
CN112735371A (en) | Method and device for generating speaker video based on text information | |
CN110155075A (en) | Atmosphere apparatus control method and relevant apparatus | |
CN111243604B (en) | Training method for speaker recognition neural network model supporting multiple awakening words, speaker recognition method and system | |
CN111046674B (en) | Semantic understanding method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
Address after: Room 101, 1st Floor, Building 1, Xisanqi Building Materials City, Haidian District, Beijing 100096
Patentee after: Yunzhisheng Intelligent Technology Co., Ltd.
Address before: Room A503, 5th Floor, Peony Technology Building, No. 2 Huayuan Road, Haidian District, Beijing 100191
Patentee before: BEIJING UNISOUND INFORMATION TECHNOLOGY Co., Ltd.