CN105895105A - Speech processing method and device - Google Patents
Speech processing method and device
- Publication number
- CN105895105A CN105895105A CN201610394300.1A CN201610394300A CN105895105A CN 105895105 A CN105895105 A CN 105895105A CN 201610394300 A CN201610394300 A CN 201610394300A CN 105895105 A CN105895105 A CN 105895105A
- Authority
- CN
- China
- Prior art keywords
- age
- model
- range
- speech
- speech processes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/14—Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a speech processing method and a speech processing device. The method comprises the following steps: receiving speech information input by a user; performing voiceprint recognition on the speech information and determining the user's age according to the recognition result; determining the target age range to which the user's age belongs; determining the target speech processing model corresponding to the target age range; and processing the speech information using the target speech processing model. With this technical scheme, the user's age is determined from the input speech information, and the corresponding target speech processing model is then selected according to that age, so that different speech processing models can be set for different age groups and the speech information of each age group receives targeted processing. The processing effect is therefore better, the accuracy of speech processing is enhanced, and the user experience is improved.
Description
Technical field
The present invention relates to the field of speech processing technology, and in particular to a speech processing method and device.
Background art
Speech recognition is a cross-disciplinary field. Over the past two decades, speech recognition technology has made marked progress and has begun to move from the laboratory to the market. It is expected that within the next ten years speech recognition technology will enter industry, household appliances, communications, automotive electronics, medical care, home services, consumer electronics and many other fields. The application of speech-recognition dictation machines in certain fields was named by the US press as one of the ten major events in computer development in 1997, and many experts regarded speech recognition as one of the ten most important technological developments in information technology between 2000 and 2010. The fields involved in speech recognition technology include signal processing, pattern recognition, probability and information theory, the mechanisms of sound production and hearing, and artificial intelligence.
Summary of the invention
The embodiments of the present invention provide a speech processing method and device, so as to improve the success rate and accuracy of semantic analysis while guaranteeing the accuracy of speech processing, thereby enhancing the user experience.
According to a first aspect of the embodiments of the present invention, a speech processing method is provided, comprising:
receiving speech information input by a user;
performing voiceprint recognition on the speech information, and determining the age of the user according to the recognition result;
determining the target age range to which the age of the user belongs;
determining the target speech processing model corresponding to the target age range;
processing the speech information using the target speech processing model.
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, determining the target speech processing model corresponding to the target age range comprises:
determining the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
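The preset correspondence between age ranges and speech processing models can be sketched as a simple lookup table. The range bounds below follow the example brackets given later in the description (adults 11 and above, children 3-10, infants 1-3); the model names, the upper cap on the adult bracket and the half-open boundary handling are illustrative assumptions, not specified by the patent.

```python
# Illustrative sketch of the preset age-range -> speech-processing-model
# correspondence. Model names and boundary handling are assumptions.
AGE_RANGE_MODELS = [
    (11, 200, "first_model"),   # first age range: adults, 11 and above
    (3, 11, "second_model"),    # second age range: children, 3-10
    (1, 3, "third_model"),      # third age range: infants, 1-3
]

def select_model(age: int) -> str:
    """Return the target speech processing model for the given user age."""
    for low, high, model in AGE_RANGE_MODELS:
        if low <= age < high:
            return model
    raise ValueError(f"no preset age range covers age {age}")
```

A table-driven lookup like this keeps the "preset correspondence" in one place, so adding or re-tuning an age bracket does not touch the selection logic.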
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
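The three-tier composition described above can be summarized in a small table. The component names and matching-degree labels are assumptions for illustration; the patent states only that older age groups use higher-matching-degree components and that the infant model is acoustic-only.

```python
# Sketch of the per-age-group model composition described above.
# None means the component is absent (the infant model recognizes sound only,
# not words or meaning).
MODEL_COMPOSITION = {
    "adult":  {"acoustic": "high", "language": "high", "semantic": "high"},
    "child":  {"acoustic": "high", "language": "high", "semantic": "medium"},
    "infant": {"acoustic": "low",  "language": None,   "semantic": None},
}

def active_components(group: str) -> list:
    """List the sub-models that a given age group's processing model uses."""
    return [name for name, degree in MODEL_COMPOSITION[group].items() if degree]
```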
According to a second aspect of the embodiments of the present invention, a speech processing device is provided, comprising:
a receiving module, configured to receive speech information input by a user;
a first determining module, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module, configured to determine the target age range to which the age of the user belongs;
a second determining module, configured to determine the target speech processing model corresponding to the target age range;
a processing module, configured to process the speech information using the target speech processing model.
In one embodiment, the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Other features and advantages of the present invention will be set forth in the following description, will in part be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention can be realized and attained by the structure particularly pointed out in the written description, the claims and the accompanying drawings.
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart of a speech processing method according to an exemplary embodiment.
Fig. 2 is a flow chart of step S104 of a speech processing method according to an exemplary embodiment.
Fig. 3 is a block diagram of a speech processing device according to an exemplary embodiment.
Detailed description of the embodiments
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. In the following description, when the drawings are referred to, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with some aspects of the invention as detailed in the appended claims.
Fig. 1 is a flow chart of a speech processing method according to an exemplary embodiment. The speech processing method is applied in a terminal device, which may be any device with speech processing capability, such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, medical equipment, fitness equipment or a personal digital assistant. As shown in Fig. 1, the method comprises steps S101-S105:
In step S101, speech information input by a user is received;
In step S102, voiceprint recognition is performed on the speech information, and the age of the user is determined according to the recognition result;
A so-called voiceprint is the spectrum of sound waves carrying verbal information, as displayed by an electro-acoustic instrument. The production of human speech is a complex physiological and physical process between the body's language centers and the vocal organs. The vocal organs used in speech — the tongue, teeth, larynx, lungs and nasal cavity — differ widely from person to person in size and form, so the voiceprint spectra of any two people always differ. Each person's acoustic characteristics have both relative stability and variability; they are not absolute and unchangeable. This variation may come from physiology, pathology, psychology, imitation or disguise, and is also related to environmental interference. Nevertheless, since no two people have exactly the same vocal organs, under ordinary circumstances it is still possible to distinguish the voices of different people, or to judge whether two utterances come from the same person.
By performing voiceprint recognition on the speech information, specific characteristics of the user, such as age and gender, can therefore be identified.
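The patent does not specify how voiceprint recognition infers age, so the sketch below only illustrates the common shape of such a pipeline: derive a fixed-length feature vector from the waveform, then apply a pre-trained classifier. The `extract_features` statistics and the classifier interface are assumptions, not the patent's method.

```python
# Hedged sketch of step S102: voiceprint features -> estimated age.
# The feature choice and classifier are placeholders for illustration only.
import numpy as np

def extract_features(samples: np.ndarray) -> np.ndarray:
    """Placeholder feature extractor: crude spectral statistics of the signal."""
    spectrum = np.abs(np.fft.rfft(samples))  # magnitude spectrum of the waveform
    return np.array([spectrum.mean(), spectrum.std(), samples.std()])

def estimate_age(samples: np.ndarray, classifier) -> int:
    """Map the feature vector to an age using any object with a .predict()."""
    features = extract_features(samples).reshape(1, -1)
    return int(classifier.predict(features)[0])
```

In practice the classifier would be trained offline on labeled speech; here it is injected so the selection logic stays independent of the recognition backend.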
In step S103, the target age range to which the age of the user belongs is determined;
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be an adult bracket of 11 years and above, the second age range may be a child bracket of 3-10 years, and the third age range may be an infant bracket of 1-3 years. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better.
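Steps S101-S105 can be chained into a single sketch. The bracket boundaries follow the example brackets above (adults 11 and above, children 3-10, infants 1-3); the age recognizer and the per-bracket models are hypothetical placeholders supplied by the caller.

```python
# End-to-end sketch of steps S101-S105 with the example age brackets above.
# `recognize_age` and `models_by_range` are hypothetical placeholders.
def process_speech(voice_info, recognize_age, models_by_range):
    age = recognize_age(voice_info)       # S102: voiceprint recognition -> age
    if age >= 11:                         # S103: judge the target age range
        bracket = "adult"
    elif age >= 3:
        bracket = "child"
    else:
        bracket = "infant"
    model = models_by_range[bracket]      # S104: target speech processing model
    return model(voice_info)              # S105: process the speech with it
```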
In step S104, the target speech processing model corresponding to the target age range is determined;
In step S105, the speech information is processed using the target speech processing model.
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
Fig. 2 is a flow chart of step S104 of a speech processing method according to an exemplary embodiment. As shown in Fig. 2, in one embodiment, the above step S104 includes step S201:
In step S201, the target speech processing model corresponding to the target age range is determined according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
The following are device embodiments of the present invention, which may be used to carry out the method embodiments of the invention.
Fig. 3 is a block diagram of a speech processing device according to an exemplary embodiment. The device may be implemented as part or all of a terminal device by software, hardware, or a combination of both. As shown in Fig. 3, the speech processing device includes:
a receiving module 31, configured to receive speech information input by a user;
a first determining module 32, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module 33, configured to determine the target age range to which the age of the user belongs;
a second determining module 34, configured to determine the target speech processing model corresponding to the target age range;
a processing module 35, configured to process the speech information using the target speech processing model.
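The module structure of Fig. 3 maps naturally onto a small class. The rendering below is one hypothetical software implementation: the age recognizer, the range judgment and the model table are injected placeholders, and the numbered comments mirror modules 31-35.

```python
# Minimal sketch of the apparatus of Fig. 3. All internals are placeholder
# callables supplied by the caller; comments mirror modules 31-35.
class SpeechProcessingDevice:
    def __init__(self, recognize_age, judge_range, models):
        self.recognize_age = recognize_age   # backs first determining module 32
        self.judge_range = judge_range       # backs judging module 33
        self.models = models                 # backs second determining module 34

    def handle(self, voice_info):
        """Receiving module 31 delivers voice_info; returns module 35's output."""
        age = self.recognize_age(voice_info)   # module 32: voiceprint -> age
        target_range = self.judge_range(age)   # module 33: target age range
        model = self.models[target_range]      # module 34: target model
        return model(voice_info)               # module 35: process the speech
```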
In this embodiment, the age of the user is determined from the speech information input by the user, and the corresponding target speech processing model is then determined according to that age, so that the speech information is processed with the target speech processing model. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
In one embodiment, the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be an adult bracket of 11 years and above, the second age range may be a child bracket of 3-10 years, and the third age range may be an infant bracket of 1-3 years. In this way, different speech processing models are set for different age groups and the speech information of each age group receives targeted processing, so the processing effect is better.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the matching degree of the corresponding speech processing model.
In this embodiment, the speech information of different age groups can be processed with different speech processing models, where a speech processing model includes a speech model and a semantic model, and the speech model can in turn include an acoustic model and a language model. Specifically, the greater the age, the higher the matching degree of the speech processing model employed, thereby guaranteeing the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so both the speech model and the semantic model can use models with a high matching degree.
The speech processing model for children requires a high degree of fuzzy matching: for example, the acoustic model and the language model use models with a higher matching degree, while the semantic model uses one with a medium matching degree.
An infant may correspond to an acoustic model only, which recognizes sounds but not words. Since an infant cannot yet speak and can only vocalize, only an acoustic model may be used, without recognizing language or semantics, and an acoustic model with a low matching degree is adopted.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flow charts and/or block diagrams of methods, devices (systems) and computer program products according to the embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific way, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that realize the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.
Claims (10)
1. A speech processing method, characterized by comprising:
receiving speech information input by a user;
performing voiceprint recognition on the speech information, and determining the age of the user according to the recognition result;
determining the target age range to which the age of the user belongs;
determining the target speech processing model corresponding to the target age range;
processing the speech information using the target speech processing model.
2. The method according to claim 1, characterized in that determining the target speech processing model corresponding to the target age range comprises:
determining the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
3. The method according to claim 1, characterized in that the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
4. The method according to claim 3, characterized in that the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
5. The method according to any one of claims 2 to 4, characterized in that the age range is positively correlated with the matching degree of the corresponding speech processing model.
6. A speech processing device, characterized by comprising:
a receiving module, configured to receive speech information input by a user;
a first determining module, configured to perform voiceprint recognition on the speech information and determine the age of the user according to the recognition result;
a judging module, configured to determine the target age range to which the age of the user belongs;
a second determining module, configured to determine the target speech processing model corresponding to the target age range;
a processing module, configured to process the speech information using the target speech processing model.
7. The device according to claim 6, characterized in that the second determining module is configured to:
determine the target speech processing model corresponding to the target age range according to a preset correspondence between preset age ranges and preset speech processing models.
8. The device according to claim 6, characterized in that the age ranges include a first age range, a second age range and a third age range, wherein the ages in the first age range are greater than the ages in the second age range, and the ages in the second age range are greater than the ages in the third age range; the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
9. The apparatus according to claim 8, characterized in that the first speech processing model comprises a first speech model and a first semantic model, the second speech processing model comprises a second speech model and a second semantic model, and the third speech processing model comprises a third speech model.
10. The apparatus according to any one of claims 7 to 9, characterized in that the age range is positively correlated with the matching degree of the corresponding speech processing model.
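The selection logic of claims 6 to 9 can be sketched in code: estimate the user's age, find the preset age range it falls in, and pick the speech processing model associated with that range. This is an illustrative sketch only, not code from the patent; all names and the age boundaries below are invented for illustration (the patent does not fix concrete boundary values).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SpeechProcessingModel:
    name: str
    has_semantic_model: bool  # per claim 9, the third model carries no semantic model

# Preset correspondence between age ranges and speech processing models (claim 7).
# Boundary values are hypothetical; ranges are ordered oldest-first as in claim 8.
AGE_RANGE_MODELS: List[Tuple[Tuple[int, int], SpeechProcessingModel]] = [
    ((19, 120), SpeechProcessingModel("first", True)),   # first (oldest) age range
    ((13, 18),  SpeechProcessingModel("second", True)),  # second age range
    ((0, 12),   SpeechProcessingModel("third", False)),  # third (youngest) age range
]

def select_target_model(age: int) -> Optional[SpeechProcessingModel]:
    """Return the model whose preset age range contains the estimated age."""
    for (low, high), model in AGE_RANGE_MODELS:
        if low <= age <= high:
            return model
    return None  # no preset range matched
```

In this sketch the age passed to `select_target_model` would come from the voiceprint-recognition step of claim 6; the returned model is then used to process the voice information.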
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610394300.1A CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610394300.1A CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105895105A true CN105895105A (en) | 2016-08-24 |
CN105895105B CN105895105B (en) | 2020-05-05 |
Family
ID=56710682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610394300.1A Active CN105895105B (en) | 2016-06-06 | 2016-06-06 | Voice processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105895105B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101390155A (en) * | 2006-02-21 | 2009-03-18 | 索尼电脑娱乐公司 | Voice recognition with speaker adaptation and registration with pitch |
KR20100003672A (en) * | 2008-07-01 | 2010-01-11 | (주)디유넷 | Speech recognition apparatus and method using visual information |
CN101944359A (en) * | 2010-07-23 | 2011-01-12 | 杭州网豆数字技术有限公司 | Speech recognition method for specific groups of people |
CN103024530A (en) * | 2012-12-18 | 2013-04-03 | 天津三星电子有限公司 | Intelligent television voice response system and method |
CN103236259A (en) * | 2013-03-22 | 2013-08-07 | 乐金电子研发中心(上海)有限公司 | Voice recognition processing and feedback system, voice response method |
CN105306815A (en) * | 2015-09-30 | 2016-02-03 | 努比亚技术有限公司 | Shooting mode switching device, method and mobile terminal |
CN105489221A (en) * | 2015-12-02 | 2016-04-13 | 北京云知声信息技术有限公司 | Voice recognition method and device |
Non-Patent Citations (4)
Title |
---|
MING LI et al.: "Automatic speaker age and gender recognition using acoustic and prosodic level information fusion", COMPUTER SPEECH & LANGUAGE * |
YU, SHUBEN: "Research on a speaker recognition system based on MFCC", SCIENCE AND TECHNOLOGY INNOVATION * |
ZHOU, YANPING et al.: "Electronic Surveillance and Control Technology", 30 June 1998, Shanghai Scientific and Technological Literature Press * |
WANG, HONG: "SIS95: an automatic speaker identification system", INSTITUTIONAL REPOSITORY OF THE CHINESE ACADEMY OF SCIENCES * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193972A (en) * | 2017-05-25 | 2017-09-22 | 山东浪潮云服务信息科技有限公司 | User classification method and device based on big data |
TWI638352B (en) * | 2017-06-02 | 2018-10-11 | 元鼎音訊股份有限公司 | Electronic device capable of adjusting output sound and method of adjusting output sound |
CN107170456A (en) * | 2017-06-28 | 2017-09-15 | 北京云知声信息技术有限公司 | Method of speech processing and device |
CN108281138B (en) * | 2017-12-18 | 2020-03-31 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent voice interaction method, equipment and storage medium |
CN108281138A (en) * | 2017-12-18 | 2018-07-13 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent voice interaction method, device and storage medium |
CN108364526A (en) * | 2018-02-28 | 2018-08-03 | 上海乐愚智能科技有限公司 | Music teaching method and device, robot and storage medium |
CN109171644A (en) * | 2018-06-22 | 2019-01-11 | 平安科技(深圳)有限公司 | Health control method, device, computer equipment and storage medium based on voice recognition |
CN109859764A (en) * | 2019-01-04 | 2019-06-07 | 四川虹美智能科技有限公司 | Voice control method and intelligent household appliance |
CN110265040A (en) * | 2019-06-20 | 2019-09-20 | Oppo广东移动通信有限公司 | Voiceprint model training method and device, storage medium and electronic equipment |
CN110265040B (en) * | 2019-06-20 | 2022-05-17 | Oppo广东移动通信有限公司 | Voiceprint model training method and device, storage medium and electronic equipment |
CN110798318A (en) * | 2019-09-18 | 2020-02-14 | 云知声智能科技股份有限公司 | Equipment management method and device |
CN110808052A (en) * | 2019-11-12 | 2020-02-18 | 深圳市瑞讯云技术有限公司 | Voice recognition method and device and electronic equipment |
CN110853642A (en) * | 2019-11-14 | 2020-02-28 | 广东美的制冷设备有限公司 | Voice control method and device, household appliance and storage medium |
CN110853642B (en) * | 2019-11-14 | 2022-03-25 | 广东美的制冷设备有限公司 | Voice control method and device, household appliance and storage medium |
CN112908312A (en) * | 2021-01-30 | 2021-06-04 | 云知声智能科技股份有限公司 | Method and equipment for improving awakening performance |
CN112908312B (en) * | 2021-01-30 | 2022-06-24 | 云知声智能科技股份有限公司 | Method and equipment for improving awakening performance |
CN113539274A (en) * | 2021-06-15 | 2021-10-22 | 复旦大学附属肿瘤医院 | Voice processing method and device |
CN113707154A (en) * | 2021-09-03 | 2021-11-26 | 上海瑾盛通信科技有限公司 | Model training method and device, electronic equipment and readable storage medium |
CN113707154B (en) * | 2021-09-03 | 2023-11-10 | 上海瑾盛通信科技有限公司 | Model training method, device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105895105B (en) | 2020-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105895105A (en) | Speech processing method and device | |
CN110288077B (en) | Method and related device for synthesizing speaking expression based on artificial intelligence | |
CN106782536B (en) | Voice awakening method and device | |
CN108899037B (en) | Animal voiceprint feature extraction method and device and electronic equipment | |
CN107170456A (en) | Method of speech processing and device | |
CN106575500B (en) | Method and apparatus for synthesizing speech based on facial structure | |
EP3824462B1 (en) | Electronic apparatus for processing user utterance and controlling method thereof | |
CN110838289A (en) | Awakening word detection method, device, equipment and medium based on artificial intelligence | |
CN110838286A (en) | Model training method, language identification method, device and equipment | |
CN110853617B (en) | Model training method, language identification method, device and equipment | |
CN110534099A (en) | Voice wake-up processing method and device, storage medium and electronic equipment | |
CN104700843A (en) | Method and device for identifying ages | |
CN110148399A (en) | Control method, apparatus, device and medium for a smart device | |
CN111508511A (en) | Real-time sound changing method and device | |
CN108198569A (en) | Audio processing method, apparatus, device and readable storage medium | |
CN102404278A (en) | Song request system based on voiceprint recognition and application method thereof | |
CN104795065A (en) | Method for increasing speech recognition rate and electronic device | |
CN111414506B (en) | Emotion processing method and device based on artificial intelligence, electronic equipment and storage medium | |
Qian et al. | Computer audition for fighting the SARS-CoV-2 corona crisis—Introducing the multitask speech corpus for COVID-19 | |
KR102499299B1 (en) | Voice recognition device and its learning control method | |
CN107274903A (en) | Text handling method and device, the device for text-processing | |
CN112735371A (en) | Method and device for generating speaker video based on text information | |
CN110155075A (en) | Atmosphere apparatus control method and relevant apparatus | |
CN111243604B (en) | Training method for speaker recognition neural network model supporting multiple awakening words, speaker recognition method and system | |
CN111046674B (en) | Semantic understanding method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
Address after: Room 101, 1st Floor, Building 1, Xisanqi Building Materials City, Haidian District, Beijing 100096
Patentee after: Yunzhisheng Intelligent Technology Co., Ltd.
Address before: Room A503, 5th Floor, Peony Technology Building, No. 2 Huayuan Road, Haidian District, Beijing 100191
Patentee before: BEIJING UNISOUND INFORMATION TECHNOLOGY Co., Ltd.