CN105895105B - Voice processing method and device - Google Patents

Voice processing method and device

Info

Publication number
CN105895105B
Authority
CN
China
Prior art keywords
age
model
age range
voice
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610394300.1A
Other languages
Chinese (zh)
Other versions
CN105895105A (en)
Inventor
黄宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Original Assignee
Beijing Yunzhisheng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yunzhisheng Information Technology Co Ltd filed Critical Beijing Yunzhisheng Information Technology Co Ltd
Priority to CN201610394300.1A priority Critical patent/CN105895105B/en
Publication of CN105895105A publication Critical patent/CN105895105A/en
Application granted granted Critical
Publication of CN105895105B publication Critical patent/CN105895105B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification
    • G10L17/06: Decision making techniques; Pattern matching strategies
    • G10L17/14: Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G10L17/04: Training, enrolment or model building
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to a voice processing method and a voice processing device. The method comprises the following steps: receiving voice information input by a user; performing voiceprint recognition on the voice information and determining the age of the user according to the recognition result; determining a target age range to which the age of the user belongs; determining a target speech processing model corresponding to the target age range; and processing the voice information with the target speech processing model. In this technical scheme, the user's age is determined from the input voice information, and the corresponding target speech processing model is then selected to process that information. By setting up different speech processing models for different age groups and processing the voice information of each age group accordingly, the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.

Description

Voice processing method and device
Technical Field
The present invention relates to the field of speech processing technologies, and in particular, to a speech processing method and apparatus.
Background
Speech recognition is a cross-disciplinary field. Over the last two decades, speech recognition technology has advanced significantly and has begun to move from the laboratory to the market. It is expected that within the next 10 years voice recognition technology will enter fields such as industry, home appliances, communications, automotive electronics, medical care, home services, and consumer electronics. The application of speech recognition dictation machines in certain fields was rated by the U.S. press as one of the ten major computer developments of 1997. Many experts consider speech recognition one of the ten most important technological developments in the information technology field between 2000 and 2010. The fields involved in speech recognition technology include signal processing, pattern recognition, probability and information theory, sound production and hearing mechanisms, artificial intelligence, and the like.
Disclosure of Invention
Embodiments of the invention provide a speech processing method and a speech processing apparatus, which improve the success rate and accuracy of semantic analysis while ensuring the accuracy of speech processing, thereby improving the user experience.
According to a first aspect of the embodiments of the present invention, there is provided a speech processing method, including:
receiving voice information input by a user;
performing voiceprint recognition on the voice information, and determining the age of the user according to a recognition result;
judging a target age range to which the age of the user belongs;
determining a target speech processing model corresponding to the target age range;
and processing the voice information by using the target voice processing model.
In this embodiment, the age of the user is determined from the voice information input by the user, the corresponding target speech processing model is then determined according to that age, and the target speech processing model is used to process the voice information. In this way, different speech processing models are set for different age groups and the voice information of each age group is processed in a targeted manner, so that the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, determining a target speech processing model corresponding to the target age range includes:
determining the target speech processing model according to a preset correspondence between age ranges and speech processing models.
In one embodiment, the age ranges include a first age range, a second age range, and a third age range, wherein the age in the first age range is greater than the age in the second age range, and the age in the second age range is greater than the age in the third age range, and the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the degree of match of the corresponding speech processing model.
In this embodiment, different speech processing models may be used to process the voice information of users of different ages. A speech processing model includes a speech model and a semantic model, and the speech model may in turn include an acoustic model and a language model. Specifically, the older the age range, the higher the matching degree of its speech processing model may be, thereby ensuring the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so the speech model and the semantic model may both adopt a high matching degree.
The speech processing model for children requires a degree of fuzzy matching: for example, the acoustic model and the language model adopt a higher matching degree, while the semantic model adopts a medium matching degree.
An infant may correspond to an acoustic model only, recognizing sound but not text. Since a baby cannot yet speak and only produces sounds, an acoustic model with a low matching degree is adopted, and neither language nor semantics is recognized.
According to a second aspect of the embodiments of the present invention, there is provided a speech processing apparatus including:
the receiving module is used for receiving voice information input by a user;
the first determining module is used for carrying out voiceprint recognition on the voice information and determining the age of the user according to a recognition result;
the judging module is used for judging a target age range to which the age of the user belongs;
the second determining module is used for determining a target voice processing model corresponding to the target age range;
and the processing module is used for processing the voice information by using the target voice processing model.
In one embodiment, the second determination module is to:
determining a target speech processing model corresponding to the target age range according to a preset correspondence between age ranges and speech processing models.
In one embodiment, the age ranges include a first age range, a second age range, and a third age range, wherein the age in the first age range is greater than the age in the second age range, and the age in the second age range is greater than the age in the third age range, and the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the degree of match of the corresponding speech processing model.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow diagram illustrating a method of speech processing according to an example embodiment.
Fig. 2 is a flowchart illustrating step S104 in a voice processing method according to an exemplary embodiment.
FIG. 3 is a block diagram illustrating a speech processing apparatus according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
FIG. 1 is a flowchart illustrating a speech processing method according to an exemplary embodiment. The method is applied to a terminal device, which may be any device with a speech processing function, such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet, a medical device, fitness equipment, or a personal digital assistant. As shown in fig. 1, the method comprises steps S101-S105:
in step S101, receiving voice information input by a user;
in step S102, performing voiceprint recognition on the voice information, and determining the age of the user according to a recognition result;
voiceprint (Voiceprint) is a spectrum of sound waves carrying verbal information displayed by an electro-acoustic apparatus. The generation of human language is a complex physiological and physical process between the human language center and the pronunciation organs, and the vocal print maps of any two people are different because the vocal organs used by a person in speaking, namely the tongue, the teeth, the larynx, the lung and the nasal cavity, are different greatly in size and shape. The speech acoustic characteristics of each person are both relatively stable and variable, not absolute, but invariant. The variation can come from physiology, pathology, psychology, simulation, camouflage and is also related to environmental interference. However, since the vocal organs of each person are different, it is possible to distinguish the voices of different persons or determine whether the voices are the same person in general.
By performing voiceprint recognition on the voice information, specific characteristics of the user, such as the age, the gender and the like of the user, can be recognized.
In step S103, a target age range to which the age of the user belongs is determined;
in one embodiment, the age ranges include a first age range, a second age range, and a third age range, wherein the age in the first age range is greater than the age in the second age range, and the age in the second age range is greater than the age in the third age range, and the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be, for example, adults older than 11 years, the second age range children aged 3-10 years, and the third age range infants aged 1-3 years. In this way, different speech processing models are set for different age groups and the voice information of each age group is processed in a targeted manner, so that the processing effect is better.
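A minimal sketch of this age-bracket decision, assuming the example boundaries above; the function name and the choice to count ages falling between the stated brackets as children are illustrative assumptions, since the disclosure leaves those boundaries open:

```python
def classify_age_range(age: int) -> str:
    """Map a recognized age to its target age range.

    Boundaries follow the example in the text (adults above 11,
    children 3-10, infants 1-3); ages between the stated brackets
    are assigned to the child range here as an assumption.
    """
    if age > 11:
        return "adult"   # first age range
    elif age >= 3:
        return "child"   # second age range
    else:
        return "infant"  # third age range

print(classify_age_range(30))  # adult
print(classify_age_range(7))   # child
print(classify_age_range(2))   # infant
```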
In step S104, a target speech processing model corresponding to the target age range is determined;
in step S105, the speech information is processed using the target speech processing model.
In this embodiment, the age of the user is determined from the voice information input by the user, the corresponding target speech processing model is then determined according to that age, and the target speech processing model is used to process the voice information. In this way, different speech processing models are set for different age groups and the voice information of each age group is processed in a targeted manner, so that the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
Fig. 2 is a flowchart illustrating step S104 in a voice processing method according to an exemplary embodiment.
As shown in fig. 2, in one embodiment, the step S104 includes the step S201:
In step S201, the target speech processing model corresponding to the target age range is determined according to a preset correspondence between age ranges and speech processing models.
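Step S201 amounts to a table lookup against the preset correspondence. A hedged sketch, in which the dictionary keys and model names are placeholders invented for illustration, not identifiers from the patent:

```python
# Preset correspondence between age ranges and speech processing models
# (the string values stand in for actual model objects).
PRESET_MODELS = {
    "adult":  "first_speech_processing_model",
    "child":  "second_speech_processing_model",
    "infant": "third_speech_processing_model",
}

def select_target_model(target_age_range: str) -> str:
    """Determine the target model from the preset correspondence (S201)."""
    try:
        return PRESET_MODELS[target_age_range]
    except KeyError:
        raise ValueError(f"no preset model for age range {target_age_range!r}")

print(select_target_model("child"))  # second_speech_processing_model
```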
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the degree of match of the corresponding speech processing model.
In this embodiment, different speech processing models may be used to process the voice information of users of different ages. A speech processing model includes a speech model and a semantic model, and the speech model may in turn include an acoustic model and a language model. Specifically, the older the age range, the higher the matching degree of its speech processing model may be, thereby ensuring the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so the speech model and the semantic model may both adopt a high matching degree.
The speech processing model for children requires a degree of fuzzy matching: for example, the acoustic model and the language model adopt a higher matching degree, while the semantic model adopts a medium matching degree.
An infant may correspond to an acoustic model only, recognizing sound but not text. Since a baby cannot yet speak and only produces sounds, an acoustic model with a low matching degree is adopted, and neither language nor semantics is recognized.
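The per-age-group composition and matching degrees described above can be written down as a configuration table. This is a sketch under the assumption that matching degree can be represented as a simple label; the class and field names are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeechProcessingModel:
    """Per-age-group model composition. The 'high'/'medium'/'low'
    labels are illustrative matching degrees; None means that kind
    of recognition is not performed for the age group."""
    acoustic_match: str
    language_match: Optional[str]
    semantic_match: Optional[str]

PRESET = {
    # Adult: exact matching; speech and semantic models both high.
    "adult":  SpeechProcessingModel("high", "high", "high"),
    # Child: fuzzy matching; acoustic/language higher, semantic medium.
    "child":  SpeechProcessingModel("high", "high", "medium"),
    # Infant: acoustic model only, low matching degree; sound is
    # recognized but neither language nor semantics.
    "infant": SpeechProcessingModel("low", None, None),
}
```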
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention.
Fig. 3 is a block diagram illustrating a speech processing apparatus according to an exemplary embodiment, which may be implemented as part or all of a terminal device through software, hardware, or a combination of both. As shown in fig. 3, the speech processing apparatus includes:
a receiving module 31, configured to receive voice information input by a user;
a first determining module 32, configured to perform voiceprint recognition on the voice information, and determine an age of the user according to a recognition result;
a judging module 33, configured to judge a target age range to which the age of the user belongs;
a second determining module 34, configured to determine a target speech processing model corresponding to the target age range;
and the processing module 35 is configured to process the voice information by using the target voice processing model.
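Wiring the five modules together, one plausible object-level sketch follows; the recognizer and model interfaces are assumptions made for illustration, not identifiers from the patent:

```python
class SpeechProcessingApparatus:
    """Sketch of the apparatus of Fig. 3. `recognizer` is assumed to
    expose estimate_age(voice), and each model in `models` is assumed
    to expose process(voice); both interfaces are illustrative."""

    def __init__(self, recognizer, models):
        self.recognizer = recognizer  # backs the first determining module (32)
        self.models = models          # preset age-range -> model map (module 34)

    def process(self, voice_information):
        # Receiving module (31): voice_information is the user's input.
        age = self.recognizer.estimate_age(voice_information)  # module 32
        if age > 11:                                           # judging module 33
            target_range = "adult"
        elif age >= 3:
            target_range = "child"
        else:
            target_range = "infant"
        model = self.models[target_range]                      # module 34
        return model.process(voice_information)                # module 35
```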
In this embodiment, the age of the user is determined from the voice information input by the user, the corresponding target speech processing model is then determined according to that age, and the target speech processing model is used to process the voice information. In this way, different speech processing models are set for different age groups and the voice information of each age group is processed in a targeted manner, so that the processing effect is better, the accuracy of speech processing is improved, and the user experience is enhanced.
In one embodiment, the second determination module is to:
determining a target speech processing model corresponding to the target age range according to a preset correspondence between age ranges and speech processing models.
In one embodiment, the age ranges include a first age range, a second age range, and a third age range, wherein the age in the first age range is greater than the age in the second age range, and the age in the second age range is greater than the age in the third age range, and the speech processing model corresponding to the first age range is a first speech processing model, the speech processing model corresponding to the second age range is a second speech processing model, and the speech processing model corresponding to the third age range is a third speech processing model.
The first age range may be, for example, adults older than 11 years, the second age range children aged 3-10 years, and the third age range infants aged 1-3 years. In this way, different speech processing models are set for different age groups and the voice information of each age group is processed in a targeted manner, so that the processing effect is better.
In one embodiment, the first speech processing model includes a first speech model and a first semantic model, the second speech processing model includes a second speech model and a second semantic model, and the third speech processing model includes a third speech model.
In one embodiment, the age range is positively correlated with the degree of match of the corresponding speech processing model.
In this embodiment, different speech processing models may be used to process the voice information of users of different ages. A speech processing model includes a speech model and a semantic model, and the speech model may in turn include an acoustic model and a language model. Specifically, the older the age range, the higher the matching degree of its speech processing model may be, thereby ensuring the accuracy of the processing result.
For example, the speech processing model for adults requires a high degree of exact matching, so the speech model and the semantic model may both adopt a high matching degree.
The speech processing model for children requires a degree of fuzzy matching: for example, the acoustic model and the language model adopt a higher matching degree, while the semantic model adopts a medium matching degree.
An infant may correspond to an acoustic model only, recognizing sound but not text. Since a baby cannot yet speak and only produces sounds, an acoustic model with a low matching degree is adopted, and neither language nor semantics is recognized.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (4)

1. A method of speech processing, comprising:
receiving voice information input by a user;
performing voiceprint recognition on the voice information, and determining the age of the user according to a recognition result;
judging a target age range to which the age of the user belongs;
determining a target speech processing model corresponding to the target age range;
processing the voice information by using the target voice processing model;
the age range comprises a first age range, a second age range and a third age range, wherein the age in the first age range is larger than the age in the second age range, the age in the second age range is larger than the age in the third age range, the voice processing model corresponding to the first age range is a first voice processing model, the voice processing model corresponding to the second age range is a second voice processing model, and the voice processing model corresponding to the third age range is a third voice processing model;
the first voice processing model comprises a first voice model and a first semantic model, the second voice processing model comprises a second voice model and a second semantic model, and the third voice processing model comprises a third voice model;
the age range is positively correlated with the degree of match of the corresponding speech processing model.
2. The method of claim 1, wherein determining the target speech processing model corresponding to the target age range comprises:
and determining a target voice processing model corresponding to the target age range according to the corresponding relation between the preset age range and the preset voice processing model.
3. A speech processing apparatus, comprising:
the receiving module is used for receiving voice information input by a user;
the first determining module is used for carrying out voiceprint recognition on the voice information and determining the age of the user according to a recognition result;
the judging module is used for judging a target age range to which the age of the user belongs;
the second determining module is used for determining a target voice processing model corresponding to the target age range;
the processing module is used for processing the voice information by using the target voice processing model;
the age range comprises a first age range, a second age range and a third age range, wherein the age in the first age range is larger than the age in the second age range, the age in the second age range is larger than the age in the third age range, the voice processing model corresponding to the first age range is a first voice processing model, the voice processing model corresponding to the second age range is a second voice processing model, and the voice processing model corresponding to the third age range is a third voice processing model;
the first voice processing model comprises a first voice model and a first semantic model, the second voice processing model comprises a second voice model and a second semantic model, and the third voice processing model comprises a third voice model;
the age range is positively correlated with the degree of match of the corresponding speech processing model.
4. The apparatus of claim 3, wherein the second determining module is configured to:
and determining a target voice processing model corresponding to the target age range according to the corresponding relation between the preset age range and the preset voice processing model.
CN201610394300.1A 2016-06-06 2016-06-06 Voice processing method and device Active CN105895105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610394300.1A CN105895105B (en) 2016-06-06 2016-06-06 Voice processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610394300.1A CN105895105B (en) 2016-06-06 2016-06-06 Voice processing method and device

Publications (2)

Publication Number Publication Date
CN105895105A (en) 2016-08-24
CN105895105B (en) 2020-05-05

Family

ID=56710682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610394300.1A Active CN105895105B (en) 2016-06-06 2016-06-06 Voice processing method and device

Country Status (1)

Country Link
CN (1) CN105895105B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193972A (en) * 2017-05-25 2017-09-22 山东浪潮云服务信息科技有限公司 A kind of sorted users method and device based on big data
TWI638352B (en) * 2017-06-02 2018-10-11 元鼎音訊股份有限公司 Electronic device capable of adjusting output sound and method of adjusting output sound
CN107170456A (en) * 2017-06-28 2017-09-15 北京云知声信息技术有限公司 Method of speech processing and device
CN108281138B (en) * 2017-12-18 2020-03-31 百度在线网络技术(北京)有限公司 Age discrimination model training and intelligent voice interaction method, equipment and storage medium
CN108364526A (en) * 2018-02-28 2018-08-03 上海乐愚智能科技有限公司 A kind of music teaching method, apparatus, robot and storage medium
CN109171644A (en) * 2018-06-22 2019-01-11 平安科技(深圳)有限公司 Health control method, device, computer equipment and storage medium based on voice recognition
CN109859764A (en) * 2019-01-04 2019-06-07 四川虹美智能科技有限公司 A kind of sound control method and intelligent appliance
CN110265040B (en) * 2019-06-20 2022-05-17 Oppo广东移动通信有限公司 Voiceprint model training method and device, storage medium and electronic equipment
CN110798318B (en) * 2019-09-18 2022-06-24 深圳云知声信息技术有限公司 Equipment management method and device
CN110808052A (en) * 2019-11-12 2020-02-18 深圳市瑞讯云技术有限公司 Voice recognition method and device and electronic equipment
CN110853642B (en) * 2019-11-14 2022-03-25 广东美的制冷设备有限公司 Voice control method and device, household appliance and storage medium
CN112908312B (en) * 2021-01-30 2022-06-24 云知声智能科技股份有限公司 Method and device for improving wake-up performance
CN113539274A (en) * 2021-06-15 2021-10-22 复旦大学附属肿瘤医院 Voice processing method and device
CN113707154B (en) * 2021-09-03 2023-11-10 上海瑾盛通信科技有限公司 Model training method, device, electronic equipment and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101390155A (en) * 2006-02-21 2009-03-18 索尼电脑娱乐公司 Voice recognition with speaker adaptation and registration with pitch
CN103024530A (en) * 2012-12-18 2013-04-03 天津三星电子有限公司 Intelligent television voice response system and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100003672A (en) * 2008-07-01 2010-01-11 (주)디유넷 Speech recognition apparatus and method using visual information
CN101944359B (en) * 2010-07-23 2012-04-25 杭州网豆数字技术有限公司 Voice recognition method oriented to specific user groups
CN103236259B (en) * 2013-03-22 2016-06-29 乐金电子研发中心(上海)有限公司 Voice recognition processing and feedback system, voice replying method
CN105306815A (en) * 2015-09-30 2016-02-03 努比亚技术有限公司 Shooting mode switching device, method and mobile terminal
CN105489221B (en) * 2015-12-02 2019-06-14 北京云知声信息技术有限公司 Voice recognition method and device

Also Published As

Publication number Publication date
CN105895105A (en) 2016-08-24

Similar Documents

Publication Publication Date Title
CN105895105B (en) Voice processing method and device
Hou et al. Audio-visual speech enhancement using multimodal deep convolutional neural networks
CN106782536B (en) Voice wake-up method and device
CN105654952B (en) Electronic device, server and method for outputting voice
US11475897B2 (en) Method and apparatus for response using voice matching user category
CN106575500B (en) Method and apparatus for synthesizing speech based on facial structure
CN103943104B (en) Voice information recognition method and terminal device
CN107170456A (en) Voice processing method and device
Tran et al. Improvement to a NAM-captured whisper-to-speech system
CN104700843A (en) Method and device for identifying ages
CN106128467A (en) Voice processing method and device
CN104538043A (en) Real-time emotion reminder for calls
WO2020253128A1 (en) Voice recognition-based communication service method, apparatus, computer device, and storage medium
CN102404278A (en) Song request system based on voiceprint recognition and application method thereof
CN110148399A (en) Control method, apparatus, device and medium for a smart device
CN111312222A (en) Wake-up and voice recognition model training method and device
CN109377979B (en) Method and system for updating welcome language
WO2017108142A1 (en) Linguistic model selection for adaptive automatic speech recognition
CN112735371A (en) Method and device for generating speaker video based on text information
US20180197535A1 (en) Systems and Methods for Human Speech Training
CN112580669A (en) Training method and device for voice information
CN106653003A (en) Voice recognition method and device
CN113539274A (en) Voice processing method and device
CN113555027B (en) Voice emotion conversion method and device, computer equipment and storage medium
CN111128127A (en) Voice recognition processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 101, 1st floor, building 1, Xisanqi building materials City, Haidian District, Beijing 100096

Patentee after: Yunzhisheng Intelligent Technology Co.,Ltd.

Address before: 100191 Beijing, Huayuan Road, Haidian District No. 2 peony technology building, 5 floor, A503

Patentee before: BEIJING UNISOUND INFORMATION TECHNOLOGY Co.,Ltd.
