CN102237088B - Device and method for acquiring speech recognition multi-information text - Google Patents

Device and method for acquiring speech recognition multi-information text

Info

Publication number
CN102237088B
CN102237088B
Authority
CN
China
Prior art keywords
information
text
speech recognition
intensity
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2011101651010A
Other languages
Chinese (zh)
Other versions
CN102237088A (en)
Inventor
张峰
黄伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI GEAK ELECTRONICS Co.,Ltd.
Original Assignee
Shengle Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shengle Information Technology Shanghai Co Ltd
Priority to CN2011101651010A
Publication of CN102237088A
Application granted
Publication of CN102237088B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a device and a method for acquiring a speech recognition multi-information text. After speech audio is converted into plain text by speech recognition, the individual character pronunciation speed, pronunciation intensity, and intonation in the speech audio are integrated, in a given presentation style, into the initially generated plain text to produce multi-information text. The device and method can be widely used on information publishing platforms such as microblogs, short messages, and signature files.

Description

Speech recognition multi-information text acquisition device and method
Technical field
The present invention relates to speech recognition technology in the computer field, and in particular to a speech recognition multi-information text acquisition device and method.
Background technology
Over the past two decades, speech recognition technology has made marked progress and has been applied more and more widely. In the coming ten years, speech recognition technology is expected to enter fields such as industry, household appliances, communications, automotive electronics, medical care, home services, and consumer electronics.
Speech recognition refers to the automatic understanding of human speech by a computer or machine. For example, with speech recognition a computer or machine can act on a person's voice commands, or convert a person's speech into text. The main approach in speech recognition is to extract physical features of the uttered speech, such as its spectrum, and compare them with pre-stored physical feature models of vowels, consonants, or words, finally obtaining textual information that matches the content of the speech. In the prior art, however, the text obtained by speech recognition is usually only plain text. Plain text here refers to text with a uniform character format and size and no special symbols other than punctuation; every mention of plain text in this specification carries this meaning. As a result, much valuable information in the speech, such as the speaker's speech rate, stress, and intonation, cannot be expressed in the plain text produced by speech recognition.
Summary of the invention
The technical problem to be solved by the present invention is to provide a speech recognition multi-information text acquisition device and method, so as to address the problem in the prior art that the text obtained by speech recognition is usually only plain text, and that much valuable information in the speech cannot be expressed in the recognized text.
To solve the above technical problem, the present invention provides a speech recognition multi-information text acquisition device, comprising:
a plain text and individual character time generation module, configured to convert speech audio into plain text by speech recognition, obtain the individual character time in the speech audio at the same time, and determine the individual character speech rate from the length of the individual character time;
a multi-information text generation module, configured to generate multi-information text from the plain text.
Optionally, the device further comprises an individual character intensity calculation module, configured to calculate the individual character pronunciation intensity from the individual character time.
Optionally, the multi-information text generation module is configured to integrate the individual character speech rate and/or the individual character pronunciation intensity into the plain text to generate the multi-information text.
Optionally, the device further comprises an individual character intonation calculation module, configured to calculate the individual character intonation from the individual character time.
Optionally, the multi-information text generation module is configured to integrate the individual character speech rate and/or pronunciation intensity and/or intonation into the plain text to generate the multi-information text.
The present invention also provides a speech recognition multi-information text acquisition method, comprising the following steps:
Step 1: converting speech audio into plain text by speech recognition, obtaining the individual character time in the speech audio at the same time, and then determining the individual character speech rate from the length of the individual character time;
Step 2: generating multi-information text from the plain text.
Optionally, in step 2, the individual character speech rate is integrated into the plain text to generate the multi-information text.
Optionally, between step 1 and step 2, the method further comprises calculating the individual character pronunciation intensity and/or intonation from the individual character time.
Optionally, in step 2, the individual character speech rate and/or pronunciation intensity and/or intonation are integrated into the plain text to generate the multi-information text.
Optionally, the individual character intonation is calculated from the individual character time by a fundamental frequency extraction technique.
Optionally, the individual character pronunciation intensity is obtained by calculating the average pronunciation intensity within the individual character time.
With the speech recognition multi-information text acquisition device and method of the present invention, after speech audio is converted into plain text by speech recognition, the individual character speech rate, pronunciation intensity, and intonation in the speech audio are further integrated, in a given presentation style, into the initially generated plain text to produce multi-information text. The device and method can be widely used on information publishing platforms such as microblogs, short messages, and signature files.
Description of drawings
Fig. 1 is a schematic structural diagram of one embodiment of the speech recognition multi-information text acquisition device of the present invention;
Fig. 2 is a schematic structural diagram of another embodiment of the speech recognition multi-information text acquisition device of the present invention;
Fig. 3 is a schematic flow chart of one embodiment of the speech recognition multi-information text acquisition method of the present invention;
Fig. 4 is a schematic flow chart of another embodiment of the speech recognition multi-information text acquisition method of the present invention;
Fig. 5 is a schematic diagram of one kind of multi-information text according to the present invention;
Fig. 6 is a schematic diagram of another kind of multi-information text according to the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention clearer, specific embodiments of the present invention are described in detail below.
The multi-information text representation of the present invention can be realized in many alternative ways; the following description uses preferred embodiments by way of illustration. The present invention is of course not limited to these specific embodiments, and common substitutions known to those of ordinary skill in the art are undoubtedly covered by the protection scope of the present invention.
The present invention provides a speech recognition multi-information text acquisition device.
Embodiment one
Referring to Fig. 1, which is a schematic structural diagram of one embodiment of the speech recognition multi-information text acquisition device of the present invention. As shown in Fig. 1, the device comprises:
A plain text and individual character time generation module, configured to convert speech audio into plain text by speech recognition and, at the same time, obtain the individual character time in the speech audio, i.e. the start time and end time of each character, and then determine the individual character speech rate from the length of the individual character time. The individual character time is obtained automatically during the speech recognition process, when the speech audio is converted into plain text.
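The individual character times can be kept as simple (character, start time, end time) records, from which the speech rate follows directly from each character's duration. The following is a minimal sketch in Python, assuming such an alignment is already available from the recognizer; the CharTiming type and the example timings are illustrative assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class CharTiming:
    char: str        # the recognized character
    start: float     # start time in seconds
    end: float       # end time in seconds

    @property
    def duration(self) -> float:
        # length of the individual character time
        return self.end - self.start

def speech_rate(t: CharTiming) -> float:
    """Characters per second for this character; lower values mean slower speech."""
    return 1.0 / t.duration if t.duration > 0 else float("inf")

# Hypothetical alignment for two characters of an utterance
timings = [CharTiming("S", 0.00, 0.40), CharTiming("o", 0.40, 0.55)]
for t in timings:
    print(t.char, round(t.duration, 2), "s ->", round(speech_rate(t), 1), "chars/s")
```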
A multi-information text generation module, configured to integrate the individual character speech rate into the plain text to generate the multi-information text.
According to the obtained individual character speech rate, the speech rate is represented by changing the character spacing or character width in the plain text, by adding symbols, or by a combination of these methods; a minimal code sketch of this mapping follows the examples below.
For example, the plain text generated by the speech recognition module is: So cool, I won a mobile phone in the lucky draw.
Representing the speech rate by changing the character spacing of the plain text yields multi-information text in which slowly pronounced characters are given wider spacing.
Representing the speech rate by changing the character width of the plain text yields multi-information text in which slowly pronounced characters are rendered wider.
Representing the speech rate by adding symbols to the plain text yields multi-information text such as: So~~ cool, I won~ a mobile phone~~ in the lucky draw.
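As noted above, the mapping from per-character duration to wider spacing or inserted symbols can be sketched as follows. This builds on the CharTiming records from the earlier sketch; the 0.3 s threshold is an arbitrary example value, not a figure taken from the patent:

```python
def render_rate_with_spacing(timings, slow_threshold=0.3):
    """Insert extra spaces after slowly pronounced characters."""
    out = []
    for t in timings:
        out.append(t.char)
        if t.duration > slow_threshold:
            out.append("  ")  # wider spacing marks a slow character
    return "".join(out)

def render_rate_with_symbols(timings, slow_threshold=0.3):
    """Append tildes after slowly pronounced characters, e.g. 'So~~ cool'."""
    out = []
    for t in timings:
        out.append(t.char)
        if t.duration > slow_threshold:
            out.append("~~")
    return "".join(out)
```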
Embodiment two
Referring to Fig. 2, which is a schematic structural diagram of another embodiment of the speech recognition multi-information text acquisition device of the present invention. As shown in Fig. 2, the device comprises:
A plain text and individual character time generation module, configured to convert speech audio into plain text by speech recognition and, at the same time, obtain the individual character time in the speech audio, i.e. the start time and end time of each character, and then determine the individual character speech rate from the length of the individual character time. The individual character time is obtained automatically during the speech recognition process, when the speech audio is converted into plain text.
An individual character intensity calculation module, configured to calculate the individual character pronunciation intensity from the obtained individual character time. Using the obtained individual character time, the average pronunciation intensity within that character's time period is calculated, giving the pronunciation intensity of each character.
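A minimal sketch of this computation, assuming the audio is available as a NumPy array of samples with a known sample rate; the function name and the use of mean absolute amplitude as the intensity measure are illustrative assumptions:

```python
import numpy as np

def char_intensity(samples: np.ndarray, sample_rate: int,
                   start: float, end: float) -> float:
    """Average pronunciation intensity within one character's time window."""
    i0 = int(start * sample_rate)
    i1 = int(end * sample_rate)
    segment = samples[i0:i1]
    if segment.size == 0:
        return 0.0
    # mean absolute amplitude over the character's time period
    return float(np.mean(np.abs(segment)))
```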
An individual character intonation calculation module, configured to calculate the individual character intonation from the obtained individual character time. The individual character intonation is obtained by a fundamental frequency extraction technique, where the fundamental frequency refers to the vibration frequency of the vocal cords when voiced sounds are produced. Various fundamental frequency extraction algorithms exist in the prior art, chiefly time-domain autocorrelation methods and frequency-domain cepstrum methods.
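The time-domain autocorrelation method mentioned above can be sketched roughly as follows. This is a simplified illustration only: practical pitch trackers add windowing, voicing decisions, and smoothing, and the 50–400 Hz search range is an assumption typical for speech rather than a value given in the patent:

```python
import numpy as np

def estimate_f0(segment: np.ndarray, sample_rate: int,
                fmin: float = 50.0, fmax: float = 400.0) -> float:
    """Estimate the fundamental frequency of one character's segment by autocorrelation."""
    segment = segment - np.mean(segment)
    # autocorrelation for non-negative lags
    corr = np.correlate(segment, segment, mode="full")[len(segment) - 1:]
    lag_min = int(sample_rate / fmax)                 # shortest period considered
    lag_max = min(int(sample_rate / fmin), len(corr) - 1)
    if lag_max <= lag_min:
        return 0.0
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max]))
    return sample_rate / best_lag
```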
A multi-information text generation module, configured to integrate the individual character speech rate and/or pronunciation intensity and/or intonation into the plain text to generate the multi-information text. The multi-information text is text whose presentation conveys the pronunciation speed and/or intonation and/or intensity.
1) According to the obtained individual character speech rate, the speech rate is represented by changing the character spacing or character width in the plain text, by adding symbols, or by a combination of these methods.
For example, the plain text generated by the speech recognition module is: So cool, I won a mobile phone in the lucky draw.
Representing the speech rate by changing the character spacing of the plain text yields multi-information text in which slowly pronounced characters are given wider spacing.
Representing the speech rate by changing the character width of the plain text yields multi-information text in which slowly pronounced characters are rendered wider.
Representing the speech rate by adding symbols to the plain text yields multi-information text such as: So~~ cool, I won~ a mobile phone~~ in the lucky draw.
2) According to the obtained individual character pronunciation intensity, the intensity is represented by changing the character size, the text color, or the font weight in the plain text, or by a combination of these methods; a sketch of one such mapping follows the examples below.
For example, the plain text obtained after processing by the speech recognition module is: So cool, I won a mobile phone in the lucky draw.
Representing the pronunciation intensity by changing the character size of the plain text yields multi-information text in which strongly pronounced characters are rendered larger.
Representing the pronunciation intensity by changing the text color of the plain text yields multi-information text in which each character is rendered in a color corresponding to its intensity, for example: So (red) cool (blue), I won (brown) a mobile phone (red) in the lucky draw.
Representing the pronunciation intensity by changing the font weight of the plain text yields multi-information text in which strongly pronounced characters are rendered in bold.
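One way to sketch the mapping from per-character intensity to character size and font weight is with HTML-style markup. This is only a hedged illustration: the patent does not prescribe HTML, and the intensity thresholds below are arbitrary example values:

```python
def render_intensity_html(chars, intensities, loud=0.5, very_loud=0.8):
    """Render louder characters larger and bold using simple HTML-style tags."""
    out = []
    for ch, level in zip(chars, intensities):
        if level >= very_loud:
            out.append(f'<b><span style="font-size:150%">{ch}</span></b>')
        elif level >= loud:
            out.append(f'<span style="font-size:120%">{ch}</span>')
        else:
            out.append(ch)
    return "".join(out)

# Example: 'So cool' with the second word pronounced strongly
print(render_intensity_html(list("So cool"), [0.2, 0.2, 0.1, 0.9, 0.9, 0.9, 0.9]))
```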
3) According to the obtained individual character intonation, a curve is added above or below each character in the plain text to represent the pronunciation intonation.
For example, the plain text obtained after processing by the speech recognition module is: So cool, I won a mobile phone in the lucky draw.
Adding a curve representing the pronunciation intonation above or below the characters of the plain text yields the multi-information text shown in Fig. 5.
4) Using methods 1) to 3) simultaneously, the individual character speech rate, pronunciation intensity, and intonation are all integrated into the plain text to generate the multi-information text.
For example, the plain text obtained after processing by the speech recognition module is: So cool, I won a mobile phone in the lucky draw.
The finally generated multi-information text is shown in Fig. 6.
The present invention also provides a speech recognition multi-information text acquisition method.
Embodiment three
Referring to Fig. 3, which is a schematic flow chart of one embodiment of the speech recognition multi-information text acquisition method of the present invention. As shown in Fig. 3, the method comprises the following steps:
Step 1: convert speech audio into plain text by speech recognition and, at the same time, obtain the individual character time in the speech audio, i.e. the start time and end time of each character, and then determine the individual character speech rate from the length of the individual character time. The individual character time is obtained automatically during the speech recognition process, when the speech audio is converted into plain text.
Step 2: integrate the individual character speech rate into the plain text to generate the multi-information text.
Embodiment four
Referring to Fig. 4, which is a schematic flow chart of another embodiment of the speech recognition multi-information text acquisition method of the present invention. As shown in Fig. 4, the method comprises the following steps:
Step 1: convert speech audio into plain text by speech recognition and, at the same time, obtain the individual character time in the speech audio, i.e. the start time and end time of each character, and then determine the individual character speech rate from the length of the individual character time. The individual character time is obtained automatically during the speech recognition process, when the speech audio is converted into plain text.
Step 2: calculate the individual character pronunciation intensity and/or intonation from the obtained individual character time.
When calculating the individual character pronunciation intensity, the obtained individual character time is used to compute the average pronunciation intensity within that character's time period, giving the pronunciation intensity of each character.
The individual character intonation is calculated by a fundamental frequency extraction technique.
Step 3: integrate the individual character speech rate and/or pronunciation intensity and/or intonation into the plain text to generate the multi-information text.
With the speech recognition multi-information text acquisition device and method of the present invention, after speech audio is converted into plain text by speech recognition, the individual character speech rate, pronunciation intensity, and intonation in the speech audio are further integrated, in a given presentation style, into the initially generated plain text to produce multi-information text. The device and method can be widely used on information publishing platforms such as microblogs, short messages, and signature files.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (7)

1. A speech recognition multi-information text acquisition device, characterized in that it comprises:
a plain text and individual character time generation module, configured to convert speech audio into plain text by speech recognition, obtain the individual character time in the speech audio at the same time, and determine the individual character speech rate from the length of the individual character time;
a multi-information text generation module, configured to generate multi-information text from the plain text, namely to integrate the individual character speech rate and/or pronunciation intensity and/or intonation into the plain text to generate the multi-information text;
an individual character intonation calculation module, configured to calculate the individual character intonation from the individual character time.
2. The speech recognition multi-information text acquisition device of claim 1, characterized in that it further comprises an individual character intensity calculation module, configured to calculate the individual character pronunciation intensity from the individual character time.
3. The speech recognition multi-information text acquisition device of claim 2, characterized in that the multi-information text generation module is configured to integrate the individual character speech rate and/or pronunciation intensity into the plain text to generate the multi-information text.
4. A speech recognition multi-information text acquisition method, characterized in that it comprises the following steps:
step 1: converting speech audio into plain text by speech recognition, obtaining the individual character time in the speech audio at the same time, and then determining the individual character speech rate from the length of the individual character time;
step 2: generating multi-information text from the plain text;
wherein, between step 1 and step 2, the method further comprises calculating the individual character pronunciation intensity and/or intonation from the individual character time;
and in step 2, the individual character speech rate and/or pronunciation intensity and/or intonation are integrated into the plain text to generate the multi-information text.
5. The speech recognition multi-information text acquisition method of claim 4, characterized in that, in step 2, the individual character speech rate is integrated into the plain text to generate the multi-information text.
6. The speech recognition multi-information text acquisition method of claim 4, characterized in that the individual character intonation is calculated from the individual character time by a fundamental frequency extraction technique.
7. The speech recognition multi-information text acquisition method of claim 4, characterized in that the individual character pronunciation intensity is obtained by calculating the average pronunciation intensity within the individual character time.
CN2011101651010A 2011-06-17 2011-06-17 Device and method for acquiring speech recognition multi-information text Active CN102237088B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101651010A CN102237088B (en) 2011-06-17 2011-06-17 Device and method for acquiring speech recognition multi-information text

Publications (2)

Publication Number Publication Date
CN102237088A CN102237088A (en) 2011-11-09
CN102237088B true CN102237088B (en) 2013-10-23

Family

ID=44887675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011101651010A Active CN102237088B (en) 2011-06-17 2011-06-17 Device and method for acquiring speech recognition multi-information text

Country Status (1)

Country Link
CN (1) CN102237088B (en)

Families Citing this family (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
CN101923734B (en) * 2010-07-15 2012-07-04 严皓 Highway vehicle traveling path recognition system based on mobile network and realization method thereof
US10417037B2 (en) 2012-05-15 2019-09-17 Apple Inc. Systems and methods for integrating third party services with a digital assistant
TWI484475B (en) * 2012-06-05 2015-05-11 Quanta Comp Inc Method for displaying words, voice-to-text device and computer program product
JP2016508007A (en) 2013-02-07 2016-03-10 アップル インコーポレイテッド Voice trigger for digital assistant
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US10748529B1 (en) 2013-03-15 2020-08-18 Apple Inc. Voice activated device for use with a voice-based digital assistant
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
CN103310273A (en) * 2013-06-26 2013-09-18 南京邮电大学 Method for articulating Chinese vowels with tones and based on DIVA model
CN104518951B (en) * 2013-09-29 2017-04-05 腾讯科技(深圳)有限公司 A kind of method and device for replying social networking application information
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US10186282B2 (en) * 2014-06-19 2019-01-22 Apple Inc. Robust end-pointing of speech signals using speaker recognition
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US10460227B2 (en) 2015-05-15 2019-10-29 Apple Inc. Virtual assistant in a communication session
US10200824B2 (en) 2015-05-27 2019-02-05 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device
US20160378747A1 (en) 2015-06-29 2016-12-29 Apple Inc. Virtual assistant for media playback
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10740384B2 (en) 2015-09-08 2020-08-11 Apple Inc. Intelligent automated assistant for media search and playback
US10331312B2 (en) 2015-09-08 2019-06-25 Apple Inc. Intelligent automated assistant in a media environment
CN105353957A (en) * 2015-10-28 2016-02-24 深圳市金立通信设备有限公司 Information display method and terminal
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10956666B2 (en) 2015-11-09 2021-03-23 Apple Inc. Unconventional virtual assistant interactions
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
DK201770383A1 (en) 2017-05-09 2018-12-14 Apple Inc. User interface for correcting recognition errors
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. USER-SPECIFIC Acoustic Models
DK201770429A1 (en) 2017-05-12 2018-12-14 Apple Inc. Low-latency intelligent automated assistant
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT
US20180336892A1 (en) 2017-05-16 2018-11-22 Apple Inc. Detecting a trigger of a digital assistant
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
CN108133706B (en) * 2017-12-21 2020-10-27 深圳市沃特沃德股份有限公司 Semantic recognition method and device
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
DK180639B1 (en) 2018-06-01 2021-11-04 Apple Inc DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT
DK179822B1 (en) 2018-06-01 2019-07-12 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
CN110830852B (en) * 2018-08-07 2022-08-12 阿里巴巴(中国)有限公司 Video content processing method and device
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
DK201970509A1 (en) 2019-05-06 2021-01-15 Apple Inc Spoken notifications
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
DK201970511A1 (en) 2019-05-31 2021-02-15 Apple Inc Voice identification in digital assistant systems
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
DK180129B1 (en) 2019-05-31 2020-06-02 Apple Inc. User activity shortcut suggestions
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11183193B1 (en) 2020-05-11 2021-11-23 Apple Inc. Digital assistant hardware abstraction
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
CN111611208A (en) * 2020-05-27 2020-09-01 北京太极华保科技股份有限公司 File storage and query method and device and storage medium
CN112530213B (en) * 2020-12-25 2022-06-03 方湘 Chinese tone learning method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101462932B1 (en) * 2008-05-28 2014-12-04 엘지전자 주식회사 Mobile terminal and text correction method
CN101727900A (en) * 2009-11-24 2010-06-09 北京中星微电子有限公司 Method and equipment for detecting user pronunciation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1336634A (en) * 2000-07-28 2002-02-20 国际商业机器公司 Method and device for recognizing acoustic language according to base sound information
US7155391B2 (en) * 2000-07-31 2006-12-26 Micron Technology, Inc. Systems and methods for speech recognition and separate dialect identification
JP2004212665A (en) * 2002-12-27 2004-07-29 Toshiba Corp Apparatus and method for varying speaking speed
JP2011014021A (en) * 2009-07-03 2011-01-20 Nippon Hoso Kyokai <Nhk> Character information presentation control device and program
CN101777347A (en) * 2009-12-07 2010-07-14 中国科学院自动化研究所 Model complementary Chinese accent identification method and system

Also Published As

Publication number Publication date
CN102237088A (en) 2011-11-09

Similar Documents

Publication Publication Date Title
CN102237088B (en) Device and method for acquiring speech recognition multi-information text
WO2022052481A1 (en) Artificial intelligence-based vr interaction method, apparatus, computer device, and medium
CN105096940B (en) Method and apparatus for carrying out speech recognition
KR20140121580A (en) Apparatus and method for automatic translation and interpretation
CN104811559B (en) Noise-reduction method, communication means and mobile terminal
US20180068662A1 (en) Generation of text from an audio speech signal
EP1557821A3 (en) Segmental tonal modeling for tonal languages
EP2851895A3 (en) Speech recognition using variable-length context
CN108766441A (en) A kind of sound control method and device based on offline Application on Voiceprint Recognition and speech recognition
ATE514162T1 (en) DYNAMIC CONTEXT GENERATION FOR LANGUAGE RECOGNITION
CN105210147B (en) Method, apparatus and computer-readable recording medium for improving at least one semantic unit set
Yağanoğlu Real time wearable speech recognition system for deaf persons
WO2004100126A3 (en) Method for statistical language modeling in speech recognition
CN112309365A (en) Training method and device of speech synthesis model, storage medium and electronic equipment
JP2010157081A (en) Response generation device and program
KR102607373B1 (en) Apparatus and method for recognizing emotion in speech
CN105765654A (en) Hearing assistance device with fundamental frequency modification
CN106653002A (en) Literal live broadcasting method and platform
EP1280137A1 (en) Method for speaker identification
CN102411929A (en) Voiceprint authentication system and implementation method thereof
CN113327586A (en) Voice recognition method and device, electronic equipment and storage medium
CN104361787A (en) System and method for converting signals
CN104200807B (en) A kind of ERP sound control methods
CN112489634A (en) Language acoustic model training method and device, electronic equipment and computer medium
CN108831503B (en) Spoken language evaluation method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: SHANGHAI GUOKE ELECTRONIC CO., LTD.

Free format text: FORMER OWNER: SHENGYUE INFORMATION TECHNOLOGY (SHANGHAI) CO., LTD.

Effective date: 20140210

TR01 Transfer of patent right

Effective date of registration: 20140210

Address after: 201203 Shanghai Guo Shou Jing Road, Zhangjiang hi tech Park No. 356 building 3 room 127

Patentee after: Shanghai Guoke Electronic Co., Ltd.

Address before: 201203 Shanghai City, Pudong New Area Shanghai City, Guo Shou Jing Road, Zhangjiang hi tech Park No. 356 building 3 Room 102

Patentee before: Shengle Information Technology (Shanghai) Co., Ltd.

CP03 Change of name, title or address

Address after: Room 127, building 3, 356 GuoShouJing Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 200120

Patentee after: SHANGHAI GEAK ELECTRONICS Co.,Ltd.

Address before: Room 127, building 3, 356 GuoShouJing Road, Zhangjiang hi tech park, Shanghai, 201203

Patentee before: Shanghai Nutshell Electronics Co.,Ltd.