WO2017173721A1 - Speech recognition method and device - Google Patents

Speech recognition method and device

Info

Publication number
WO2017173721A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
correspondence
disabled person
standard
speech
Prior art date
Application number
PCT/CN2016/083516
Other languages
French (fr)
Chinese (zh)
Inventor
潘春岭
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2017173721A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Abstract

Provided are a speech recognition method and device. The method comprises: establishing correspondence relationships between pronunciations of words frequently used by a person having a speech impediment and standard pronunciations of the words (S101); and receiving a speech input from the person having a speech impediment, recognizing, according to the established correspondence relationships, a corresponding standard pronunciation, and executing an operation corresponding to the recognized standard pronunciation (S102). Establishment of correspondence relationships between pronunciations of words used by a person having a speech impediment and standard pronunciations realizes accurate recognition of the speech of said person, facilitates accurate expression of their thoughts or purposes, and causes a controlled device to correctly execute a voice command from said person, facilitating their ability to express themselves through speech and enabling them to gain confidence in daily life.

Description

Speech recognition method and device
Technical field
The present invention relates to the field of speech recognition technology, and in particular to a speech recognition method and device.
Background
With the continuous development of speech recognition technology, more and more devices (for example mobile phones, televisions, air conditioners and other household appliances) can perform functions under voice control. For example, when a controlled device detects a voice control instruction, it can perform the corresponding operation according to the detected instruction. Voice interaction therefore brings considerable convenience to users' daily lives.
In the prior art, for people from different countries or regions, a controlled device can use any of a number of speech translation systems to translate the languages of different countries or the dialects of different regions, and then perform the corresponding operation according to the translated control instruction.
However, existing technology does not serve people whose speech impairment was caused by an illness acquired later in life, for example patients with a speech impairment resulting from a stroke. They can read simple text aloud and have a strong desire to converse, yet current controlled devices cannot accurately recognize their speech for voice interaction. This hinders the patients' recovery and causes them to lose confidence in life.
Summary of the invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a speech recognition method and device that can accurately recognize the speech of a speech-impaired person, so that a controlled device can correctly carry out voice interaction.
In a first aspect, an embodiment of the present invention provides a speech recognition method, comprising:
establishing a correspondence between the speech of a speech-impaired person uttering commonly used everyday expressions and standard speech;
receiving speech from the speech-impaired person, recognizing the corresponding standard speech according to the established correspondence, and performing the operation corresponding to the recognized standard speech.
Optionally, establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech comprises:
extracting the speech of phrases or characters from the speech-impaired person's speech of commonly used everyday expressions, and establishing the correspondence with the speech of the phrases or characters in standard speech.
Optionally, after establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech, the method further comprises:
storing the established correspondence and uploading it to a cloud server for backup.
Optionally, after establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech, the method further comprises:
reviewing the correspondence between the speech-impaired person's speech and standard speech, and correcting any correspondence that the review finds to be wrong.
Optionally, after establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech, the method further comprises:
periodically counting how often each correspondence between the speech-impaired person's speech and standard speech is used, and updating the database of correspondences according to the usage frequency.
Optionally, before establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech, the method further comprises: recording the speech of the speech-impaired person reading commonly used everyday expressions aloud.
In a second aspect, an embodiment of the present invention provides a speech recognition device, comprising a speech intelligent processing module and a speech recognition module, wherein:
the speech intelligent processing module is configured to establish a correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech;
the speech recognition module is configured to receive speech from the speech-impaired person, recognize the corresponding standard speech according to the established correspondence, and perform the operation corresponding to the recognized standard speech.
Optionally, the speech intelligent processing module is specifically configured to:
extract the speech of phrases or characters from the speech-impaired person's speech of commonly used everyday expressions, and establish the correspondence with the speech of the phrases or characters in standard speech.
Optionally, the speech intelligent processing module is further configured to: store the established correspondence and upload it to a cloud server for backup.
Optionally, the speech intelligent processing module is further configured to: review the correspondence between the speech-impaired person's speech and standard speech, and correct any correspondence that the review finds to be wrong.
Optionally, the speech intelligent processing module is further configured to:
periodically count how often each correspondence between the speech-impaired person's speech and standard speech is used, and update the correspondence according to the usage frequency.
Optionally, the device further comprises a speech input module configured to record the speech of the speech-impaired person reading commonly used everyday expressions aloud.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions for performing any of the speech recognition methods described above.
By establishing a correspondence between the speech of a speech-impaired person and standard speech, embodiments of the present invention accurately recognize that person's speech, make it easier for them to express their true thoughts and intentions, and enable the controlled device to carry out voice interaction correctly. This helps patients recover their ability to express themselves in speech and builds their confidence in life.
Other features and advantages of the embodiments of the present invention are set forth in the description that follows, and in part become apparent from the description or are learned by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims and the drawings.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief description of the drawings
The drawings described here provide a further understanding of the embodiments of the present invention and form part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic flowchart of Embodiment 1 of a speech recognition method provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of Embodiment 1 of a speech recognition device provided by an embodiment of the present invention.
Preferred embodiments of the invention
To make the objectives, technical solutions and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the drawings. Provided there is no conflict, the embodiments in this application and the features in those embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
The method of the embodiments can be applied to speech-impaired persons, here meaning people whose speech impairment was caused by an illness acquired later in life, for example patients with a speech impairment resulting from a stroke. They can read simple text aloud and have a strong desire to converse, but their speech cannot be accurately recognized for voice interaction. Through a smart device equipped with the speech recognition device, for example a mobile phone, tablet computer or intelligent robot, the true intention expressed by their speech can be accurately recognized and the corresponding operation performed on their behalf, although the method is not limited to this.
The method of the embodiments aims to solve the technical problem that the prior art cannot accurately recognize the speech of a speech-impaired person and so cannot make the controlled device carry out voice interaction correctly, which leaves the person unable to express their true thoughts and intentions and hinders their recovery.
The technical solutions of the present invention are described in detail below through specific embodiments. The following specific embodiments may be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments.
FIG. 1 is a schematic flowchart of Embodiment 1 of a speech recognition method provided by an embodiment of the present invention. This embodiment concerns the specific process of accurately recognizing the speech of a speech-impaired person. As shown in FIG. 1, the method comprises:
S101. Establish a correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech.
Specifically, the pronunciations of phrases or of individual characters in the received speech of commonly used everyday expressions are separated and extracted, and a one-to-one correspondence is established between the speech-impaired person's speech and standard speech. The correspondences may form a database, but this is not limiting.
S102. Receive speech from the speech-impaired person, recognize the corresponding standard speech according to the established correspondence, and perform the operation corresponding to the recognized standard speech.
Specifically, speech from the speech-impaired person is received, separated and screened, and compared with the speech in the correspondence to recognize the corresponding standard speech. In this way the person's speech is accurately recognized and the intended voice action is performed: their thoughts and intentions can be truly expressed and played back, and the controlled device carries out voice interaction correctly, which makes communication with family members easier. The speech-impaired person can also be identified.
By establishing a correspondence between the speech of a speech-impaired person and standard speech, the speech recognition method provided by this embodiment accurately recognizes that person's speech, makes it easier for them to express their true thoughts and intentions, and enables the controlled device to carry out voice interaction correctly. This helps patients recover their ability to express themselves in speech and builds their confidence in life.
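Steps S101 and S102 can be sketched in Python as a lookup table. This is a minimal illustration under stated assumptions, not the applicant's implementation: the source does not say how speech is represented, so a plain string (`impaired_key`) stands in for whatever acoustic representation a real system would extract, and `CorrespondenceTable`, `handle_utterance` and the `operations` mapping are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class CorrespondenceTable:
    """Maps each speaker-specific pronunciation key to one standard phrase."""
    entries: Dict[str, str] = field(default_factory=dict)

    def establish(self, impaired_key: str, standard_phrase: str) -> None:
        # S101: record the correspondence (a later call overwrites an earlier one).
        self.entries[impaired_key] = standard_phrase

    def recognize(self, impaired_key: str) -> Optional[str]:
        # S102, first half: look up the standard phrase, or None if unknown.
        return self.entries.get(impaired_key)


def handle_utterance(
    table: CorrespondenceTable,
    impaired_key: str,
    operations: Dict[str, Callable[[], None]],
) -> Optional[str]:
    """S102, second half: run the operation bound to the recognized phrase."""
    standard = table.recognize(impaired_key)
    if standard is None:
        return None
    operation = operations.get(standard)
    if operation is not None:
        operation()
    return standard


# Usage with made-up data: "kai den" is an invented stand-in for an
# impaired pronunciation of "turn on the light".
table = CorrespondenceTable()
table.establish("kai den", "turn on the light")
log = []
operations = {"turn on the light": lambda: log.append("light on")}
result = handle_utterance(table, "kai den", operations)
```

A dictionary keeps each pronunciation key bound to a single standard phrase, which is one simple reading of the one-to-one correspondence described above.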
Optionally, on the basis of the above embodiment, before establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech, the method further comprises:
recording the speech of the speech-impaired person reading commonly used everyday expressions aloud.
The speech-impaired persons in this application are people whose speech impairment was caused by an illness acquired later in life, for example patients with a speech impairment resulting from a stroke; they can read simple text aloud and have a strong desire to converse. Their speech of commonly used everyday expressions is recorded. The expressions may be a text of 5,000 characters prepared in advance, or short sentences or phrases. The 5,000 characters of content are everyday expressions selected for their close relevance to the speech-impaired person's life. The text may also be selected according to the two parts of the List of Frequently Used Characters in Modern Chinese (《现代汉语常用字表》): the frequently used characters (2,500) and the less frequently used characters (1,000). Computer sampling shows that these characters cover 99.48% of the language, so selecting from them meets the communication needs of speech-impaired persons, although the selection is not limited to this.
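The coverage figure above is a ratio: the share of characters in a text that belong to a given character list. The sketch below shows one way such a ratio could be computed when choosing a reading text. It is an assumption-laden illustration: `coverage` is a hypothetical helper, the tiny character set is made up, and the source does not describe how its 99.48% figure was measured.

```python
from typing import Set


def coverage(text: str, common_chars: Set[str]) -> float:
    """Fraction of non-whitespace characters in `text` found in `common_chars`."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    covered = sum(1 for c in chars if c in common_chars)
    return covered / len(chars)


# Made-up miniature "common character" list, for illustration only.
common = set("我要喝水开灯")
full = coverage("我要开灯", common)     # every character is in the list
partial = coverage("我要吃饭", common)  # 吃 and 饭 are not in the list
```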
Recording the speech-impaired person's speech of commonly used everyday expressions in advance facilitates the subsequent construction of the database and helps the speech they utter to be recognized quickly, so that their true thoughts and intentions are expressed.
Optionally, on the basis of the above embodiment, establishing the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech in step S101 comprises:
extracting the speech of phrases or characters from the speech-impaired person's speech of commonly used everyday expressions, and establishing a one-to-one correspondence with the speech of the phrases or characters in standard speech.
Specifically, the speech-impaired person's speech is split into sentences and words, and a one-to-one correspondence is extracted between the phrases or characters in that speech and the speech of the phrases or characters in standard speech. For the splitting into sentences and words, manually set conditions may be added, for example that the interval between words lies within a few milliseconds, to ensure the accuracy of the splitting. The established one-to-one correspondences may form a database, but this is not limiting.
Extracting and splitting the phrases or characters in the speech-impaired person's speech of commonly used everyday expressions makes it easier to establish a one-to-one correspondence with the phrases or characters of standard speech and improves the accuracy of the database.
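The splitting condition mentioned above (a manually set interval between words) can be sketched as silence-gap segmentation over per-frame energies. The source gives no algorithm, so everything here is an assumption for illustration: the function name `split_on_silence`, the energy-threshold test for silence, and the parameter values in the usage lines.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    energies: Sequence[float],
    frame_ms: float,
    silence_threshold: float,
    min_gap_ms: float,
) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) index pairs of voiced segments.

    A segment ends once the silence following it has lasted at least
    `min_gap_ms`; shorter pauses stay inside the segment.
    """
    segments: List[Tuple[int, int]] = []
    start = None        # first voiced frame of the current segment
    last_voiced = None  # most recent voiced frame seen
    silent_run = 0      # consecutive silent frames since last_voiced
    for i, energy in enumerate(energies):
        if energy > silence_threshold:
            if start is None:
                start = i
            last_voiced = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run * frame_ms >= min_gap_ms:
                segments.append((start, last_voiced))
                start = None
                silent_run = 0
    if start is not None:
        segments.append((start, last_voiced))
    return segments


# Usage with made-up numbers: 10 ms frames, a 30 ms gap ends a segment.
frames = [0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
segments = split_on_silence(frames, frame_ms=10, silence_threshold=1, min_gap_ms=30)
```

Making the gap length a parameter mirrors the idea of a manually set condition: a speaker who pauses longer between words can be given a larger `min_gap_ms`.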
Optionally, on the basis of the above embodiment, after the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech is established in step S101, the method further comprises:
storing the established correspondence and uploading it to a cloud server for backup.
Specifically, the established correspondence can be stored on the device and uploaded to a cloud server for backup. For example, it can be stored on a mobile phone and uploaded from the phone to the cloud server. This makes the correspondence easy to retrieve and prevents it from being lost when the device is replaced.
Storing the established database and uploading it to a cloud server for backup is convenient for the user, who can retrieve the database anytime and anywhere.
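A minimal sketch of the store-then-back-up step follows. The source names no file format or cloud service, so this assumes JSON for local storage and takes the upload as an injected callable; `save_correspondence`, `load_correspondence`, `store_and_back_up` and the file name are hypothetical, and the "upload" in the usage lines only appends to a list.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict


def save_correspondence(entries: Dict[str, str], path: Path) -> None:
    """Write the correspondence to a local JSON file (UTF-8, human-readable)."""
    path.write_text(json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8")


def load_correspondence(path: Path) -> Dict[str, str]:
    """Read a correspondence previously written by save_correspondence."""
    return json.loads(path.read_text(encoding="utf-8"))


def store_and_back_up(
    entries: Dict[str, str],
    path: Path,
    upload: Callable[[bytes], None],
) -> None:
    """Store locally first, then hand the stored bytes to the upload callable."""
    save_correspondence(entries, path)
    upload(path.read_bytes())


# Usage with a stand-in for the cloud upload: the bytes are kept in a list.
entries = {"kai den": "turn on the light"}
uploaded = []
with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "correspondence.json"
    store_and_back_up(entries, local_file, uploaded.append)
    restored = load_correspondence(local_file)
```

Passing the upload in as a function keeps the sketch independent of any particular server API, which the source leaves unspecified.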
Optionally, on the basis of the above embodiment, after the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech is established in step S101, the method further comprises:
reviewing the correspondence between the speech-impaired person's speech and standard speech, and correcting any correspondence that the review finds to be wrong.
Specifically, the pronunciation of a speech-impaired person is abnormal but follows regular patterns: it is not arbitrary, and the manner of pronunciation is essentially fixed. The database cannot be collected successfully in a single pass and needs a process of correction and refinement, so the speech-impaired person or a family member needs to review it. The speech intelligent processing module can split and extract the speech-impaired person's speech, find the corresponding standard speech, synthesize it and play it back for checking by listening, to determine whether the correspondence is correct. If the review finds that a correspondence between the speech-impaired person's speech and standard speech is incorrect, it can be corrected to ensure the correctness of the database. For a correspondence that keeps coming out wrong, the correspondence for that phrase's speech can be established by force to complete the database.
Reviewing and correcting the correspondence ensures that the speech-impaired person's speech is correctly matched to standard speech, so that their true intention is recognized more accurately.
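The review pass can be sketched as a loop that asks a reviewer about each entry and overrides the ones judged wrong. This is an illustrative assumption, not the module's actual logic: `review_correspondence` is a hypothetical name, and the `reviewer` callable stands in for the person or family member who listens to the synthesized playback.

```python
from typing import Callable, Dict, List, Optional, Tuple


def review_correspondence(
    entries: Dict[str, str],
    reviewer: Callable[[str, str], Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Return a corrected copy of `entries` and the keys that were changed.

    `reviewer(key, standard)` returns None when the pairing is correct,
    or the phrase that should be forced in its place.
    """
    corrected = dict(entries)
    changed: List[str] = []
    for key, standard in entries.items():
        verdict = reviewer(key, standard)
        if verdict is not None and verdict != standard:
            corrected[key] = verdict  # forced correspondence for this key
            changed.append(key)
    return corrected, changed


# Usage with made-up data: the reviewer rejects one pairing.
entries = {"kai den": "turn on the light", "he sui": "open the door"}
fixes = {"he sui": "drink water"}
corrected, changed = review_correspondence(entries, lambda k, s: fixes.get(k))
```

Returning a new dictionary leaves the original untouched, so a review can be discarded if the reviewer makes a mistake.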
Optionally, on the basis of the above embodiment, after the correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech is established in step S101, the method further comprises:
periodically counting how often each correspondence between the speech-impaired person's speech and standard speech is used, and updating the correspondence according to the usage frequency.
Specifically, as the speech-impaired person's speech ability recovers, the usage frequency of the correspondences between their speech and standard speech can be counted periodically and the correspondence updated accordingly. This makes it easier for the person to rebuild their habitual speech, aids their speech rehabilitation, and helps convey the true intention of their words.
Periodically counting the usage frequency of the correspondences between the speech-impaired person's speech and standard speech, and updating the database according to that frequency, better supports the person's speech rehabilitation training and helps convey the true intention of their words.
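The source does not say what updating according to usage frequency involves, so the sketch below picks one possible policy and labels it as an assumption: entries are ranked by how often they were used in the period, and entries used fewer than `min_uses` times are set aside as candidates for review or re-recording. `update_by_frequency` and `min_uses` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, Mapping, Tuple


def update_by_frequency(
    entries: Dict[str, str],
    counts: Mapping[str, int],
    min_uses: int = 1,
) -> Tuple[Dict[str, str], List[str]]:
    """Rank entries by usage count; split off those used fewer than `min_uses` times."""
    # sorted() is stable, so entries with equal counts keep their original order.
    ranked = sorted(entries, key=lambda k: counts.get(k, 0), reverse=True)
    active = {k: entries[k] for k in ranked if counts.get(k, 0) >= min_uses}
    stale = [k for k in ranked if counts.get(k, 0) < min_uses]
    return active, stale


# Usage with made-up data: one period of recognized utterances.
entries = {"a": "A", "b": "B", "c": "C"}
counts = Counter(["b", "b", "b", "a"])
active, stale = update_by_frequency(entries, counts)
```

Other policies fit the same description, for example prompting the user to re-record rarely used entries as their pronunciation changes during recovery.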
FIG. 2 is a schematic structural diagram of Embodiment 1 of a speech recognition device provided by the present invention. As shown in FIG. 2, the speech recognition device comprises a speech intelligent processing module 10 and a speech recognition module 20;
the speech intelligent processing module 10 is configured to establish a correspondence between the speech-impaired person's speech of commonly used everyday expressions and standard speech;
the speech recognition module 20 is configured to receive speech from the speech-impaired person, recognize the corresponding standard speech according to the established correspondence, and perform the operation corresponding to that standard speech.
By establishing a correspondence between the speech of a speech-impaired person and standard speech, the speech recognition device provided by this embodiment accurately recognizes that person's speech, makes it easier for them to express their true thoughts and intentions, and enables the controlled device to carry out voice interaction correctly. This helps patients recover their ability to express themselves in speech and builds their confidence in life.
Optionally, on the basis of the above embodiment, the device further comprises a speech input module 30;
the speech input module 30 is configured to record the speech of the speech-impaired person reading commonly used everyday expressions aloud.
The device provided by this embodiment can carry out the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Optionally, on the basis of the above embodiment, the speech intelligent processing module is specifically configured to:
extract the speech of phrases or characters from the speech-impaired person's speech of commonly used everyday expressions, and establish a one-to-one correspondence with the speech of the phrases or characters in standard speech.
The device provided by this embodiment can carry out the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Optionally, on the basis of the above embodiment, the speech intelligent processing module is further configured to:
store the established correspondence database and upload it to a cloud server for backup.
Optionally, on the basis of the above embodiment, the speech intelligent processing module is further configured to:
review the correspondence between the speech-impaired person's speech and standard speech, and correct any correspondence that the review finds to be wrong.
The device provided by this embodiment can carry out the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Optionally, on the basis of the above embodiment, the speech intelligent processing module is further configured to:
periodically count how often each correspondence between the speech-impaired person's speech and standard speech is used, and update the correspondence according to the usage frequency.
The device provided by this embodiment can carry out the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
Although embodiments of the present invention are disclosed above, they are described only to facilitate understanding of the invention and are not intended to limit it. Any person skilled in the art to which the invention belongs may make any modification or change in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention is still defined by the appended claims.
工业实用性Industrial applicability
The speech recognition method and device proposed by the embodiments of the present invention comprise: establishing a correspondence between a speech-impaired person's speech for common everyday expressions and standard speech; and receiving the speech-impaired person's speech, identifying the corresponding standard speech according to the established correspondence, and performing the operation corresponding to the identified standard speech. By establishing this correspondence, the speech of speech-impaired persons can be recognized accurately, which makes it easier for them to express their intentions faithfully and lets the controlled device respond correctly to voice interaction. This in turn aids the recovery of the patients' spoken expression and builds their confidence in life.
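The overall flow summarized above (establish a correspondence, recognize an incoming utterance against it, perform the matching operation) can be sketched as follows. Everything concrete here is an assumption for illustration: utterances are reduced to small numeric feature vectors, matching is nearest neighbour by Euclidean distance with a rejection threshold, and operations are plain callables. The source does not specify the feature extraction or the matching technique.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple


class CorrespondenceRecognizer:
    """Maps a user's utterance features to a standard phrase."""

    def __init__(self, max_distance: float = 1.0) -> None:
        self.entries: List[Tuple[Tuple[float, ...], str]] = []
        self.max_distance = max_distance  # hypothetical rejection threshold

    def enroll(self, features: Sequence[float], standard_phrase: str) -> None:
        """Establish one correspondence: user's utterance -> standard phrase."""
        self.entries.append((tuple(features), standard_phrase))

    def recognize(self, features: Sequence[float]) -> Optional[str]:
        """Return the standard phrase of the closest enrolled utterance."""
        query = tuple(features)
        best_phrase: Optional[str] = None
        best_distance = math.inf
        for stored, phrase in self.entries:
            if len(stored) != len(query):
                continue  # skip entries with incompatible feature size
            distance = math.dist(stored, query)
            if distance < best_distance:
                best_phrase, best_distance = phrase, distance
        # Reject matches that are too far from anything enrolled.
        return best_phrase if best_distance <= self.max_distance else None


def handle_utterance(
    recognizer: CorrespondenceRecognizer,
    features: Sequence[float],
    actions: Dict[str, Callable[[], str]],
) -> Optional[str]:
    """Recognize the utterance and run the operation tied to its phrase."""
    phrase = recognizer.recognize(features)
    if phrase is None or phrase not in actions:
        return None
    return actions[phrase]()
```

In use, each everyday expression read aloud by the user would be enrolled once, and `handle_utterance` would then be called for every new utterance, with `actions` mapping standard phrases to commands for the controlled device.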

Claims (13)

  1. A speech recognition method, comprising:
    establishing a correspondence between a speech-impaired person's speech for common everyday expressions and standard speech; and
    receiving the speech of the speech-impaired person, identifying the corresponding standard speech according to the established correspondence, and performing the operation corresponding to the identified standard speech.
  2. The speech recognition method according to claim 1, wherein establishing the correspondence between the speech-impaired person's speech for common everyday expressions and standard speech comprises:
    extracting the speech of phrases or characters from the speech-impaired person's speech for common everyday expressions, and establishing the correspondence with the speech of the phrases or characters in the standard speech.
  3. The speech recognition method according to claim 1 or 2, further comprising, after establishing the correspondence between the speech-impaired person's speech for common everyday expressions and standard speech:
    storing the established correspondence and uploading it to a cloud server for backup.
  4. The speech recognition method according to claim 1 or 2, further comprising, after establishing the correspondence between the speech-impaired person's speech for common everyday expressions and standard speech:
    reviewing the correspondence between the speech-impaired person's speech and the standard speech, and correcting any correspondence found to be wrong during the review.
  5. The speech recognition method according to claim 1 or 2, further comprising, after establishing the correspondence between the speech-impaired person's speech for common everyday expressions and standard speech:
    periodically counting how often each correspondence between the speech-impaired person's speech and the standard speech is used, and updating the correspondence according to the usage frequency.
  6. The speech recognition method according to claim 1, further comprising, before establishing the correspondence between the speech-impaired person's speech for common everyday expressions and standard speech: recording the speech of the speech-impaired person reading common everyday expressions aloud.
  7. A speech recognition device, comprising a voice intelligent processing module and a voice recognition module, wherein:
    the voice intelligent processing module is configured to establish a correspondence between a speech-impaired person's speech for common everyday expressions and standard speech; and
    the voice recognition module is configured to receive the speech of the speech-impaired person, identify the corresponding standard speech according to the established correspondence, and perform the operation corresponding to the identified standard speech.
  8. The speech recognition device according to claim 7, wherein the voice intelligent processing module is specifically configured to:
    extract the speech of phrases or characters from the speech-impaired person's speech for common everyday expressions, and establish the correspondence with the speech of the phrases or characters in the standard speech.
  9. The speech recognition device according to claim 7 or 8, wherein the voice intelligent processing module is further configured to:
    store the established correspondence and upload it to a cloud server for backup.
  10. The speech recognition device according to claim 7 or 8, wherein the voice intelligent processing module is further configured to:
    review the correspondence between the speech-impaired person's speech and the standard speech, and correct any correspondence found to be wrong during the review.
  11. The speech recognition device according to claim 7 or 8, wherein the voice intelligent processing module is further configured to:
    periodically count how often each correspondence between the speech-impaired person's speech and the standard speech is used, and update the database according to the usage frequency.
  12. The speech recognition device according to claim 7, further comprising: a voice entry module configured to record the speech of the speech-impaired person reading common everyday expressions aloud.
  13. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the speech recognition method according to any one of claims 1 to 6.
PCT/CN2016/083516 2016-04-06 2016-05-26 Speech recognition method and device WO2017173721A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610211607.3 2016-04-06
CN201610211607.3A CN107274886B (en) 2016-04-06 2016-04-06 Voice recognition method and device

Publications (1)

Publication Number Publication Date
WO2017173721A1 (en) 2017-10-12

Family

ID=60000784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/083516 WO2017173721A1 (en) 2016-04-06 2016-05-26 Speech recognition method and device

Country Status (2)

Country Link
CN (1) CN107274886B (en)
WO (1) WO2017173721A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108174030B (en) * 2017-12-26 2020-11-17 努比亚技术有限公司 Customized voice control implementation method, mobile terminal and readable storage medium
CN108089836A (en) * 2017-12-29 2018-05-29 上海与德科技有限公司 A kind of assisted learning method and robot based on robot
CN108447473A (en) * 2018-03-06 2018-08-24 深圳市沃特沃德股份有限公司 Voice translation method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1760976A (en) * 2005-11-08 2006-04-19 高丹 Hand held type auxiliary apparatus for language barrier
CN101281745A (en) * 2008-05-23 2008-10-08 深圳市北科瑞声科技有限公司 Interactive system for vehicle-mounted voice
CN101464729A (en) * 2009-01-05 2009-06-24 清华大学 Independent desire expression method based on auditory sense cognition neural signal
CN101599270A (en) * 2008-06-02 2009-12-09 海尔集团公司 Voice server and voice control method
CN102036033A (en) * 2010-12-31 2011-04-27 Tcl集团股份有限公司 Method for remotely controlling television with voice and remote voice control
CN102074234A (en) * 2009-11-19 2011-05-25 财团法人资讯工业策进会 Voice variation model building device and method as well as voice recognition system and method
CN103236261A (en) * 2013-04-02 2013-08-07 四川长虹电器股份有限公司 Speaker-dependent voice recognizing method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1901041B (en) * 2005-07-22 2011-08-31 康佳集团股份有限公司 Voice dictionary forming method and voice identifying system and its method
EP2553679A2 (en) * 2010-03-30 2013-02-06 NVOQ Incorporated Hierarchical quick note to allow dictated code phrases to be transcribed to standard clauses
US9443507B2 (en) * 2013-07-15 2016-09-13 GM Global Technology Operations LLC System and method for controlling a speech recognition system
CN104992707A (en) * 2015-05-19 2015-10-21 四川大学 Cleft palate voice glottal stop automatic identification algorithm and device

Also Published As

Publication number Publication date
CN107274886B (en) 2021-10-15
CN107274886A (en) 2017-10-20

Similar Documents

Publication Publication Date Title
Shillingford et al. Large-scale visual speech recognition
US11514891B2 (en) Named entity recognition method, named entity recognition equipment and medium
EP3179475A1 (en) Voice wakeup method, apparatus and system
JP2017058674A (en) Apparatus and method for speech recognition, apparatus and method for training transformation parameter, computer program and electronic apparatus
CN113327609B (en) Method and apparatus for speech recognition
US20030130847A1 (en) Method of training a computer system via human voice input
WO2017166966A9 (en) Method and apparatus for constructing speech decoding network in digital speech recognition, and storage medium
EP2940684A1 (en) Voice recognizing method and system for personalized user information
US20150081270A1 (en) Speech translation apparatus, speech translation method, and non-transitory computer readable medium thereof
CN110648690A (en) Audio evaluation method and server
WO2019218467A1 (en) Method and apparatus for dialect recognition in voice and video calls, terminal device, and medium
CN109461436A (en) A kind of correcting method and system of speech recognition pronunciation mistake
US10366173B2 (en) Device and method of simultaneous interpretation based on real-time extraction of interpretation unit
CN105261246A (en) Spoken English error correcting system based on big data mining technology
CN112818680B (en) Corpus processing method and device, electronic equipment and computer readable storage medium
WO2017173721A1 (en) Speech recognition method and device
US10395645B2 (en) Method, apparatus, and computer-readable recording medium for improving at least one semantic unit set
CN108595406B (en) User state reminding method and device, electronic equipment and storage medium
Mohammed et al. Quranic verses verification using speech recognition techniques
CN105869622B (en) Chinese hot word detection method and device
CN109686365B (en) Voice recognition method and voice recognition system
CN107886940B (en) Voice translation processing method and device
CN109074809B (en) Information processing apparatus, information processing method, and computer-readable storage medium
KR102596521B1 (en) Method and system for analyzing language development disorder and behavior development disorder by processing video information input to the camera and audio information input to the microphone in real time
CN113870857A (en) Voice control scene method and voice control scene system

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16897654

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16897654

Country of ref document: EP

Kind code of ref document: A1