CN109360565A - Method for improving speech recognition accuracy by establishing a resource library - Google Patents

Method for improving speech recognition accuracy by establishing a resource library

Info

Publication number
CN109360565A
CN109360565A CN201811508983.4A CN201811508983A CN109360565A CN 109360565 A CN109360565 A CN 109360565A CN 201811508983 A CN201811508983 A CN 201811508983A CN 109360565 A CN109360565 A CN 109360565A
Authority
CN
China
Prior art keywords
scene
input
voice
voice information
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811508983.4A
Other languages
Chinese (zh)
Inventor
杨铭
许斌锋
戚群朗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Electric Power Information Technology Co Ltd
Original Assignee
Jiangsu Electric Power Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Electric Power Information Technology Co Ltd
Priority to CN201811508983.4A
Publication of CN109360565A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Abstract

The invention discloses a method for improving speech recognition accuracy by establishing a resource library. Dedicated recognition resources corresponding to custom voice scenes are configured, together with a general recognition resource corresponding to the general voice scene. A speech recognition library containing the dedicated recognition resources and the general recognition resource is then established, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. The invention performs speech recognition with the recognition resource that corresponds to the voice input scene, improving recognition accuracy and processing efficiency.

Description

Method for improving speech recognition accuracy by establishing a resource library
Technical field
The application belongs to the technical field of speech recognition, and more particularly relates to a method for improving speech recognition accuracy by establishing a resource library.
Background art
With the development of the mobile internet, large-screen mobile phones have become mainstream, and both keyboard and handwriting input have various limitations. Voice input methods are expected to become the mainstream input method and to gain wider favor. Because voice input is more natural and has a lower learning cost, it is gradually being accepted by more users. Children and elderly people alike can learn it quickly and become accustomed to this input mode.
Existing speech recognition technology uses a large amount of everyday-scene data for training in order to recognize speech input in different scenes. As a result, recognition accuracy is too low for some custom scenes, and speech in some custom scenes cannot be recognized at all, which wastes processing resources and reduces processing efficiency.
Summary of the invention
To overcome the problems of the prior art, the object of the present invention is to provide a method for improving speech recognition accuracy by establishing a resource library, which performs speech recognition with the recognition resource corresponding to the voice input scene and thereby improves recognition accuracy and processing efficiency.
The object of the present invention is achieved through the following technical solution:
A method for improving speech recognition accuracy by establishing a resource library, characterized by comprising the following steps:
Step 101: configure dedicated recognition resources corresponding to custom voice scenes, and a general recognition resource corresponding to the general voice scene;
Step 102: establish a speech recognition library containing the dedicated recognition resources and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
Step 101 is specifically as follows: when a user needs to make a voice input, the user inputs voice information through the human-machine voice input interface; the voice information input by the user is then recognized, and corresponding processing is performed based on the recognition result. Different voice input applications process the recognition result differently;
For a voice search application, after the voice information input by the user is recognized, search results are fed back to the user according to the recognition result; alternatively, for an instant messaging application, after the voice information input by the user is recognized, the recognition result is converted into text and displayed in the input box;
For voice information input in different scenes, dedicated recognition resources corresponding to custom voice scenes are configured, together with a general recognition resource corresponding to the general voice scene;
Step 102 is specifically as follows: based on the preconfigured dedicated recognition resources corresponding to custom voice scenes and the general recognition resource corresponding to the general voice scene, a speech recognition library containing the dedicated recognition resources and the general recognition resource is established. When voice information input by the user is received, the input scene of the voice information is determined, along with the type of that input scene, that is, whether it is a custom voice scene or the general voice scene. The recognition resource corresponding to the input scene type is then obtained from the speech recognition library and used to recognize the input voice information.
The present invention performs speech recognition with the recognition resource corresponding to the voice input scene, improving recognition accuracy and processing efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a flow chart of the speech recognition method of another embodiment.
Fig. 3 is a flow chart of the speech recognition method of a further embodiment.
Detailed description of embodiments
Embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the application and should not be construed as limiting it.
A method for improving speech recognition accuracy by establishing a resource library. As shown in Fig. 1, the speech recognition method includes:
Step 101: configure dedicated recognition resources corresponding to custom voice scenes, and a general recognition resource corresponding to the general voice scene. Specifically, the speech recognition method provided by this embodiment of the invention is applied to a terminal device with a voice input function. In general, the terminal device implements voice input through a human-machine voice interaction interface; the specific voice input interface may be a device such as a microphone.
It should be noted that the terminal device can provide voice input services to the user through applications that are able to access the human-machine voice interaction interface. The application can be selected according to actual needs, for example a navigation application or a search engine with a voice input function; this embodiment places no restriction on it. When a user needs to make a voice input, the user inputs voice information through the human-machine voice input interface; the voice information is then recognized, and corresponding processing is performed based on the recognition result. Different voice input applications process the recognition result differently. For example:
For a voice search application, after the voice information input by the user is recognized, search results are fed back to the user according to the recognition result; alternatively, for an instant messaging application, after the voice information input by the user is recognized, the recognition result is converted into text and displayed in the input box.
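The application-specific handling of a recognition result described above can be sketched as a simple dispatch table. This is only an illustrative sketch: the handler names, the application-type keys, and the returned strings are assumptions made for the example and do not come from the patent.

```python
from typing import Callable, Dict


# Hypothetical handlers; the names and return strings are illustrative only.
def handle_voice_search(text: str) -> str:
    # A search application would pass the recognized text to its search backend.
    return f"search results for: {text}"


def handle_instant_messaging(text: str) -> str:
    # A messaging application would place the recognized text in its input box.
    return f"input box text: {text}"


# Assumed mapping from application type to the handler for its recognition results.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "voice_search": handle_voice_search,
    "instant_messaging": handle_instant_messaging,
}


def process_recognition_result(app_type: str, recognized_text: str) -> str:
    """Dispatch the recognized text to the handler of the calling application."""
    handler = HANDLERS.get(app_type)
    if handler is None:
        raise ValueError(f"no handler registered for application type {app_type!r}")
    return handler(recognized_text)
```

The same recognized text thus leads to different downstream processing depending on which application requested the recognition.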
For voice information input in different scenes, in order to improve the accuracy and processing performance of speech recognition, the speech recognition method provided by this embodiment first configures dedicated recognition resources corresponding to custom voice scenes, and a general recognition resource corresponding to the general voice scene.
It should be noted that there are many types of custom voice scenes, and different custom voice scenes correspond to different dedicated recognition resources. The specific content can be configured and selected according to the needs of different application scenarios; this embodiment places no restriction on it. Examples include:
For a map navigation voice scene, the corresponding dedicated recognition resource is a place-name recognition resource; for an e-commerce platform voice scene, the corresponding dedicated recognition resource is a product-name recognition resource; for a movie search voice scene, the corresponding dedicated recognition resource is a movie-title recognition resource.
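The configuration in step 101 might be represented as follows. This is a minimal sketch under stated assumptions: the patent does not say what a recognition resource contains, so the `RecognitionResource` type, its `vocabulary` field, the scene keys, and the sample vocabulary entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class RecognitionResource:
    # Assumption: a resource is modelled as a named vocabulary. The patent
    # leaves the internal structure of a recognition resource unspecified.
    name: str
    vocabulary: FrozenSet[str] = frozenset()


# Dedicated resources, one per custom voice scene (sample entries are placeholders).
DEDICATED_RESOURCES: Dict[str, RecognitionResource] = {
    "map_navigation": RecognitionResource("place_names", frozenset({"Nanjing", "Suzhou"})),
    "e_commerce": RecognitionResource("product_names", frozenset({"power bank"})),
    "movie_search": RecognitionResource("movie_titles", frozenset({"example movie title"})),
}

# The general resource that serves the general voice scene.
GENERAL_RESOURCE = RecognitionResource("general")
```

Keeping the scene-to-resource mapping as plain data means a new custom scene can be added without touching the recognition logic.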
Step 102: establish a speech recognition library containing the dedicated recognition resources and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. Specifically, based on the preconfigured dedicated recognition resources corresponding to custom voice scenes and the general recognition resource corresponding to the general voice scene, a speech recognition library containing both is established. Then, when voice information input by the user is received, the input scene of the voice information is determined, along with the type of that input scene, that is, whether it is a custom voice scene or the general voice scene. The recognition resource corresponding to the input scene type is obtained from the speech recognition library and used to recognize the input voice information.
In the speech recognition method of this embodiment of the application, dedicated recognition resources corresponding to custom voice scenes are configured, together with a general recognition resource corresponding to the general voice scene, and a speech recognition library containing the dedicated recognition resources and the general recognition resource is established, so that voice information is recognized using the speech recognition library according to its input scene. The recognition environment is thereby customized for different vertical-domain scenes, and speech recognition is performed with the recognition resource corresponding to the voice input scene, improving recognition accuracy and processing efficiency.
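One way to picture the speech recognition library of step 102 is as a small registry that holds both kinds of resources and answers lookups by scene. The class below is a hedged sketch, not the patent's implementation: `SpeechRecognitionLibrary`, its method names, and the use of plain strings as resource identifiers are assumptions for illustration.

```python
from typing import Dict, Optional


class SpeechRecognitionLibrary:
    """Holds the general resource plus one dedicated resource per custom scene."""

    def __init__(self, general_resource: str) -> None:
        self._general = general_resource
        self._dedicated: Dict[str, str] = {}

    def register(self, scene: str, resource: str) -> None:
        # Associate a custom voice scene with its dedicated recognition resource.
        self._dedicated[scene] = resource

    def is_custom_scene(self, scene: Optional[str]) -> bool:
        return scene is not None and scene in self._dedicated

    def resource_for(self, scene: Optional[str]) -> str:
        # Custom scenes get their dedicated resource; everything else
        # (including an unknown scene) falls back to the general resource.
        if self.is_custom_scene(scene):
            return self._dedicated[scene]
        return self._general


library = SpeechRecognitionLibrary(general_resource="general")
library.register("map_navigation", "place_names")
library.register("movie_search", "movie_titles")
```

With this layout, `library.resource_for("map_navigation")` yields the place-name resource, while a scene that was never registered yields the general resource.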
Fig. 2 is a flow chart of the speech recognition method of another embodiment of the application. As shown in Fig. 2, after step 102, the method may further include the following steps:
Step 201: receive the input voice information.
Step 202: determine the input scene of the voice information according to a preset scene acquisition strategy.
Specifically, the voice information input by the user is received, and the input scene corresponding to the currently received voice information is determined according to the preset scene acquisition strategy. It should be noted that different scene acquisition strategies can be preset according to the actual application; this embodiment places no restriction on this. Examples include:
Example one: determine the input scene of the voice information according to the application program.
Specifically, the input scene of the voice information is determined according to the application in which the user is currently making the voice input. For example, if the user inputs voice information into a map navigation application, the input scene of the voice information is determined to be map navigation.
Example two: determine the input scene of the voice information according to the context.
Specifically, the input scene of the voice information is determined according to the context of the conversation record between the user and other users. For example, in an instant messaging application, if the user's preceding conversation with other users concerns travel, the input scene of the voice information is a travel scene.
Example three: determine the input scene of the voice information according to geographical location information.
Specifically, the user's current geographical location is obtained from the GPS information of the terminal device, and the input scene of the voice information is then determined according to that location. For example, when the GPS information of the terminal device indicates that the user is currently at a movie theater, the input scene of the voice information is a movie scene.
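The three example strategies above can be sketched as three small functions plus a combiner. Everything concrete here is an assumption made for illustration: the lookup tables, the keyword-matching approach to context, the `place_category` argument (taken to be the result of some external GPS-to-place lookup), and the order in which the strategies are tried. The patent does not specify any of these.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup tables; keys and values are placeholders.
APP_TO_SCENE: Dict[str, str] = {"map_navigation_app": "map_navigation"}
CONTEXT_KEYWORDS: Dict[str, Tuple[str, ...]] = {"travel": ("trip", "hotel", "flight")}
PLACE_TO_SCENE: Dict[str, str] = {"cinema": "movie"}


def scene_from_application(app_id: str) -> Optional[str]:
    # Example one: the application receiving the voice input decides the scene.
    return APP_TO_SCENE.get(app_id)


def scene_from_context(messages: List[str]) -> Optional[str]:
    # Example two: earlier conversation content hints at the scene
    # (approximated here by naive keyword matching).
    text = " ".join(messages).lower()
    for scene, keywords in CONTEXT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return scene
    return None


def scene_from_location(place_category: str) -> Optional[str]:
    # Example three: the kind of place the user is at decides the scene.
    return PLACE_TO_SCENE.get(place_category)


def determine_scene(
    app_id: Optional[str] = None,
    messages: Optional[List[str]] = None,
    place_category: Optional[str] = None,
) -> Optional[str]:
    """Try each strategy in turn; the ordering is an arbitrary choice of this sketch."""
    if app_id is not None:
        scene = scene_from_application(app_id)
        if scene is not None:
            return scene
    if messages:
        scene = scene_from_context(messages)
        if scene is not None:
            return scene
    if place_category is not None:
        return scene_from_location(place_category)
    return None
```

Returning `None` when no strategy succeeds leaves room for the caller to fall back to the general recognition resource.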
Step 203: recognize the input voice information according to the input scene and the speech recognition library.
Specifically, the input voice information is recognized according to the input scene of the current voice information and the pre-established speech recognition library, as follows:
If the input scene of the current speech is a pre-customized voice scene, the dedicated recognition resource corresponding to that custom voice scene is obtained from the speech recognition library, and the voice information is recognized using the dedicated recognition resource;
If the input scene of the current speech is not a pre-customized voice scene, the general recognition resource is obtained from the speech recognition library, and the voice information is recognized using the general recognition resource.
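To illustrate why a scene-specific resource can help in step 203, the sketch below rescoring candidate transcripts with a scene vocabulary. This is an assumption layered on top of the text: the patent does not describe how a recognition resource is applied, and the candidate list, the scores, the `boost` value, and the vocabularies are all invented for the example.

```python
from typing import Dict, List, Set, Tuple

# Assumed vocabularies: an empty general vocabulary and one dedicated vocabulary.
GENERAL_VOCAB: Set[str] = set()
DEDICATED_VOCAB: Dict[str, Set[str]] = {"map_navigation": {"xuanwu lake"}}


def select_vocabulary(scene: str) -> Set[str]:
    # A custom scene uses its dedicated vocabulary; any other scene uses the general one.
    return DEDICATED_VOCAB.get(scene, GENERAL_VOCAB)


def recognize(candidates: List[Tuple[str, float]], scene: str, boost: float = 0.2) -> str:
    """Pick the best candidate transcript after boosting in-vocabulary matches.

    `candidates` holds (transcript, base_score) pairs that a hypothetical
    acoustic front end is assumed to have produced already.
    """
    vocabulary = select_vocabulary(scene)

    def score(item: Tuple[str, float]) -> float:
        text, base_score = item
        in_vocabulary = any(term in text.lower() for term in vocabulary)
        return base_score + (boost if in_vocabulary else 0.0)

    return max(candidates, key=score)[0]


candidates = [("shower lake", 0.50), ("Xuanwu Lake", 0.45)]
```

With these sample candidates, `recognize(candidates, "map_navigation")` prefers the place name because the dedicated vocabulary boosts it, whereas a non-custom scene keeps the candidate with the higher base score.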
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment further receives the input voice information, determines the input scene of the voice information according to the preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thus performed with the recognition resource corresponding to the voice input scene, improving recognition accuracy and processing efficiency.
Fig. 3 is a flow chart of the speech recognition method of a further embodiment of the application. Referring to Fig. 3, the method is described as follows:
Step 1: after voice information is received, judge whether the input scene of the voice information can be determined according to the preset scene acquisition strategy.
Step 2: if the input scene of the voice information cannot be determined, recognize the voice information using the general recognition resource.
Step 3: if the input scene of the voice information can be determined, judge whether it is a pre-customized voice scene.
Step 4: if the input scene is a pre-customized voice scene, recognize the voice information using the dedicated recognition resource in the speech recognition library that corresponds to that custom voice scene.
Step 5: if the input scene is not a custom voice scene, recognize the voice information using the general recognition resource in the speech recognition library.
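The five-step flow of Fig. 3 reduces to a three-way decision, which can be sketched as below. The `Branch` enum, the function name, and the use of `None` to represent an undeterminable scene are assumptions of this sketch rather than details given in the patent.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Branch(Enum):
    UNDETERMINED = "step 2: scene unknown, general resource"
    CUSTOM = "step 4: custom scene, dedicated resource"
    NOT_CUSTOM = "step 5: non-custom scene, general resource"


def choose_resource(
    scene: Optional[str],
    dedicated: Dict[str, str],
    general: str,
) -> Tuple[Branch, str]:
    """Return which branch of the flow was taken and the resource to use."""
    # Steps 1 and 2: a scene of None means the strategy could not determine it.
    if scene is None:
        return Branch.UNDETERMINED, general
    # Steps 3 and 4: a known custom scene uses its dedicated resource.
    if scene in dedicated:
        return Branch.CUSTOM, dedicated[scene]
    # Step 5: a determined but non-custom scene uses the general resource.
    return Branch.NOT_CUSTOM, general


dedicated_resources = {"map_navigation": "place_names"}
```

Steps 2 and 5 end with the same resource; the sketch keeps them as separate branches only so that the mapping to the numbered steps stays visible.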
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment likewise receives the input voice information, determines its input scene according to the preset scene acquisition strategy, and recognizes it according to the input scene and the speech recognition library. Speech recognition is thus performed with the recognition resource corresponding to the voice input scene, improving recognition accuracy and processing efficiency.

Claims (4)

1. A method for improving speech recognition accuracy by establishing a resource library, characterized by comprising the following steps:
Step 101: configuring dedicated recognition resources corresponding to custom voice scenes, and a general recognition resource corresponding to the general voice scene;
Step 102: establishing a speech recognition library containing the dedicated recognition resources and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
2. The method for improving speech recognition accuracy by establishing a resource library according to claim 1, characterized in that step 101 is specifically: when a user needs to make a voice input, the user inputs voice information through the human-machine voice input interface; the voice information input by the user is then recognized, and corresponding processing is performed based on the recognition result; different voice input applications process the recognition result differently;
for a voice search application, after the voice information input by the user is recognized, search results are fed back to the user according to the recognition result; alternatively, for an instant messaging application, after the voice information input by the user is recognized, the recognition result is converted into text and displayed in the input box;
for voice information input in different scenes, dedicated recognition resources corresponding to custom voice scenes are configured, together with a general recognition resource corresponding to the general voice scene;
step 102 is specifically: based on the preconfigured dedicated recognition resources corresponding to custom voice scenes and the general recognition resource corresponding to the general voice scene, a speech recognition library containing the dedicated recognition resources and the general recognition resource is established; when voice information input by the user is received, the input scene of the voice information is determined, along with the type of that input scene, that is, whether it is a custom voice scene or the general voice scene, so that the recognition resource corresponding to the input scene type is obtained from the speech recognition library and used to recognize the input voice information.
3. The method for improving speech recognition accuracy by establishing a resource library according to claim 2, characterized by further comprising the following steps after step 102:
Step 201: receiving the input voice information;
Step 202: determining the input scene of the voice information according to a preset scene acquisition strategy;
Step 203: recognizing the input voice information according to the input scene and the speech recognition library.
4. The method for improving speech recognition accuracy by establishing a resource library according to claim 2, characterized by further comprising the following steps after step 102:
Step 1: after voice information is received, judging whether the input scene of the voice information can be determined according to a preset scene acquisition strategy;
Step 2: if the input scene of the voice information cannot be determined, recognizing the voice information using the general recognition resource;
Step 3: if the input scene of the voice information can be determined, judging whether it is a pre-customized voice scene;
Step 4: if the input scene is a pre-customized voice scene, recognizing the voice information using the dedicated recognition resource in the speech recognition library that corresponds to that custom voice scene;
Step 5: if the input scene is not a custom voice scene, recognizing the voice information using the general recognition resource in the speech recognition library.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811508983.4A CN109360565A (en) 2018-12-11 2018-12-11 Method for improving speech recognition accuracy by establishing a resource library

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811508983.4A CN109360565A (en) 2018-12-11 2018-12-11 Method for improving speech recognition accuracy by establishing a resource library

Publications (1)

Publication Number Publication Date
CN109360565A (en) 2019-02-19

Family

ID=65332116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811508983.4A Pending CN109360565A (en) 2018-12-11 2018-12-11 Method for improving speech recognition accuracy by establishing a resource library

Country Status (1)

Country Link
CN (1) CN109360565A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544480A (en) * 2019-09-05 2019-12-06 苏州思必驰信息科技有限公司 Voice recognition resource switching method and device
CN111048091A (en) * 2019-12-30 2020-04-21 苏州思必驰信息科技有限公司 Voice recognition method, voice recognition equipment and computer readable storage medium
CN112687261A (en) * 2020-12-15 2021-04-20 苏州思必驰信息科技有限公司 Speech recognition training and application method and device
US11289095B2 (en) 2019-12-30 2022-03-29 Yandex Europe Ag Method of and system for translating speech to text
CN116386644A (en) * 2023-06-07 2023-07-04 百融至信(北京)科技有限公司 Route control method and device for ASR resource side

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103674012A (en) * 2012-09-21 2014-03-26 高德软件有限公司 Voice customizing method and device and voice identification method and device
CN105448292A (en) * 2014-08-19 2016-03-30 北京羽扇智信息科技有限公司 Scene-based real-time voice recognition system and method
CN105719649A (en) * 2016-01-19 2016-06-29 百度在线网络技术(北京)有限公司 Voice recognition method and device
WO2017219495A1 (en) * 2016-06-21 2017-12-28 宇龙计算机通信科技(深圳)有限公司 Speech recognition method and system
CN107644642A (en) * 2017-09-20 2018-01-30 广东欧珀移动通信有限公司 Method for recognizing semantics, device, storage medium and electronic equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544480A (en) * 2019-09-05 2019-12-06 苏州思必驰信息科技有限公司 Voice recognition resource switching method and device
CN110544480B (en) * 2019-09-05 2022-03-11 思必驰科技股份有限公司 Voice recognition resource switching method and device
CN111048091A (en) * 2019-12-30 2020-04-21 苏州思必驰信息科技有限公司 Voice recognition method, voice recognition equipment and computer readable storage medium
US11289095B2 (en) 2019-12-30 2022-03-29 Yandex Europe Ag Method of and system for translating speech to text
CN112687261A (en) * 2020-12-15 2021-04-20 苏州思必驰信息科技有限公司 Speech recognition training and application method and device
CN116386644A (en) * 2023-06-07 2023-07-04 百融至信(北京)科技有限公司 Route control method and device for ASR resource side
CN116386644B (en) * 2023-06-07 2024-03-22 百融至信(北京)科技有限公司 Route control method and device for ASR resource side

Similar Documents

Publication Publication Date Title
CN105719649B (en) Audio recognition method and device
CN109360565A (en) Method for improving speech recognition accuracy by establishing a resource library
CN112804400B (en) Customer service call voice quality inspection method and device, electronic equipment and storage medium
CN102802114B (en) Method and system for screening seat by using voices
CN109753560B (en) Information processing method and device of intelligent question-answering system
CN106328124A (en) Voice recognition method based on user behavior characteristics
CN103699530A (en) Method and equipment for inputting texts in target application according to voice input information
CN111583931A (en) Service data processing method and device
CN104751847A (en) Data acquisition method and system based on overprint recognition
CN111027291A (en) Method and device for adding punctuation marks in text and training model and electronic equipment
CN110517668A (en) A kind of Chinese and English mixing voice identifying system and method
CN109545203A (en) Audio recognition method, device, equipment and storage medium
CN104731874A (en) Evaluation information generation method and device
CN114155853A (en) Rejection method, device, equipment and storage medium
CN111178081A (en) Semantic recognition method, server, electronic device and computer storage medium
CN111354362A (en) Method and device for assisting hearing-impaired communication
CN113901837A (en) Intention understanding method, device, equipment and storage medium
CN112242143B (en) Voice interaction method and device, terminal equipment and storage medium
CN110047473B (en) Man-machine cooperative interaction method and system
CN112837672A (en) Method and device for determining conversation affiliation, electronic equipment and storage medium
KR20210065629A (en) Chatbot integration agent platform system and service method thereof
CN108717851A (en) A kind of audio recognition method and device
CN112002325B (en) Multi-language voice interaction method and device
CN111554300B (en) Audio data processing method, device, storage medium and equipment
CN112331201A (en) Voice interaction method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20190219)