CN105719649A - Voice recognition method and device - Google Patents

Publication number
CN105719649A
Authority
CN
China
Prior art keywords
scene
voice information
input
recognition resource
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610035394.3A
Other languages
Chinese (zh)
Other versions
CN105719649B (en)
Inventor
穆向禹
张东栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610035394.3A
Publication of CN105719649A
Application granted
Publication of CN105719649B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a voice recognition method and device. The method comprises: configuring special recognition resources corresponding to customized voice scenes and a general recognition resource corresponding to general voice scenes; and establishing a speech recognition library that includes the special recognition resources and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. In this way, voice recognition is performed with the recognition resource that corresponds to the voice input scene, and recognition precision and processing efficiency are improved.

Description

Voice recognition method and device
Technical field
The application relates to the technical field of speech recognition, and in particular to a speech recognition method and device.
Background
With the development of the mobile Internet, large-screen mobile phones have become mainstream, and both keyboard input and handwriting input have various limitations. Voice input is therefore well placed to become the mainstream input method. Because voice input is more natural and has a lower learning cost, it is gradually being accepted by more users. Children and the elderly alike can learn to use it quickly and become accustomed to this input mode.
Existing speech recognition technology is trained on a large amount of everyday-scene data in order to recognize speech input in different scenes. As a result, recognition precision is too low for some customized scenes, and other customized scenes cannot be recognized at all, which wastes processing resources and reduces processing efficiency.
Summary of the invention
The application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the application is to propose a speech recognition method that performs speech recognition using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
A second object of the application is to propose a speech recognition device.
To achieve the above objects, an embodiment of the first aspect of the application proposes a speech recognition method, including: configuring a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene; and establishing a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
The speech recognition method of the embodiment of the application configures a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene, and establishes a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
To achieve the above objects, an embodiment of the second aspect of the application proposes a speech recognition device, including: a configuration module for configuring a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene; and an establishing module for establishing a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
The speech recognition device of the embodiment of the application configures a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene, and establishes a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a speech recognition method according to one embodiment of the application;
Fig. 2 is a flowchart of a speech recognition method according to another embodiment of the application;
Fig. 3 is a flowchart of a speech recognition method according to yet another embodiment of the application;
Fig. 4 is a schematic structural diagram of a speech recognition device according to one embodiment of the application;
Fig. 5 is a schematic structural diagram of a speech recognition device according to another embodiment of the application.
Detailed description
Embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the application; they should not be construed as limiting the application.
The speech recognition method and device of the embodiments of the application are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a speech recognition method according to one embodiment of the application.
As shown in Fig. 1, the speech recognition method includes:
Step 101: configure a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene.
Specifically, the speech recognition method provided by this embodiment is applied in a terminal device that has a voice input function. Typically, the terminal device implements voice input through a human-machine voice interaction interface, and the concrete voice input interface may be a device such as a microphone.
It should be noted that the terminal device may provide the voice input service to the user through an application that can access the human-machine voice interaction interface. The application may be chosen according to actual needs, for example a navigation application with a voice input function or a search engine; this embodiment places no limitation on it.
When the user needs voice input, the user inputs voice information through the human-machine voice input interface, and the voice information is then recognized so that corresponding processing can be performed based on the recognition result. Different voice input applications process the recognition result differently. For example:
For a voice search application, after the voice information input by the user is recognized, search results are fed back to the user according to the recognition result; or,
For an instant messaging application, after the voice information input by the user is recognized, it is converted into text according to the recognition result and displayed in the input box.
To improve the precision and processing performance of speech recognition for voice information input in different scenes, the speech recognition model provided by this embodiment first configures a special recognition resource corresponding to each customized voice scene and a general recognition resource corresponding to the general voice scene.
It should be noted that there are many types of customized voice scene, and different customized voice scenes correspond to different special recognition resources. The specific content may be configured and selected according to the needs of different application scenarios, and this embodiment places no limitation on it. For example:
For the voice scene of map navigation, the corresponding special recognition resource is a place name recognition resource; or,
For the voice scene of an e-commerce platform, the corresponding special recognition resource is an e-commerce product name recognition resource; or,
For the voice scene of movie search, the corresponding special recognition resource is a movie title recognition resource.
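The scene-to-resource configuration of step 101 can be pictured as a simple lookup table. The following is only a minimal sketch: the `RecognitionResource` structure, the scene keys, and the sample vocabulary entries are all hypothetical and are not specified by the patent, which leaves the content of each resource open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionResource:
    """A named bundle of scene-specific vocabulary (hypothetical structure)."""
    name: str
    vocabulary: frozenset


# Hypothetical scene keys mapped to the special resources named in the text.
SPECIAL_RESOURCES = {
    "map_navigation": RecognitionResource(
        "place_names", frozenset({"Tiananmen Square", "Zhongguancun"})),
    "e_commerce": RecognitionResource(
        "product_names", frozenset({"wireless earphones", "rice cooker"})),
    "movie_search": RecognitionResource(
        "movie_titles", frozenset({"Farewell My Concubine", "Hero"})),
}

# The general resource serves every scene that has no special resource.
GENERAL_RESOURCE = RecognitionResource("general", frozenset())
```

Keeping one entry per customized scene means a new vertical scene can be supported by adding a row, without touching the general resource.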
Step 102: establish a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
Specifically, a speech recognition library including the special recognition resource and the general recognition resource is established from the preconfigured special recognition resource corresponding to the customized voice scene and the preconfigured general recognition resource corresponding to the general voice scene.
Then, when voice information input by the user is received, the input scene of the voice information is determined, along with the type of that input scene, namely whether it is a customized voice scene or a general voice scene. The recognition resource corresponding to the input scene type is then obtained from the speech recognition library and used to recognize the input voice information.
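As a rough illustration of step 102, the library can be modeled as a container that classifies a scene and returns the matching resource. The class name, method names, and string placeholders for resources below are assumptions made for this sketch; the patent does not prescribe a data structure.

```python
class SpeechRecognitionLibrary:
    """Holds special resources keyed by scene, plus one general resource."""

    def __init__(self, special_resources, general_resource):
        self._special = dict(special_resources)
        self._general = general_resource

    def scene_type(self, scene):
        """Classify an input scene as 'customized' or 'general'."""
        return "customized" if scene in self._special else "general"

    def resource_for(self, scene):
        """Return the resource that corresponds to the scene type."""
        return self._special.get(scene, self._general)


# Resources are plain strings here; a real system would hold models or lexicons.
library = SpeechRecognitionLibrary(
    {"map_navigation": "place_name_resource",
     "movie_search": "movie_title_resource"},
    "general_resource",
)
```

For example, `library.resource_for("map_navigation")` yields the place name resource, while any scene without a special resource falls through to the general one.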
The speech recognition method of the embodiment of the application configures a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene, and establishes a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. In this way, the recognition environment is customized for different vertical scenes, speech recognition is performed using the recognition resource corresponding to the voice input scene, and recognition precision and processing efficiency are improved.
Fig. 2 is a flowchart of a speech recognition method according to another embodiment of the application.
As shown in Fig. 2, after step 102 the method may further include the following steps:
Step 201: receive input voice information.
Step 202: determine the input scene of the voice information according to a preset scene acquisition strategy.
Specifically, the voice information input by the user is received, and the input scene corresponding to the currently received voice information is determined according to the preset scene acquisition strategy.
It should be noted that different scene acquisition strategies may be preset according to practical application needs, and this embodiment places no limitation on them. For example:
Example one: determine the input scene of the voice information according to the application program;
Specifically, the input scene of the voice information is determined according to the application program in which the user is currently performing voice input. For example, if the user inputs voice information into a map navigation application, the input scene of the voice information is determined to be map navigation.
Or,
Example two: determine the input scene of the voice information according to context;
Specifically, the input scene of the voice information is determined according to the context of the conversation records between the user and other users. For example, in an instant messaging application, if the earlier conversation between the user and other users concerns travel, the input scene of the voice information is a travel scene.
Or,
Example three: determine the input scene of the voice information according to geographic location information.
Specifically, the user's current geographic location information is obtained from the GPS information of the terminal device, and the input scene of the voice information is then determined according to that location. For example, if the GPS information of the terminal device indicates that the user is currently at a cinema, the input scene of the voice information is a movie scene.
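The three example strategies can be sketched as small functions that each map one signal to a scene. Everything named here is hypothetical: the `InputContext` fields, the lookup tables, and the keyword matching are illustrative stand-ins, and trying the strategies in a fixed order is an assumption of this sketch (the text presents them as alternatives). The `place_category` field assumes the GPS position has already been resolved to a kind of place by some other component.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class InputContext:
    """Signals available when voice information arrives (hypothetical fields)."""
    app_name: Optional[str] = None
    recent_messages: Sequence[str] = ()
    place_category: Optional[str] = None  # assumed to be derived from GPS


# Hypothetical lookup tables; a real deployment would configure its own.
APP_TO_SCENE = {"map_navigation_app": "map_navigation"}
KEYWORD_TO_SCENE = {"trip": "travel", "hotel": "travel"}
PLACE_TO_SCENE = {"cinema": "movie"}


def scene_from_app(ctx):
    """Example one: the application receiving the voice input."""
    return APP_TO_SCENE.get(ctx.app_name)


def scene_from_context(ctx):
    """Example two: keywords in earlier conversation records."""
    for message in ctx.recent_messages:
        for keyword, scene in KEYWORD_TO_SCENE.items():
            if keyword in message.lower():
                return scene
    return None


def scene_from_location(ctx):
    """Example three: the user's current location."""
    return PLACE_TO_SCENE.get(ctx.place_category)


def determine_scene(ctx, strategies=(scene_from_app, scene_from_context,
                                     scene_from_location)):
    """Return the first scene any strategy yields, or None if undetermined."""
    for strategy in strategies:
        scene = strategy(ctx)
        if scene is not None:
            return scene
    return None
```

Returning `None` when no strategy applies keeps the "scene cannot be determined" case explicit, so the caller can fall back to the general recognition resource.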
Step 203: recognize the input voice information according to the input scene and the speech recognition library.
Specifically, the input voice information is recognized according to the input scene of the current voice information and the pre-established speech recognition library, as follows:
If the input scene of the current voice information is a pre-customized voice scene, the special recognition resource corresponding to that customized voice scene is obtained from the speech recognition library and applied to recognize the voice information;
If the input scene of the current voice information is not a pre-customized voice scene, the general recognition resource is obtained from the speech recognition library and applied to recognize the voice information.
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
Fig. 3 is a flowchart of a speech recognition method according to yet another embodiment of the application. Referring to Fig. 3, the method is as follows:
Step 1: after voice information is received, judge whether the input scene of the voice information can be determined according to the preset scene acquisition strategy.
Step 2: if the input scene of the voice information cannot be determined, apply the general recognition resource to recognize the voice information.
Step 3: if the input scene of the voice information can be determined, further judge whether it is a pre-customized voice scene.
Step 4: if the input scene is a pre-customized voice scene, apply the special recognition resource corresponding to that customized voice scene in the speech recognition library to recognize the voice information.
Step 5: if the input scene is not a customized voice scene, apply the general recognition resource in the speech recognition library to recognize the voice information.
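The branching in steps 1 to 5 reduces to a three-way decision. The sketch below is a hedged illustration only: the function name `select_resource` is invented here, an undetermined scene is represented as `None` by assumption, and resources are placeholder strings.

```python
from typing import Mapping, Optional, TypeVar

R = TypeVar("R")


def select_resource(scene: Optional[str],
                    special_resources: Mapping[str, R],
                    general_resource: R) -> R:
    """Pick the recognition resource following the Fig. 3 decision flow."""
    # Steps 1 and 2: the scene could not be determined, so use the general resource.
    if scene is None:
        return general_resource
    # Steps 3 and 4: the scene is a pre-customized one, so use its special resource.
    if scene in special_resources:
        return special_resources[scene]
    # Step 5: the scene is known but not customized, so use the general resource.
    return general_resource


special = {"map_navigation": "place_names", "movie_search": "movie_titles"}
```

Both fallback branches return the same general resource; they are kept separate only to mirror the flowchart, where "undetermined" and "determined but not customized" are distinct outcomes.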
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
To implement the above embodiments, the application also proposes a speech recognition device.
Fig. 4 is a schematic structural diagram of a speech recognition device according to one embodiment of the application.
As shown in Fig. 4, the speech recognition device includes:
A configuration module 11, for configuring a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene;
Specifically, the special recognition resource includes at least one of:
A place name recognition resource, a search hot word recognition resource, an e-commerce product name recognition resource, and a movie title recognition resource.
An establishing module 12, for establishing a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
It should be noted that the foregoing explanation of the speech recognition method embodiments also applies to the speech recognition device of this embodiment and is not repeated here.
The speech recognition device of the embodiment of the application configures a special recognition resource corresponding to a customized voice scene and a general recognition resource corresponding to a general voice scene, and establishes a speech recognition library including the special recognition resource and the general recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. In this way, the recognition environment is customized for different vertical scenes, speech recognition is performed using the recognition resource corresponding to the voice input scene, and recognition precision and processing efficiency are improved.
Fig. 5 is a schematic structural diagram of a speech recognition device according to another embodiment of the application. As shown in Fig. 5, building on the embodiment shown in Fig. 4, the device further includes:
A receiving module 13, for receiving input voice information;
An acquisition module 14, for determining the input scene of the voice information according to a preset scene acquisition strategy;
A recognition module 15, for recognizing the input voice information according to the input scene and the speech recognition library.
In one embodiment, the acquisition module 14 is used to: determine the input scene of the voice information according to the application program in which the user is currently performing voice input;
Or,
In one embodiment, the acquisition module 14 is used to: determine the input scene of the voice information according to the context of the conversation records between the user and other users;
Or,
In one embodiment, the acquisition module 14 is used to: determine the input scene of the voice information according to the user's current geographic location information.
In one embodiment, the recognition module 15 is used to:
If the input scene is the customized voice scene, apply the special recognition resource corresponding to the customized voice scene in the speech recognition library to recognize the voice information;
If the input scene is not the customized voice scene, apply the general recognition resource in the speech recognition library to recognize the voice information.
In another embodiment, the recognition module 15 is further used to:
If the input scene cannot be determined, apply the general recognition resource to recognize the voice information.
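One way to picture how modules 11 to 15 fit together is a single class whose methods play the role of each module. This is a toy sketch under several assumptions: the class and method names are invented, the acquisition step uses only the application-based strategy, and the `decode` callable stands in for an actual recognizer, which the patent does not describe.

```python
from typing import Callable, Dict, Optional


class SpeechRecognitionDevice:
    """Toy composition of modules 11 to 15; not a real recognizer."""

    def __init__(self, decode: Callable[[bytes, str], str],
                 app_to_scene: Dict[str, str]):
        self._decode = decode              # hypothetical decoder backend
        self._app_to_scene = app_to_scene  # hypothetical app-to-scene table
        self._special: Dict[str, str] = {}
        self._general: Optional[str] = None
        self._library: Optional[dict] = None

    def configure(self, special: Dict[str, str], general: str) -> None:
        """Configuration module 11: register special and general resources."""
        self._special, self._general = dict(special), general

    def establish(self) -> None:
        """Establishing module 12: build the library from configured resources."""
        self._library = {"special": dict(self._special),
                         "general": self._general}

    def handle(self, audio: bytes, app_name: Optional[str]) -> str:
        """Receiving (13), acquisition (14), and recognition (15) in sequence."""
        if self._library is None:
            raise RuntimeError("establish() must be called before handle()")
        # Module 14: scene is None when it cannot be determined.
        scene = self._app_to_scene.get(app_name)
        # Module 15: special resource if the scene is customized, else general.
        resource = self._library["special"].get(scene, self._library["general"])
        return self._decode(audio, resource)


def fake_decode(audio: bytes, resource: str) -> str:
    """Stand-in decoder that only reports which resource it was given."""
    return f"decoded {len(audio)} bytes with {resource}"


device = SpeechRecognitionDevice(fake_decode, {"map_app": "map_navigation"})
device.configure({"map_navigation": "place_names"}, "general")
device.establish()
```

Injecting the decoder keeps the scene-selection logic testable on its own, independent of whatever recognition engine is actually used.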
It should be noted that the foregoing explanation of the speech recognition method embodiments also applies to the speech recognition device of this embodiment and is not repeated here.
Building on the embodiment shown in Fig. 4, the speech recognition device of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, improving recognition precision and processing efficiency.
In the description of this specification, reference to the terms "an embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the application, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the application belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following technologies known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the application have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the application.

Claims (12)

1. an audio recognition method, it is characterised in that comprise the following steps:
Configure the proprietary identification resource corresponding with customized voice scene and the universal identification resource corresponding with universal phonetic scene;
Set up the speech recognition library including described proprietary identification resource and described universal identification resource, with the input scene according to voice messaging, adopt voice messaging described in described speech recognition library identification.
2. the method for claim 1, it is characterised in that described proprietary identification resource includes at least one of:
Place name identification resource, search hot word identification resource, electricity business's trade name identification resource, movie name identification resource.
3. method as claimed in claim 1 or 2, it is characterised in that also include:
Receive the voice messaging of input;
The input scene with described voice messaging is determined according to default scene acquisition strategy;
According to described input scene and described speech recognition library, the voice messaging of input is identified.
4. method as claimed in claim 3, it is characterised in that the input scene with described voice messaging is determined in the scene acquisition strategy that described basis is preset, including:
The application program currently carrying out phonetic entry according to user determines the input scene of described voice messaging;
Or,
The input scene of described voice messaging determined in context according to user Yu other user session records;
Or,
The input scene of described voice messaging is determined according to the geographical location information that user is current.
5. The method of claim 3, characterized in that recognizing the input voice messaging according to the input scene and the speech recognition library comprises:
If the input scene is the customized voice scene, applying the proprietary identification resource in the speech recognition library that corresponds to the customized voice scene to recognize the voice messaging;
If the input scene is not the customized voice scene, applying the universal identification resource in the speech recognition library to recognize the voice messaging.
6. The method of claim 3, characterized by further comprising:
If the input scene cannot be determined, applying the universal identification resource to recognize the voice messaging.
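The selection rule of claims 5 and 6 amounts to a lookup with a fallback: use the proprietary identification resource when the input scene is a customized voice scene, and the universal identification resource otherwise, including when no scene could be determined. The sketch below is a minimal illustration under stated assumptions: each resource is modeled as a callable that turns audio bytes into text, and `select_resource`, `recognize`, and the two stub recognizers are hypothetical names rather than anything defined in the patent.

```python
from typing import Callable, Dict, Optional

Recognizer = Callable[[bytes], str]


def select_resource(universal: Recognizer,
                    proprietary: Dict[str, Recognizer],
                    input_scene: Optional[str]) -> Recognizer:
    # Claim 6: the scene could not be determined, so use the universal resource.
    if input_scene is None:
        return universal
    # Claim 5: a customized scene uses its proprietary resource; any other scene
    # falls back to the universal resource.
    return proprietary.get(input_scene, universal)


def recognize(voice: bytes,
              universal: Recognizer,
              proprietary: Dict[str, Recognizer],
              input_scene: Optional[str]) -> str:
    return select_resource(universal, proprietary, input_scene)(voice)


# Stub recognizers (assumed): they only tag the input so the chosen path is visible.
def universal_stub(voice: bytes) -> str:
    return "universal:" + voice.decode("utf-8")


def place_name_stub(voice: bytes) -> str:
    return "place_name:" + voice.decode("utf-8")
```

With `{"place_name": place_name_stub}` as the proprietary table, an input scene of `"place_name"` is routed to the place-name stub, whereas `None` or an unconfigured scene is routed to `universal_stub`.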
7. A speech recognition device, characterized by comprising:
A configuration module, configured to configure a proprietary identification resource corresponding to a customized voice scene and a universal identification resource corresponding to a universal voice scene;
An establishing module, configured to establish a speech recognition library that includes the proprietary identification resource and the universal identification resource, so that voice messaging is recognized with the speech recognition library according to the input scene of the voice messaging.
8. The device of claim 7, characterized in that the proprietary identification resource includes at least one of the following:
A place name identification resource, a search hot word identification resource, an e-commerce product name identification resource, and a movie name identification resource.
9. The device of claim 7 or 8, characterized by further comprising:
A receiving module, configured to receive input voice messaging;
An acquisition module, configured to determine the input scene of the voice messaging according to a preset scene acquisition strategy;
An identification module, configured to recognize the input voice messaging according to the input scene and the speech recognition library.
10. The device of claim 9, characterized in that the acquisition module is configured to:
Determine the input scene of the voice messaging according to the application program in which the user is currently performing voice input;
Or,
Determine the input scene of the voice messaging according to the context of the dialogue records between the user and other users;
Or,
Determine the input scene of the voice messaging according to the user's current geographical location information.
11. The device of claim 9, characterized in that the identification module is configured to:
If the input scene is the customized voice scene, apply the proprietary identification resource in the speech recognition library that corresponds to the customized voice scene to recognize the voice messaging;
If the input scene is not the customized voice scene, apply the universal identification resource in the speech recognition library to recognize the voice messaging.
12. The device of claim 9, characterized in that the identification module is further configured to:
If the input scene cannot be determined, apply the universal identification resource to recognize the voice messaging.
CN201610035394.3A 2016-01-19 2016-01-19 Voice recognition method and device Active CN105719649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610035394.3A CN105719649B (en) 2016-01-19 2016-01-19 Voice recognition method and device

Publications (2)

Publication Number Publication Date
CN105719649A (en) 2016-06-29
CN105719649B (en) 2019-07-05

Family

ID=56147425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610035394.3A Active CN105719649B (en) 2016-01-19 2016-01-19 Voice recognition method and device

Country Status (1)

Country Link
CN (1) CN105719649B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6738784B1 (en) * 2000-04-06 2004-05-18 Dictaphone Corporation Document and information processing system
CN101329868A (en) * 2008-07-31 2008-12-24 林超 Speech recognition optimizing system aiming at locale language use preference and method thereof
US20100332521A1 (en) * 2004-09-07 2010-12-30 Stuart Robert O More efficient search algorithm (Mesa) using prioritized search sequencing
CN103674012A (en) * 2012-09-21 2014-03-26 高德软件有限公司 Voice customizing method and device and voice identification method and device
CN104240698A (en) * 2014-09-24 2014-12-24 上海伯释信息科技有限公司 Voice recognition method
CN105225665A (en) * 2015-10-15 2016-01-06 桂林电子科技大学 Speech recognition method and speech recognition device

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018177233A1 (en) * 2017-03-31 2018-10-04 阿里巴巴集团控股有限公司 Voice function control method and apparatus
US10991371B2 (en) 2017-03-31 2021-04-27 Advanced New Technologies Co., Ltd. Voice function control method and apparatus
US10643615B2 (en) 2017-03-31 2020-05-05 Alibaba Group Holding Limited Voice function control method and apparatus
WO2018223796A1 (en) * 2017-06-07 2018-12-13 腾讯科技(深圳)有限公司 Speech recognition method, storage medium, and speech recognition device
CN107463700A (en) * 2017-08-15 2017-12-12 北京百度网讯科技有限公司 Method, device and equipment for acquiring information
CN107463700B (en) * 2017-08-15 2020-09-08 北京百度网讯科技有限公司 Method, device and equipment for acquiring information
CN107728783A (en) * 2017-09-25 2018-02-23 联想(北京)有限公司 Artificial intelligence processing method and system
CN109920429A (en) * 2017-12-13 2019-06-21 上海擎感智能科技有限公司 Data processing method and system for vehicle-mounted voice recognition
CN110299136A (en) * 2018-03-22 2019-10-01 上海擎感智能科技有限公司 Processing method and system for speech recognition
CN109087639A (en) * 2018-08-02 2018-12-25 泰康保险集团股份有限公司 Voice recognition method, device, electronic equipment and computer-readable medium
CN111292740A (en) * 2018-11-21 2020-06-16 财团法人工业技术研究院 Speech recognition system and method, and computer program product
CN111292740B (en) * 2018-11-21 2023-05-30 财团法人工业技术研究院 Speech recognition system and method thereof
CN109360565A (en) * 2018-12-11 2019-02-19 江苏电力信息技术有限公司 Method for improving speech recognition accuracy by establishing a resource library
CN111312235A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Voice interaction method, device and system
CN111312233A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Voice data identification method, device and system
CN109671421B (en) * 2018-12-25 2020-07-10 苏州思必驰信息科技有限公司 Off-line navigation customizing and implementing method and device
CN109671421A (en) * 2018-12-25 2019-04-23 苏州思必驰信息科技有限公司 Off-line navigation customizing and implementing method and device
CN110349575A (en) * 2019-05-22 2019-10-18 深圳壹账通智能科技有限公司 Speech recognition method, apparatus, electronic equipment and storage medium
WO2020233363A1 (en) * 2019-05-22 2020-11-26 深圳壹账通智能科技有限公司 Speech recognition method and device, electronic apparatus, and storage medium
CN111049996A (en) * 2019-12-26 2020-04-21 苏州思必驰信息科技有限公司 Multi-scene voice recognition method and device and intelligent customer service system applying same
CN111161739A (en) * 2019-12-28 2020-05-15 科大讯飞股份有限公司 Speech recognition method and related product
CN111161739B (en) * 2019-12-28 2023-01-17 科大讯飞股份有限公司 Speech recognition method and related product
WO2021129439A1 (en) * 2019-12-28 2021-07-01 科大讯飞股份有限公司 Voice recognition method and related product
CN113223510B (en) * 2020-01-21 2022-09-20 青岛海尔电冰箱有限公司 Refrigerator and equipment voice interaction method and computer readable storage medium thereof
CN113223510A (en) * 2020-01-21 2021-08-06 青岛海尔电冰箱有限公司 Refrigerator and equipment voice interaction method and computer readable storage medium thereof
CN111583909A (en) * 2020-05-18 2020-08-25 科大讯飞股份有限公司 Voice recognition method, device, equipment and storage medium
CN111583909B (en) * 2020-05-18 2024-04-12 科大讯飞股份有限公司 Voice recognition method, device, equipment and storage medium
CN112687261A (en) * 2020-12-15 2021-04-20 苏州思必驰信息科技有限公司 Speech recognition training and application method and device
CN113470619A (en) * 2021-06-30 2021-10-01 北京有竹居网络技术有限公司 Speech recognition method, apparatus, medium, and device
CN113470619B (en) * 2021-06-30 2023-08-18 北京有竹居网络技术有限公司 Speech recognition method, device, medium and equipment

Also Published As

Publication number Publication date
CN105719649B (en) 2019-07-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant