CN105719649B - Audio recognition method and device - Google Patents
Audio recognition method and device
- Publication number
- CN105719649B
- Authority
- CN
- China
- Prior art keywords
- scene
- input
- voice messaging
- voice
- identification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
Abstract
This application proposes a speech recognition method and device. The method comprises: configuring proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene; and establishing a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. With the speech recognition method and device provided by this application, speech recognition is performed using the recognition resource that corresponds to the voice input scene, which improves recognition accuracy and processing efficiency.
Description
Technical field
This application relates to the technical field of speech recognition, and in particular to a speech recognition method and device.
Background
With the development of the mobile internet, large-screen mobile phones have become mainstream, and both keyboard input and handwriting input have various limitations on them. Voice input is therefore expected to become a mainstream and preferred input method. Because voice input is more natural and has a lower learning cost, it is gradually being accepted by more and more users. Children and elderly people alike can quickly learn to use it and become accustomed to this input mode.
Existing speech recognition technology is trained on a large amount of everyday-life scene data in order to recognize voice input in different scenes. As a result, recognition accuracy is too low for some customized scenes, and in some customized scenes the input cannot be recognized at all, which wastes processing resources and reduces processing efficiency.
Summary of the invention
This application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of this application is to propose a speech recognition method that performs speech recognition using the recognition resource corresponding to the voice input scene, improving recognition accuracy and processing efficiency.
A second object of this application is to propose a speech recognition device.
To achieve the above objects, an embodiment of the first aspect of this application proposes a speech recognition method, comprising: configuring proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene; and establishing a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
The speech recognition method of the embodiments of this application configures proprietary recognition resources corresponding to customized voice scenes and a universal recognition resource corresponding to a universal voice scene, and establishes a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
To achieve the above objects, an embodiment of the second aspect of this application proposes a speech recognition device, comprising: a configuration module, for configuring proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene; and an establishing module, for establishing a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
The speech recognition device of the embodiments of this application configures proprietary recognition resources corresponding to customized voice scenes and a universal recognition resource corresponding to a universal voice scene, and establishes a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a speech recognition method according to one embodiment of this application;
Fig. 2 is a flowchart of a speech recognition method according to another embodiment of this application;
Fig. 3 is a flowchart of a speech recognition method according to yet another embodiment of this application;
Fig. 4 is a schematic structural diagram of a speech recognition device according to one embodiment of this application;
Fig. 5 is a schematic structural diagram of a speech recognition device according to another embodiment of this application.
Detailed description of the embodiments
Embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain this application and should not be construed as limiting it.
The speech recognition method and device of the embodiments of this application are described below with reference to the drawings.
Fig. 1 is a flowchart of a speech recognition method according to one embodiment of this application.
As shown in Fig. 1, the speech recognition method includes:
Step 101: configure proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene.
Specifically, the speech recognition method provided by this embodiment is applied to a terminal device with a voice input function. In general, the terminal device implements the voice input function through a human-machine voice interaction interface; the specific voice input interface may be a device such as a microphone.
It should be noted that the terminal device can provide voice input services to the user through an application that is able to access the human-machine voice interaction interface. The application can be chosen according to actual needs, for example a navigation application or a search engine with a voice input function; this embodiment places no restriction on this.
When the user needs to perform voice input, the user inputs voice information through the human-machine voice input interface, and the voice information input by the user is then recognized so that corresponding processing can be performed based on the recognition result. Different voice input applications process the recognition result differently. For example:
For a voice search application, after the voice information input by the user is recognized, search results are fed back to the user according to the recognition result; or,
For an instant messaging application, after the voice information input by the user is recognized, the recognition result is converted into text and displayed in the input box.
To improve the accuracy and processing performance of speech recognition for voice information input in different scenes, the speech recognition method provided by this embodiment first configures proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene.
It should be noted that there are many types of customized voice scenes, and different customized voice scenes correspond to different proprietary recognition resources. The specific content can be configured and selected according to the needs of different application scenes; this embodiment places no restriction on this. Examples include:
For a map navigation voice scene, the corresponding proprietary recognition resource is a place-name recognition resource; or,
For an e-commerce platform voice scene, the corresponding proprietary recognition resource is an e-commerce product-name recognition resource; or,
For a movie search voice scene, the corresponding proprietary recognition resource is a movie-title recognition resource.
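The configuration in step 101 can be pictured as a mapping from each customized voice scene to its own resource, plus a single universal resource. The following is only a minimal sketch under stated assumptions: the patent does not define what a recognition resource contains, so the `RecognitionResource` class, the scene keys, and the sample vocabulary entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class RecognitionResource:
    """Hypothetical stand-in for a recognition resource (e.g. a domain vocabulary)."""
    name: str
    vocabulary: FrozenSet[str] = field(default_factory=frozenset)


# Each customized voice scene maps to a different proprietary resource.
# Scene keys and vocabulary entries are illustrative assumptions only.
PROPRIETARY_RESOURCES = {
    "map_navigation": RecognitionResource("place_names", frozenset({"Beijing West Railway Station"})),
    "e_commerce": RecognitionResource("product_names", frozenset({"wireless earbuds"})),
    "movie_search": RecognitionResource("movie_titles", frozenset({"Crouching Tiger, Hidden Dragon"})),
}

# One universal resource serves every scene that has not been customized.
UNIVERSAL_RESOURCE = RecognitionResource("universal")
```

Keeping the proprietary resources in a dictionary keyed by scene reflects the statement that different customized scenes correspond to different resources; new scenes can be added without touching the universal resource.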
Step 102: establish a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
Specifically, the speech recognition library is established from the preconfigured proprietary recognition resources corresponding to the customized voice scenes and the universal recognition resource corresponding to the universal voice scene.
Then, when voice information input by the user is received, the input scene of the voice information is determined, together with the type of that input scene, that is, whether the input scene is a customized voice scene or a universal voice scene. The recognition resource corresponding to the input scene type is then obtained from the speech recognition library to recognize the input voice information.
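As a rough illustration of step 102, the library can be modeled as an object that holds both kinds of resource, classifies an input scene as customized or universal, and returns the matching resource. This is a sketch under assumptions, not the patented implementation: the class name `SpeechRecognitionLibrary`, the `SceneType` enum, and the use of plain strings as resources are all hypothetical.

```python
from enum import Enum


class SceneType(Enum):
    CUSTOMIZED = "customized"
    UNIVERSAL = "universal"


class SpeechRecognitionLibrary:
    """Hypothetical library holding proprietary and universal resources."""

    def __init__(self, proprietary, universal):
        self._proprietary = dict(proprietary)  # scene name -> proprietary resource
        self._universal = universal

    def scene_type(self, scene):
        # A scene is "customized" only if a proprietary resource was configured for it.
        if scene in self._proprietary:
            return SceneType.CUSTOMIZED
        return SceneType.UNIVERSAL

    def resource_for(self, scene):
        # Return the resource matching the type of the input scene.
        if self.scene_type(scene) is SceneType.CUSTOMIZED:
            return self._proprietary[scene]
        return self._universal


library = SpeechRecognitionLibrary(
    {"map_navigation": "place-name resource"},
    "universal resource",
)
```

With this sketch, `library.resource_for("map_navigation")` yields the place-name resource, while any scene without a configured entry falls through to the universal resource.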
The speech recognition method of this embodiment configures proprietary recognition resources corresponding to customized voice scenes and a universal recognition resource corresponding to a universal voice scene, and establishes a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. The recognition environment is thereby customized for different vertical scenes, and speech recognition is performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
Fig. 2 is a flowchart of a speech recognition method according to another embodiment of this application.
As shown in Fig. 2, after step 102 the method may further include the following steps:
Step 201: receive input voice information.
Step 202: determine the input scene of the voice information according to a preset scene acquisition strategy.
Specifically, voice information input by the user is received, and the input scene corresponding to the currently received voice information is determined according to the preset scene acquisition strategy.
It should be noted that different scene acquisition strategies can be preset according to actual application needs; this embodiment places no restriction on this. Examples include:
Example one: determine the input scene of the voice information according to the application program.
Specifically, the input scene of the voice information is determined according to the application program in which the user is currently performing voice input. For example, if the user inputs voice information into a map navigation application, the input scene of the voice information is determined to be map navigation.
Or,
Example two: determine the input scene of the voice information according to the context.
Specifically, the input scene of the voice information is determined according to the context of the conversation record between the user and other users. For example, in an instant messaging application, if the user's preceding conversation with other users is about travel, the input scene of the voice information is a travel scene.
Or,
Example three: determine the input scene of the voice information according to geographic location information.
Specifically, the user's current geographic location is obtained from the GPS information of the terminal device, and the input scene of the voice information is then determined according to that location. For example, when the GPS information of the terminal device indicates that the user's current location is a cinema, the input scene of the voice information is a movie scene.
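The three example strategies above can be sketched as three small functions that each return a scene name, or `None` when no scene can be inferred. Everything here is an assumption made for illustration: the lookup tables, the keyword matching used for context, and the idea that GPS coordinates have already been resolved to a place category (such as "cinema") by some step not shown.

```python
from typing import Optional, Sequence

# All identifiers and mappings below are hypothetical illustrations.
APP_TO_SCENE = {"map_navigation_app": "map_navigation"}
CONTEXT_KEYWORDS = {"travel": ("trip", "hotel", "flight")}
PLACE_TO_SCENE = {"cinema": "movie"}


def scene_from_application(app_id: str) -> Optional[str]:
    """Example one: the application receiving the voice input implies the scene."""
    return APP_TO_SCENE.get(app_id)


def scene_from_context(messages: Sequence[str]) -> Optional[str]:
    """Example two: earlier conversation content implies the scene."""
    text = " ".join(messages).lower()
    for scene, keywords in CONTEXT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return scene
    return None


def scene_from_location(place_category: str) -> Optional[str]:
    """Example three: the kind of place the user is at implies the scene."""
    return PLACE_TO_SCENE.get(place_category)
```

Returning `None` rather than raising an error leaves room for the case where the input scene cannot be determined, which a caller can then handle separately.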
Step 203: recognize the input voice information according to the input scene and the speech recognition library.
Specifically, the input voice information is recognized according to the input scene of the current voice information and the pre-established speech recognition library, as follows:
If the input scene of the current voice information is a pre-customized voice scene, the proprietary recognition resource corresponding to that customized voice scene is obtained from the speech recognition library, and the voice information is recognized using that proprietary recognition resource;
If the input scene of the current voice information is not a pre-customized voice scene, the universal recognition resource is obtained from the speech recognition library, and the voice information is recognized using the universal recognition resource.
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
Fig. 3 is a flowchart of a speech recognition method according to yet another embodiment of this application. With reference to Fig. 3, the method is described as follows:
Step 1: after voice information is received, judge whether the input scene of the voice information can be determined according to the preset scene acquisition strategy.
Step 2: if the input scene of the voice information cannot be determined, recognize the voice information using the universal recognition resource.
Step 3: if the input scene of the voice information can be determined, judge whether it is a pre-customized voice scene.
Step 4: if the input scene is a pre-customized voice scene, recognize the voice information using the proprietary recognition resource in the speech recognition library that corresponds to that customized voice scene;
Step 5: if the input scene is not a customized voice scene, recognize the voice information using the universal recognition resource in the speech recognition library.
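The five steps of Fig. 3 reduce to a short decision function. The sketch below is an illustration under assumptions: the function name `select_resource`, the use of `None` to represent an undeterminable scene, and the string placeholders for resources are hypothetical, and the actual recognition with the selected resource is not shown.

```python
def select_resource(scene, proprietary, universal):
    """Pick the recognition resource following the decision flow of Fig. 3.

    scene: scene name, or None when the input scene cannot be determined.
    proprietary: dict mapping customized scene names to proprietary resources.
    universal: the universal recognition resource.
    """
    # Steps 1-2: the scene could not be determined, so use the universal resource.
    if scene is None:
        return universal
    # Steps 3-4: the scene was determined and is a customized scene.
    if scene in proprietary:
        return proprietary[scene]
    # Step 5: the scene was determined but is not customized.
    return universal


proprietary = {"movie": "movie-title resource"}
assert select_resource("movie", proprietary, "universal resource") == "movie-title resource"
assert select_resource("weather", proprietary, "universal resource") == "universal resource"
assert select_resource(None, proprietary, "universal resource") == "universal resource"
```

Steps 2 and 5 end in the same resource; they are kept as separate branches only to mirror the flowchart.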
Building on the embodiment shown in Fig. 1, the speech recognition method of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
To implement the above embodiments, this application also proposes a speech recognition device.
Fig. 4 is a schematic structural diagram of a speech recognition device according to one embodiment of this application.
As shown in Fig. 4, the speech recognition device includes:
A configuration module 11, for configuring proprietary recognition resources corresponding to customized voice scenes, and a universal recognition resource corresponding to a universal voice scene;
Specifically, the proprietary recognition resources include at least one of the following:
a place-name recognition resource, a search hot-word recognition resource, an e-commerce product-name recognition resource, and a movie-title recognition resource.
An establishing module 12, for establishing a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information.
It should be noted that the foregoing explanation of the speech recognition method embodiments also applies to the speech recognition device of this embodiment, and is not repeated here.
The speech recognition device of this embodiment configures proprietary recognition resources corresponding to customized voice scenes and a universal recognition resource corresponding to a universal voice scene, and establishes a speech recognition library that includes the proprietary recognition resources and the universal recognition resource, so that voice information is recognized using the speech recognition library according to the input scene of the voice information. The recognition environment is thereby customized for different vertical scenes, and speech recognition is performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
Fig. 5 is a schematic structural diagram of a speech recognition device according to another embodiment of this application. As shown in Fig. 5, building on the embodiment shown in Fig. 4, the device further includes:
A receiving module 13, for receiving input voice information;
An obtaining module 14, for determining the input scene of the voice information according to a preset scene acquisition strategy;
An identification module 15, for recognizing the input voice information according to the input scene and the speech recognition library.
In one embodiment, the obtaining module 14 is used to determine the input scene of the voice information according to the application program in which the user is currently performing voice input;
Or,
In one embodiment, the obtaining module 14 is used to determine the input scene of the voice information according to the context of the conversation record between the user and other users;
Or,
In one embodiment, the obtaining module 14 is used to determine the input scene of the voice information according to the user's current geographic location information.
In one embodiment, the identification module 15 is used to:
If the input scene is the customized voice scene, recognize the voice information using the proprietary recognition resource in the speech recognition library that corresponds to the customized voice scene;
If the input scene is not the customized voice scene, recognize the voice information using the universal recognition resource in the speech recognition library.
In another embodiment, the identification module 15 is also used to:
If the input scene cannot be determined, recognize the voice information using the universal recognition resource.
It should be noted that the foregoing explanation of the speech recognition method embodiments also applies to the speech recognition device of this embodiment, and is not repeated here.
Building on the embodiment shown in Fig. 4, the speech recognition device of this embodiment further receives input voice information, determines the input scene of the voice information according to a preset scene acquisition strategy, and recognizes the input voice information according to the input scene and the speech recognition library. Speech recognition is thereby performed using the recognition resource corresponding to the voice input scene, which improves recognition accuracy and processing efficiency.
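One way to picture how modules 11 to 15 cooperate is a single class whose methods play the roles of the modules. This is a loose sketch under assumptions, not the device's actual structure: the class `SpeechRecognitionDevice`, its method names, the `request_context` dictionary, and the pluggable `scene_strategy` callable are all hypothetical, and the recognition step is a placeholder that only reports which resource was selected.

```python
class SpeechRecognitionDevice:
    """Hypothetical composition of modules 11 to 15."""

    def __init__(self, proprietary, universal, scene_strategy):
        # Configuration module 11 and establishing module 12:
        # configure the resources and build the speech recognition library.
        self.library = {"proprietary": dict(proprietary), "universal": universal}
        # Strategy used by obtaining module 14; returns a scene name or None.
        self.scene_strategy = scene_strategy

    def receive(self, audio):
        # Receiving module 13: accept the input voice information.
        return audio

    def obtain_scene(self, request_context):
        # Obtaining module 14: determine the input scene.
        return self.scene_strategy(request_context)

    def recognize(self, audio, request_context):
        # Identification module 15: choose the resource for the scene.
        voice = self.receive(audio)
        scene = self.obtain_scene(request_context)
        # Unknown or undeterminable scenes fall back to the universal resource.
        resource = self.library["proprietary"].get(scene, self.library["universal"])
        # Placeholder: a real device would decode `voice` with `resource` here.
        return resource, voice


device = SpeechRecognitionDevice(
    {"movie": "movie-title resource"},
    "universal resource",
    lambda ctx: ctx.get("scene"),
)
```

Passing the scene strategy in as a callable keeps the application-based, context-based, and location-based strategies interchangeable without changing the device class.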
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. In this specification, such schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of this application, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of this application includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of this application belong.
The logic and/or steps represented in a flowchart, or otherwise described herein, may for example be regarded as an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of this application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of this application have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting this application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of this application.
Claims (12)
1. a kind of audio recognition method, which comprises the following steps:
Configure proprietary identification resource corresponding with customized voice scene, and universal identification corresponding with universal phonetic scene money
Source;Wherein, different customized voice scenes corresponds to different proprietary identification resources;
Establishing includes the proprietary speech recognition library for identifying resource and the universal identification resource, according to the defeated of voice messaging
Enter scene, and determine the type of the input scene, is obtained from the speech recognition library corresponding with the type of the input scene
Identification resource the voice messaging is identified.
2. the method as described in claim 1, which is characterized in that the proprietary identification resource includes at least one of:
Place name identification resource, search hot word identification resource, electric business product name identification resource, movie name identify resource.
3. method according to claim 1 or 2, which is characterized in that further include:
Receive the voice messaging of input;
According to the determining input scene with the voice messaging of preset scene acquisition strategy;
The voice messaging of input is identified according to the input scene and the speech recognition library.
4. method as claimed in claim 3, which is characterized in that described according to preset scene acquisition strategy determination and institute's predicate
The input scene of message breath, comprising:
The application program that voice input is currently carried out according to user determines the input scene of the voice messaging;
Alternatively,
The input scene of the voice messaging is determined according to the context of user and other users session log;
Alternatively,
The input scene of the voice messaging is determined according to the current geographical location information of user.
5. method as claimed in claim 3, which is characterized in that described according to the input scene and the speech recognition library pair
The voice messaging of input is identified, comprising:
If the input scene is the customized voice scene, using in the speech recognition library with the customized voice scene
Corresponding proprietary identification resource, identifies the voice messaging;
Universal identification money if the input scene is not the customized voice scene, in the application speech recognition library
Source identifies the voice messaging.
6. method as claimed in claim 3, which is characterized in that further include:
If the input scene can not be determined, the voice messaging is identified using the universal identification resource.
7. a kind of speech recognition equipment characterized by comprising
Configuration module is used to configure proprietary identification resource corresponding with customized voice scene, and corresponding with universal phonetic scene
Universal identification resource;Wherein, different customized voice scenes corresponds to different proprietary identification resources;
Module is established, includes the proprietary speech recognition library for identifying resource and the universal identification resource for establishing, with root
It according to the input scene of voice messaging, and determines the type of the input scene, is obtained and the input from the speech recognition library
The corresponding identification resource of the type of scene identifies the voice messaging.
8. device as claimed in claim 7, which is characterized in that the proprietary identification resource includes at least one of:
Place name identification resource, search hot word identification resource, electric business product name identification resource, movie name identify resource.
9. The device as claimed in claim 7 or 8, further comprising:
a receiving module, configured to receive input voice information;
an obtaining module, configured to determine the input scene of the voice information according to a preset scene acquisition strategy;
a recognition module, configured to recognize the input voice information according to the input scene and the speech recognition library.
10. The device as claimed in claim 9, wherein the obtaining module is configured to:
determine the input scene of the voice information according to the application in which the user is currently performing voice input;
or,
determine the input scene of the voice information according to the context of the conversation record between the user and other users;
or,
determine the input scene of the voice information according to the current geographic location information of the user.
11. The device as claimed in claim 9, wherein the recognition module is configured to:
if the input scene is the customized voice scene, recognize the voice information using the proprietary recognition resource in the speech recognition library that corresponds to the customized voice scene;
if the input scene is not the customized voice scene, recognize the voice information using the universal recognition resource in the speech recognition library.
12. The device as claimed in claim 9, wherein the recognition module is further configured to:
if the input scene cannot be determined, recognize the voice information using the universal recognition resource.
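As an informal illustration only, and not part of the claims, the scene-based selection recited above might be sketched in Python as follows. Everything named here is an assumption made for the example: the `SpeechRecognitionLibrary` class, the `determine_scene` and `recognize` functions, the lookup tables, and the stub recognizers are all hypothetical, and trying the three scene cues in a fixed order is a simplification (the claims present them as alternatives). A real recognition resource would be a vocabulary or model rather than a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# A "recognition resource" is modeled as a callable from audio bytes to text.
# Real resources (vocabularies, language models) are outside this sketch.
Recognizer = Callable[[bytes], str]


def make_stub(name: str) -> Recognizer:
    """Return a placeholder recognizer that only reports which resource ran."""
    return lambda audio: f"<{name}: {len(audio)} bytes>"


@dataclass
class SpeechRecognitionLibrary:
    """Holds one universal resource plus one proprietary resource per scene."""
    universal: Recognizer
    proprietary: Dict[str, Recognizer] = field(default_factory=dict)

    def select(self, scene: Optional[str]) -> Recognizer:
        # A customized scene maps to its own proprietary resource.
        # Any other scene, or an undetermined one (None), falls back
        # to the universal resource.
        if scene is not None and scene in self.proprietary:
            return self.proprietary[scene]
        return self.universal


# Hypothetical lookup tables; all keys and scene names are invented.
APP_TO_SCENE = {"maps": "place_name", "shopping": "product_name",
                "video": "movie_title"}
KEYWORD_TO_SCENE = {"ticket": "movie_title", "address": "place_name"}
PLACE_TO_SCENE = {"cinema": "movie_title", "mall": "product_name"}


def determine_scene(app: Optional[str] = None,
                    chat_context: Optional[str] = None,
                    place_type: Optional[str] = None) -> Optional[str]:
    """Check the application, session context, and location cues in turn.

    Returns None when no cue identifies a scene.
    """
    if app in APP_TO_SCENE:
        return APP_TO_SCENE[app]
    if chat_context:
        lowered = chat_context.lower()
        for keyword, scene in KEYWORD_TO_SCENE.items():
            if keyword in lowered:
                return scene
    if place_type in PLACE_TO_SCENE:
        return PLACE_TO_SCENE[place_type]
    return None


def recognize(library: SpeechRecognitionLibrary, audio: bytes, **cues) -> str:
    """Determine the input scene, pick the matching resource, and apply it."""
    scene = determine_scene(**cues)
    return library.select(scene)(audio)


library = SpeechRecognitionLibrary(
    universal=make_stub("universal"),
    proprietary={
        "place_name": make_stub("place_name"),
        "product_name": make_stub("product_name"),
        "movie_title": make_stub("movie_title"),
    },
)

print(recognize(library, b"abc", app="maps"))        # uses the place-name resource
print(recognize(library, b"abc", app="calculator"))  # falls back to universal
```

Keeping the universal resource as a mandatory field and the proprietary resources as an optional mapping means the fallback path always exists, which mirrors the requirement that an undetermined scene still yields a recognition result.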
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610035394.3A CN105719649B (en) | 2016-01-19 | 2016-01-19 | Audio recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610035394.3A CN105719649B (en) | 2016-01-19 | 2016-01-19 | Audio recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105719649A CN105719649A (en) | 2016-06-29 |
CN105719649B CN105719649B (en) | 2019-07-05 |
Family
ID=56147425
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610035394.3A Active CN105719649B (en) | 2016-01-19 | 2016-01-19 | Audio recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105719649B (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107122179A (en) * | 2017-03-31 | 2017-09-01 | 阿里巴巴集团控股有限公司 | The function control method and device of voice |
CN108288467B (en) * | 2017-06-07 | 2020-07-14 | 腾讯科技(深圳)有限公司 | Voice recognition method and device and voice recognition engine |
CN107463700B (en) * | 2017-08-15 | 2020-09-08 | 北京百度网讯科技有限公司 | Method, device and equipment for acquiring information |
CN107728783B (en) * | 2017-09-25 | 2021-05-18 | 联想(北京)有限公司 | Artificial intelligence processing method and system |
CN109920429A (en) * | 2017-12-13 | 2019-06-21 | 上海擎感智能科技有限公司 | It is a kind of for vehicle-mounted voice recognition data processing method and system |
CN110299136A (en) * | 2018-03-22 | 2019-10-01 | 上海擎感智能科技有限公司 | A kind of processing method and its system for speech recognition |
CN109087639B (en) * | 2018-08-02 | 2021-01-15 | 泰康保险集团股份有限公司 | Method, apparatus, electronic device and computer readable medium for speech recognition |
TWI698857B (en) * | 2018-11-21 | 2020-07-11 | 財團法人工業技術研究院 | Speech recognition system and method thereof, and computer program product |
CN109360565A (en) * | 2018-12-11 | 2019-02-19 | 江苏电力信息技术有限公司 | A method of precision of identifying speech is improved by establishing resources bank |
CN111312233A (en) * | 2018-12-11 | 2020-06-19 | 阿里巴巴集团控股有限公司 | Voice data identification method, device and system |
CN111312235B (en) * | 2018-12-11 | 2023-06-30 | 阿里巴巴集团控股有限公司 | Voice interaction method, device and system |
CN109671421B (en) * | 2018-12-25 | 2020-07-10 | 苏州思必驰信息科技有限公司 | Off-line navigation customizing and implementing method and device |
CN110349575A (en) * | 2019-05-22 | 2019-10-18 | 深圳壹账通智能科技有限公司 | Method, apparatus, electronic equipment and the storage medium of speech recognition |
CN111049996B (en) * | 2019-12-26 | 2021-06-15 | 思必驰科技股份有限公司 | Multi-scene voice recognition method and device and intelligent customer service system applying same |
CN111161739B (en) * | 2019-12-28 | 2023-01-17 | 科大讯飞股份有限公司 | Speech recognition method and related product |
CN113223510B (en) * | 2020-01-21 | 2022-09-20 | 青岛海尔电冰箱有限公司 | Refrigerator and equipment voice interaction method and computer readable storage medium thereof |
CN111583909B (en) * | 2020-05-18 | 2024-04-12 | 科大讯飞股份有限公司 | Voice recognition method, device, equipment and storage medium |
CN112687261B (en) * | 2020-12-15 | 2022-05-03 | 思必驰科技股份有限公司 | Speech recognition training and application method and device |
CN113470619B (en) * | 2021-06-30 | 2023-08-18 | 北京有竹居网络技术有限公司 | Speech recognition method, device, medium and equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6738784B1 (en) * | 2000-04-06 | 2004-05-18 | Dictaphone Corporation | Document and information processing system |
CN101329868A (en) * | 2008-07-31 | 2008-12-24 | 林超 | Speech recognition optimizing system aiming at locale language use preference and method thereof |
CN103674012A (en) * | 2012-09-21 | 2014-03-26 | 高德软件有限公司 | Voice customizing method and device and voice identification method and device |
CN104240698A (en) * | 2014-09-24 | 2014-12-24 | 上海伯释信息科技有限公司 | Voice recognition method |
CN105225665A (en) * | 2015-10-15 | 2016-01-06 | 桂林电子科技大学 | A kind of audio recognition method and speech recognition equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8069159B2 (en) * | 2004-09-07 | 2011-11-29 | Robert O Stuart | More efficient search algorithm (MESA) using prioritized search sequencing |
-
2016
- 2016-01-19 CN CN201610035394.3A patent/CN105719649B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN105719649A (en) | 2016-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105719649B (en) | Audio recognition method and device | |
KR102295935B1 (en) | Digital personal assistant interaction with impersonations and rich multimedia in responses | |
CN109101545A (en) | Natural language processing method, apparatus, equipment and medium based on human-computer interaction | |
US9154629B2 (en) | System and method for generating personalized tag recommendations for tagging audio content | |
CN105551480B (en) | Dialect conversion method and device | |
CN106372059A (en) | Information input method and information input device | |
US11308949B2 (en) | Voice assistant response system based on a tone, keyword, language or etiquette behavioral rule | |
US20180226073A1 (en) | Context-based cognitive speech to text engine | |
US11443227B2 (en) | System and method for cognitive multilingual speech training and recognition | |
CN110444229A (en) | Communication service method, device, computer equipment and storage medium based on speech recognition | |
CN108682420A (en) | A kind of voice and video telephone accent recognition method and terminal device | |
CN109360565A (en) | A method of precision of identifying speech is improved by establishing resources bank | |
US20230401978A1 (en) | Enhancing video language learning by providing catered context sensitive expressions | |
CN110517668A (en) | A kind of Chinese and English mixing voice identifying system and method | |
CN115952272A (en) | Method, device and equipment for generating dialogue information and readable storage medium | |
CN110059313A (en) | Translation processing method and device | |
Subirana | Call for a wake standard for artificial intelligence | |
Inupakutika et al. | Integration of NLP and Speech-to-text Applications with Chatbots | |
CN117370512A (en) | Method, device, equipment and storage medium for replying to dialogue | |
JP2022531994A (en) | Generation and operation of artificial intelligence-based conversation systems | |
US10681402B2 (en) | Providing relevant and authentic channel content to users based on user persona and interest | |
US20230215417A1 (en) | Using token level context to generate ssml tags | |
CN109979458A (en) | News interview original text automatic generation method and relevant device based on artificial intelligence | |
CN115705705A (en) | Video identification method, device, server and storage medium based on machine learning | |
US10657692B2 (en) | Determining image description specificity in presenting digital content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||