CN109166581A - Speech recognition method, apparatus, electronic device, and computer-readable storage medium

Info

Publication number
CN109166581A
Authority
CN
China
Prior art keywords
speech recognition
voice
voice messaging
result
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811126926.XA
Other languages
Chinese (zh)
Inventor
叶顺平
邹明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chumen Wenwen Information Technology Co Ltd
Original Assignee
Chumen Wenwen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chumen Wenwen Information Technology Co Ltd
Priority to CN201811126926.XA
Publication of CN109166581A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Abstract

Embodiments of the invention provide a speech recognition method, apparatus, electronic device, and computer-readable storage medium, applied in the technical field of speech recognition. The method comprises: for a voice interaction process between a user and a terminal device, obtaining the interaction information corresponding to that process; establishing a speech recognition corpus from the interaction information; and, on receiving first voice information that the user inputs based on the interaction information, performing speech recognition on the first voice information by means of the speech recognition corpus to determine a first recognition result. Because the recognition result for the voice information to be recognized is confined to the speech recognition corpus established from the interaction information, the range of candidate recognition results is narrowed, which can improve speech recognition accuracy and, in turn, the user experience.

Description

Speech recognition method, apparatus, electronic device, and computer-readable storage medium
Technical field
Embodiments of the invention relate to the technical field of speech recognition, and in particular to a speech recognition method, apparatus, electronic device, and computer-readable storage medium.
Background
As speech recognition technology has developed, it has entered a stage of application in a wider range of fields. Providing users with services such as search and navigation through voice input has become a research focus, and solving the recognition problems caused by the ambiguity of speech has become key to giving users a better service experience.
At present, recognizing a segment of speech relies on a trained acoustic model and a trained language model. The language model yields a probability for each of the candidate text words that may correspond to the voice information to be recognized, and the candidate with the highest probability is taken as the recognition result. For example, when recognizing voice information pronounced "yiyibushe", the language model may assign the highest probability to the text word "with reluctance", so the voice information is recognized as "with reluctance". If, however, the user said "yiyibushe" because they want to go to a clothing shop named "clothes are not given up", existing speech recognition technology still returns "with reluctance", which is not the result the user wants. In the course of implementation, the inventors found that the ambiguity of speech leads to low speech recognition accuracy.
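The prior-art behavior described above can be illustrated with a minimal Python sketch. The probability values and the function name `recognize_by_max_probability` are assumptions made for illustration only; they are not taken from the patent or from any real language model.

```python
# Hypothetical language-model probabilities for the candidate text words that
# could correspond to the syllable sequence "yiyibushe". The numbers are
# invented for illustration.
candidate_probabilities = {
    "with reluctance": 0.92,
    "clothes are not given up": 0.03,
}


def recognize_by_max_probability(candidates):
    """Return the candidate text word with the highest probability."""
    return max(candidates, key=candidates.get)


baseline_result = recognize_by_max_probability(candidate_probabilities)
print(baseline_result)
```

Under these assumed scores the common phrase always wins, regardless of what the user was just shown on screen, which is the ambiguity problem the background describes.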
Summary of the invention
In view of this, embodiments of the invention provide a speech recognition method, apparatus, electronic device, and computer-readable storage medium that improve the accuracy of speech recognition and thereby the user experience.
To solve the above problem, embodiments of the invention mainly provide the following technical solutions:
In a first aspect, a speech recognition method based on interaction information is provided, the method comprising:
for a voice interaction process between a user and a terminal device, obtaining the interaction information corresponding to the voice interaction process;
establishing a speech recognition corpus according to the interaction information;
when first voice information input by the user based on the interaction information is received, performing speech recognition on the first voice information by means of the speech recognition corpus and determining a first recognition result.
In a second aspect, a speech recognition apparatus based on interaction information is provided, the apparatus comprising:
an obtaining module, configured to obtain, for a voice interaction process between a user and a terminal device, the interaction information corresponding to the voice interaction process;
an establishing module, configured to establish a speech recognition corpus according to the interaction information obtained by the obtaining module;
a receiving module, configured to receive first voice information input by the user based on the interaction information obtained by the obtaining module;
a recognition module, configured to perform speech recognition on the first voice information received by the receiving module, by means of the speech recognition corpus established by the establishing module, and to determine a first recognition result.
In a third aspect, an electronic device is provided, the electronic device comprising:
a processor, a memory, a communication interface, and a bus;
wherein the processor, the memory, and the communication interface communicate with one another through the bus;
the communication interface is used for transmitting information between the electronic device and the communication devices of related equipment;
the processor is used to call program instructions in the memory to execute the speech recognition method based on interaction information of the first aspect.
In a fourth aspect, a non-transitory computer-readable storage medium is provided. The storage medium stores computer instructions that cause a computer to execute the speech recognition method based on interaction information of the first aspect.
With the above technical solutions, the technical solutions provided by embodiments of the invention have at least the following advantages:
Embodiments of the invention provide a speech recognition method, apparatus, electronic device, and computer-readable storage medium. In the prior art, the text word with the highest probability for the voice information to be recognized is taken as the recognition result. By contrast, embodiments of the invention obtain the interaction information corresponding to a voice interaction process between a user and a terminal device, establish a speech recognition corpus from that interaction information, and, on receiving first voice information that the user inputs based on the interaction information, perform speech recognition on the first voice information by means of the corpus to determine a first recognition result. The recognition result is thus confined to the speech recognition corpus established from the interaction information, which narrows the range of candidate recognition results, can improve speech recognition accuracy, and in turn improves the user experience.
The above is only an overview of the technical solutions of the embodiments of the invention. So that the technical means of the embodiments can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the embodiments are more apparent, specific embodiments are set out below.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the embodiments of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a schematic flowchart of a speech recognition method based on interaction information provided by an embodiment of the invention;
Fig. 2 shows a schematic structural diagram of a speech recognition apparatus based on interaction information provided by an embodiment of the invention;
Fig. 3 shows a schematic structural diagram of another speech recognition apparatus based on interaction information provided by an embodiment of the invention;
Fig. 4 shows a schematic structural diagram of an electronic device provided by an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Embodiments of the invention provide a speech recognition method based on interaction information. As shown in Fig. 1, the method includes:
Step S101: for a voice interaction process between a user and a terminal device, obtain the interaction information corresponding to the voice interaction process.
In this embodiment, the user may interact with an intelligent terminal device such as a tablet computer, smartphone, palmtop computer, wearable device, mobile internet device (MID), or in-vehicle navigation system, either by voice input or by touch or key input through the device's screen or keyboard. Corresponding interaction information is generated during the interaction between the user and the terminal device, and the terminal device interacting with the user obtains that interaction information.
Step S102: establish a speech recognition corpus according to the interaction information.
In this embodiment, after obtaining the interaction information, the terminal device interacting with the user either stores the obtained interaction information directly to establish a speech recognition corpus, or first converts the obtained interaction information and then stores it to establish the corpus.
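The two storage options in step S102 (store directly, or convert and then store) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_corpus`, the list-based corpus, and the lower-casing conversion are all assumptions.

```python
def build_corpus(interaction_info, convert=None):
    """Build a speech recognition corpus from interaction information.

    interaction_info: iterable of text entries produced during the interaction.
    convert: optional function applied to each entry before it is stored
             (the "conversion" of step S102; its form is an assumption).
    """
    corpus = []
    for entry in interaction_info:
        stored = convert(entry) if convert is not None else entry
        if stored not in corpus:  # skip duplicates, keep first-seen order
            corpus.append(stored)
    return corpus


# Option 1: store the interaction information as obtained.
raw_corpus = build_corpus(["Shop A", "Shop B", "Shop A"])

# Option 2: convert each entry (here: lower-case it) and then store it.
normalized_corpus = build_corpus(["Shop A", "Shop B"], convert=str.lower)
```

In a real system the conversion might instead map each entry to a phonetic form, as the later syllable-sequence embodiment suggests.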
Step S103: when first voice information input by the user based on the interaction information is received, perform speech recognition on the first voice information by means of the speech recognition corpus and determine a first recognition result.
In this embodiment, during the interaction the user receives the interaction information fed back by the terminal device and, based on that feedback, can decide on a further operation as needed, for example issuing a voice command to the terminal device. After receiving the voice command, the terminal device recognizes it by means of the established speech recognition corpus to obtain the corresponding speech recognition result.
Embodiments of the invention provide a speech recognition method based on interaction information. In the prior art, the text word with the highest probability for the voice information to be recognized is taken as the recognition result. By contrast, embodiments of the invention obtain the interaction information corresponding to a voice interaction process between a user and a terminal device, establish a speech recognition corpus from that interaction information, and, on receiving first voice information that the user inputs based on the interaction information, perform speech recognition on the first voice information by means of the corpus to determine a first recognition result. The recognition result is thus confined to the speech recognition corpus established from the interaction information, which narrows the range of candidate recognition results, can improve speech recognition accuracy, and in turn improves the user experience.
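Steps S101 to S103 can be sketched end to end as below. The class `InteractionSession`, the sample entries, and the use of approximate string matching from Python's standard `difflib` module are assumptions chosen for illustration; the patent does not specify how the match against the corpus is computed.

```python
import difflib


class InteractionSession:
    """Toy model of steps S101 to S103 for one user/terminal interaction."""

    def __init__(self):
        self.interaction_info = []  # filled by S101
        self.corpus = {}            # filled by S102: lookup key -> original text

    def obtain_interaction_info(self, entries):
        # S101: collect what the terminal device fed back during the interaction.
        self.interaction_info.extend(entries)

    def establish_corpus(self):
        # S102: store each entry under a lower-cased key (an assumed conversion).
        self.corpus = {entry.lower(): entry for entry in self.interaction_info}

    def recognize(self, decoded_text):
        # S103: only entries in the corpus can be returned as the result.
        keys = list(self.corpus)
        matches = difflib.get_close_matches(
            decoded_text.lower(), keys, n=1, cutoff=0.6
        )
        return self.corpus[matches[0]] if matches else None


session = InteractionSession()
session.obtain_interaction_info(["Blue Harbor Cafe", "Green Park Hotel"])
session.establish_corpus()
first_result = session.recognize("blue harbour cafe")  # slightly misdecoded input
no_result = session.recognize("zzz")                   # nothing close in the corpus
```

The design point is that `recognize` can only ever return something the user was just shown, so a slightly wrong decoding still resolves to the intended entry, while unrelated input yields no result.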
Embodiments of the invention provide another possible implementation, in which step S101 includes:
Step S1011 (not shown in the figure): when second voice information input by the user is received, perform speech recognition on the second voice information, determine a second recognition result, and use the second recognition result as the interaction information.
In this embodiment, the user can issue a voice command to a terminal device, such as a navigation device, according to their needs. On receiving the voice command, the terminal device decodes it using speech recognition technology to obtain the corresponding text result. The terminal device may also perform a corresponding operation according to the decoded information and thereby obtain operation result information. The terminal device uses this result information as the interaction information between the user and the terminal device.
For example, user A wants to buy clothes near the current location and speaks a corresponding voice command to a mobile phone. After receiving the command, the phone decodes it using speech recognition technology and obtains the speech recognition result "nearby clothing shops". The phone then determines its current geographic location, performs a search that combines the speech recognition result with that location, and determines the corresponding operation result information. The phone uses this operation result information as the interaction information between user A and the phone.
After the second recognition result is determined, the method further includes:
Step S104 (not shown in the figure): display the second recognition result.
Continuing the example, the phone can present the operation result information obtained from the search to user A on the screen or by voice.
The first voice information input by the user based on the interaction information in step S103 includes:
first voice information input by the user based on the second recognition result.
Continuing the example, after learning the result information that the phone presents on screen or by voice prompt, user A can issue a corresponding voice command as needed.
Step S102 includes:
Step S1021 (not shown in the figure): establish the speech recognition corpus according to the second speech recognition result.
Continuing the example, the phone stores the operation result information obtained from the search to form the speech recognition corpus, or converts the operation result information and then stores it to form the corpus.
In this embodiment, speech recognition is performed on the second voice information input by the user to obtain the second recognition result, the second recognition result is displayed to the user, and the speech recognition corpus is established according to the second speech recognition result. The corpus so established matches the interaction information generated during the preceding human-machine interaction, which provides a reliable basis for subsequently performing speech recognition against the corpus and narrowing the range of recognition results.
Embodiments of the invention provide another possible implementation. When the interaction process in step S101 is a voice search interaction process, performing speech recognition on the second voice information and determining the second recognition result in step S1011 includes:
Step A (not shown in the figure): perform speech recognition on the second voice information to obtain a second speech recognition result.
In this embodiment, the user can issue a voice command to a terminal device, such as a navigation device, according to their needs. On receiving the voice command, the terminal device decodes it using speech recognition technology and obtains a speech recognition result in the form of text words or syllables.
Step B (not shown in the figure): search a general search library according to the second speech recognition result, determine the search results corresponding to the second speech recognition result, and use those search results as the second recognition result.
In this embodiment, the terminal device searches with the obtained text-word or syllable speech recognition result through a third-party search engine or another retrieval interface and obtains the corresponding search results. These search results serve as the recognition result for the user's voice command.
For example, user A wants to go to a nearby clothing shop to buy clothes and speaks a corresponding voice command to a mobile phone. The phone recognizes the command and obtains the speech recognition result "nearby clothing shops". The phone then performs a positioning operation to determine the current geographic location and searches through a predetermined search engine, obtaining several clothing shops near user A's current location, including the shop "clothes are not given up". The information on these clothing shops is the recognition result of the voice command.
In this embodiment, the user's second voice information is recognized and a search is performed according to the resulting second speech recognition result to obtain the second recognition result of the second voice information, so that relevant search results matching the user's needs can be provided in response to the user's voice command.
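Steps A and B can be sketched as two small functions. Everything here is a stand-in under stated assumptions: `decode_second_voice` replaces a real speech decoder, `GENERAL_SEARCH_LIBRARY` is an invented in-memory substitute for a third-party search engine, and the category-matching rule and the entries "Shop B" and "Cafe C" are hypothetical.

```python
# Hypothetical general search library: each record has a name and a category.
GENERAL_SEARCH_LIBRARY = [
    {"name": "clothes are not given up", "category": "clothing shop"},
    {"name": "Shop B", "category": "clothing shop"},
    {"name": "Cafe C", "category": "cafe"},
]


def decode_second_voice(audio):
    """Step A stand-in: a real system would run a speech decoder here."""
    return audio["transcript"]


def search_general_library(query, library):
    """Step B: return the names of records whose category appears in the query."""
    return [record["name"] for record in library if record["category"] in query]


second_speech_result = decode_second_voice({"transcript": "nearby clothing shop"})
second_recognition_result = search_general_library(
    second_speech_result, GENERAL_SEARCH_LIBRARY
)
```

The list in `second_recognition_result` is what would be shown to the user and what a later step would turn into the speech recognition corpus.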
Embodiments of the invention provide another possible implementation, in which step S103 includes:
Step S1031 (not shown in the figure): perform speech recognition on the first voice information to obtain a first speech recognition result.
Step S1032 (not shown in the figure): search the speech recognition corpus with the first speech recognition result, obtain the search result corresponding to the first speech recognition result, and use that search result as the first recognition result.
The first speech recognition result and the second speech recognition result of step A may each be a syllable sequence.
The speech recognition corpus may be a syllable-sequence corpus.
In this embodiment, after the first voice information issued by the user according to the interaction information is received, it is decoded using speech recognition technology to obtain the first speech recognition result. The speech recognition corpus is then searched with the first speech recognition result to obtain the corresponding search result, and that search result is used as the first recognition result of the user's first voice information.
For example, user A has previously carried out a voice search with a mobile phone and obtained several nearby clothing shops, including the shop "clothes are not given up". The phone adds the information on these shops to the speech recognition corpus. From the shops listed, the user chooses to go to "clothes are not given up" and issues a voice command pronounced "yiyibushe". After receiving the command, the phone decodes it using speech recognition technology and obtains the syllable sequence "yiyibushe". Then, according to the index relations between syllable sequences and text words stored in the speech recognition corpus, the phone finds the index relation between the syllable sequence "yiyibushe" and "clothes are not given up", and so determines that the recognition result of the "yiyibushe" voice command is "clothes are not given up".
In this embodiment, the established speech recognition corpus confines the recognition result of the first voice information, which the user issues according to the interaction information, to the corpus. This narrows the range of first recognition results that the voice information may correspond to and thereby improves speech recognition accuracy.
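The syllable-sequence lookup in the example can be sketched with a plain dictionary acting as the index between syllable sequences and text words. The function names and the second corpus entry (`"placeholder_syllables_b"`, "Shop B") are hypothetical; only the "yiyibushe" pair comes from the text.

```python
def build_syllable_corpus(entries):
    """Index text words by syllable sequence; entries are (syllables, text) pairs."""
    return {syllables: text for syllables, text in entries}


def recognize_first_voice(syllable_sequence, corpus):
    """Step S1032: look the decoded syllable sequence up in the corpus only."""
    return corpus.get(syllable_sequence)


# Corpus built from the earlier search results shown to the user.
syllable_corpus = build_syllable_corpus([
    ("yiyibushe", "clothes are not given up"),
    ("placeholder_syllables_b", "Shop B"),  # hypothetical second shop
])

first_recognition_result = recognize_first_voice("yiyibushe", syllable_corpus)
unknown_result = recognize_first_voice("nihao", syllable_corpus)
```

Because the homophone "with reluctance" was never part of the interaction, it is not in the corpus and cannot be returned, which is how the restriction resolves the ambiguity.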
Embodiments of the invention provide a speech recognition apparatus based on interaction information. As shown in Fig. 2, the speech recognition apparatus 20 may include an obtaining module 201, an establishing module 202, a receiving module 203, and a recognition module 204, wherein:
The obtaining module 201 is configured to obtain, for a voice interaction process between a user and a terminal device, the interaction information corresponding to the voice interaction process.
The establishing module 202 is configured to establish a speech recognition corpus according to the interaction information obtained by the obtaining module 201.
The receiving module 203 is configured to receive first voice information input by the user based on the interaction information obtained by the obtaining module 201.
The recognition module 204 is configured to perform speech recognition on the first voice information received by the receiving module 203, by means of the speech recognition corpus established by the establishing module 202, and to determine a first recognition result.
The speech recognition apparatus based on interaction information of this embodiment can execute the speech recognition method based on interaction information provided in the above embodiment of the invention. The implementation principle is similar and is not repeated here.
Embodiments of the invention provide a speech recognition apparatus based on interaction information. In the prior art, the text word with the highest probability for the voice information to be recognized is taken as the recognition result. By contrast, embodiments of the invention obtain the interaction information corresponding to a voice interaction process between a user and a terminal device, establish a speech recognition corpus from that interaction information, and, on receiving first voice information that the user inputs based on the interaction information, perform speech recognition on the first voice information by means of the corpus to determine a first recognition result. The recognition result is thus confined to the speech recognition corpus established from the interaction information, which narrows the range of candidate recognition results, can improve speech recognition accuracy, and in turn improves the user experience.
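The module structure of Fig. 2 can be sketched as four small classes wired together by a device class. The class and method names are assumptions, the "voice" inputs are plain strings standing in for audio, and recognition is reduced to a membership check for brevity.

```python
class ObtainingModule:
    def obtain(self, interaction_log):
        # Module 201: collect the interaction information.
        return list(interaction_log)


class EstablishingModule:
    def establish(self, interaction_info):
        # Module 202: build the corpus (here simply a set of phrases).
        return set(interaction_info)


class ReceivingModule:
    def receive(self, voice_input):
        # Module 203: stand-in for audio capture; the "voice" is already text.
        return voice_input.strip()


class RecognitionModule:
    def recognize(self, corpus, first_voice):
        # Module 204: accept only results that lie inside the corpus.
        return first_voice if first_voice in corpus else None


class SpeechRecognitionDevice:
    """Wires modules 201 to 204 together in the order the text describes."""

    def __init__(self):
        self.obtaining = ObtainingModule()
        self.establishing = EstablishingModule()
        self.receiving = ReceivingModule()
        self.recognition = RecognitionModule()
        self.corpus = set()

    def interact(self, interaction_log):
        info = self.obtaining.obtain(interaction_log)
        self.corpus = self.establishing.establish(info)

    def handle_voice(self, voice_input):
        first_voice = self.receiving.receive(voice_input)
        return self.recognition.recognize(self.corpus, first_voice)


device = SpeechRecognitionDevice()
device.interact(["Shop A", "Shop B"])
accepted = device.handle_voice(" Shop B ")
rejected = device.handle_voice("Shop C")
```

Keeping each module behind its own class mirrors the data flow in the text: the establishing module consumes what the obtaining module produced, and the recognition module consumes both the corpus and the received voice information.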
Embodiments of the invention provide another speech recognition apparatus based on interaction information. As shown in Fig. 3, the apparatus of this embodiment may include an obtaining module 301, an establishing module 302, a receiving module 303, and a recognition module 304, wherein:
The obtaining module 301 is configured to obtain, for a voice interaction process between a user and a terminal device, the interaction information corresponding to the voice interaction process.
The function of the obtaining module 301 in Fig. 3 is the same as or similar to that of the obtaining module 201 in Fig. 2.
The establishing module 302 is configured to establish a speech recognition corpus according to the interaction information obtained by the obtaining module 301.
The function of the establishing module 302 in Fig. 3 is the same as or similar to that of the establishing module 202 in Fig. 2.
The receiving module 303 is configured to receive first voice information input by the user based on the interaction information obtained by the obtaining module 301.
The function of the receiving module 303 in Fig. 3 is the same as or similar to that of the receiving module 203 in Fig. 2.
The recognition module 304 is configured to perform speech recognition on the first voice information received by the receiving module 303, by means of the speech recognition corpus established by the establishing module 302, and to determine a first recognition result.
The function of the recognition module 304 in Fig. 3 is the same as or similar to that of the recognition module 204 in Fig. 2.
Specifically, the receiving module 303 is specifically configured to receive second voice information input by the user;
the recognition module 304 is specifically configured to perform speech recognition on the second voice information received by the receiving module 303 and to determine a second recognition result;
the obtaining module 301 is specifically configured to use the second recognition result determined by the recognition module 304 as the interaction information.
The apparatus further includes a display module 305.
The display module 305 is configured to display the second recognition result.
The first voice information input by the user based on the interaction information includes:
first voice information input by the user based on the second recognition result.
Specifically, the establishing module 302 is specifically configured to establish the speech recognition corpus according to the second speech recognition result determined by the recognition module 304.
In this embodiment, speech recognition is performed on the second voice information input by the user to obtain the second recognition result, the second recognition result is displayed to the user, and the speech recognition corpus is established according to the second speech recognition result. The corpus so established matches the interaction information generated during the preceding human-machine interaction, which provides a reliable basis for subsequently performing speech recognition against the corpus and narrowing the range of recognition results.
Specifically, when interactive voice process is phonetic search interactive process,
Identification module 304, including recognition unit 3041 and search unit 3042;
Recognition unit 3041 obtains the second speech recognition result for carrying out speech recognition to the second voice messaging;
Search unit 3042 passes through general inspection specifically for the second speech recognition result obtained according to recognition unit 3041 Suo Ku is scanned for, and determines search result corresponding with the second speech recognition result, and using corresponding search result as second Recognition result.
It for the embodiment of the present invention, is identified by the second voice messaging to user, and according to the second obtained language Sound recognition result carries out corresponding search operaqtion, the second recognition result of the second voice messaging is obtained, so as to according to user's Voice command provides the coordinate indexing result information being consistent with user demand.
Specifically, the recognition unit 3041 is configured to perform speech recognition on the first voice information to obtain the first speech recognition result;
the search unit 3042 is configured to search the speech recognition corpus for the first speech recognition result, obtain the search result corresponding to the first speech recognition result, and take that search result as the first recognition result.
The first speech recognition result and the second speech recognition result are both syllable sequences,
and the speech recognition corpus established according to the second speech recognition result is a syllable sequence corpus.
In this embodiment of the present invention, the established speech recognition corpus confines the recognition result of the first voice information, which the user utters in response to the interaction information, to the scope of that corpus. This narrows the range of results the voice information may correspond to and thereby improves the accuracy of speech recognition.
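As a rough illustration of why confining recognition to a small syllable sequence corpus helps, the sketch below matches a first speech recognition result against a corpus built from previously displayed entries. The text does not state how the corpus is searched; the edit-distance matching used here is a swapped-in technique chosen for illustration, and the names `edit_distance`, `build_corpus`, `recognize_in_corpus`, and the pinyin-like sample syllables are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

CorpusEntry = Tuple[Tuple[str, ...], str]  # (syllable sequence, display text)


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance computed over syllables rather than characters."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_corpus(entries: Dict[str, List[str]]) -> List[CorpusEntry]:
    """Build a syllable sequence corpus from display text -> syllables (assumed input shape)."""
    return [(tuple(syllables), text) for text, syllables in entries.items()]


def recognize_in_corpus(
    first_result: Sequence[str], corpus: List[CorpusEntry]
) -> Optional[str]:
    """Return the corpus entry closest to the first speech recognition result."""
    if not corpus:
        return None
    best = min(corpus, key=lambda item: edit_distance(first_result, item[0]))
    return best[1]


corpus = build_corpus(
    {
        "Beijing weather": ["bei", "jing", "tian", "qi"],
        "Beijing roast duck": ["bei", "jing", "kao", "ya"],
    }
)
# One syllable is misrecognized ("jin" instead of "jing").
answer = recognize_in_corpus(["bei", "jin", "kao", "ya"], corpus)
```

Because only two candidates exist, the slightly misrecognized sequence still resolves to the intended entry, which is the narrowing effect the paragraph above describes.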
The speech recognition apparatus based on interaction information of this embodiment can perform the speech recognition method based on interaction information provided in the foregoing embodiments of the present invention. The implementation principle is similar and is not repeated here.
An embodiment of the present invention provides a speech recognition apparatus based on interaction information. In the prior art, the text with the highest probability for the voice information to be recognized is taken as the speech recognition result. By contrast, this embodiment acquires the interaction information corresponding to the voice interaction process between the user and the terminal device, establishes a speech recognition corpus according to that interaction information, and, upon receiving first voice information input by the user based on the interaction information, performs speech recognition on the first voice information by means of the speech recognition corpus to determine a first recognition result. In other words, the recognition result for the voice information to be recognized is confined to the scope of the speech recognition corpus established from the interaction information. This narrows the range of recognition results the voice information may correspond to, improves the accuracy of speech recognition, and in turn improves the user experience.
An embodiment of the present invention provides an electronic device. As shown in Fig. 4, the electronic device 40 includes:
a processor 41, a memory 42, a communication interface 43, and a bus 44;
wherein the processor 41, the memory 42, and the communication interface 43 communicate with one another through the bus 44;
the communication interface 43 is used for information transfer between the electronic device 40 and the communication devices of related equipment;
the processor 41 is configured to call program instructions in the memory 42 to implement the functions of the acquisition module, the establishing module, the receiving module, and the identification module shown in Fig. 2 or Fig. 3, and of the display module 305 shown in Fig. 3.
The processor 41 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 41 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 44 may include a path for transferring information between the above components. The bus 44 may be a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by a single thick line in Fig. 4, but this does not mean that there is only one bus or only one type of bus.
The memory 42 may be a ROM or another type of static storage device capable of storing static information and instructions, a RAM or another type of dynamic storage device capable of storing information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
Specifically, the memory 42 is used to store the application program code for executing the solution of the present application, and execution is controlled by the processor 41. The processor 41 is configured to execute the application program code stored in the memory 42 to implement the actions of the speech recognition apparatus based on interaction information provided by the embodiment shown in Fig. 2 or Fig. 3.
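To make the division of labor between the modules more concrete, the following sketch wires the acquisition, establishing, receiving, identification, and display roles into one small class, as program code that a processor might execute. It is a simplified sketch under stated assumptions: the class name `InteractiveRecognizer`, its method names, the text-based stand-in recognizer, and the substring matching are all hypothetical, and the corpus is assumed to be built from the displayed second recognition result.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class InteractiveRecognizer:
    recognize: Callable[[bytes], str]   # hypothetical speech recognizer
    search: Callable[[str], List[str]]  # hypothetical general-library search
    corpus: List[str] = field(default_factory=list)
    shown: List[str] = field(default_factory=list)

    def handle_second_voice(self, audio: bytes) -> List[str]:
        """Receive the second voice information, recognize it, search, display, build corpus."""
        results = self.search(self.recognize(audio))  # identification module
        self.shown = results                          # display module
        self.corpus = list(results)                   # acquisition + establishing modules
        return results

    def handle_first_voice(self, audio: bytes) -> Optional[str]:
        """Recognize the first voice information only within the established corpus."""
        heard = self.recognize(audio).lower()
        for entry in self.corpus:
            if heard in entry.lower():
                return entry
        return None


LIBRARY = ["Beijing weather", "Beijing roast duck", "Shanghai weather"]
device = InteractiveRecognizer(
    recognize=lambda audio: audio.decode("utf-8"),
    search=lambda query: [e for e in LIBRARY if query.lower() in e.lower()],
)
shown = device.handle_second_voice(b"beijing")
picked = device.handle_first_voice(b"roast duck")
```

In this sketch a follow-up utterance such as "shanghai" returns `None`, because that entry was never displayed and so never entered the corpus.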
The electronic device provided by this embodiment of the present invention is applicable to the foregoing method embodiments, and details are not repeated here.
An embodiment of the present invention provides an electronic device. In the prior art, the text with the highest probability for the voice information to be recognized is taken as the speech recognition result. By contrast, this embodiment acquires the interaction information corresponding to the voice interaction process between the user and the terminal device, establishes a speech recognition corpus according to that interaction information, and, upon receiving first voice information input by the user based on the interaction information, performs speech recognition on the first voice information by means of the speech recognition corpus to determine a first recognition result. In other words, the recognition result for the voice information to be recognized is confined to the scope of the speech recognition corpus established from the interaction information. This narrows the range of recognition results the voice information may correspond to, improves the accuracy of speech recognition, and in turn improves the user experience.
An embodiment of the present invention provides a non-transitory computer-readable storage medium. The non-transitory computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to perform the speech recognition method based on interaction information of any one of the foregoing embodiments.
The non-transitory computer-readable storage medium provided by this embodiment of the present invention is applicable to the foregoing method embodiments, and details are not repeated here.
An embodiment of the present invention provides a non-transitory computer-readable storage medium. In the prior art, the text with the highest probability for the voice information to be recognized is taken as the speech recognition result. By contrast, this embodiment acquires the interaction information corresponding to the voice interaction process between the user and the terminal device, establishes a speech recognition corpus according to that interaction information, and, upon receiving first voice information input by the user based on the interaction information, performs speech recognition on the first voice information by means of the speech recognition corpus to determine a first recognition result. In other words, the recognition result for the voice information to be recognized is confined to the scope of the speech recognition corpus established from the interaction information. This narrows the range of recognition results the voice information may correspond to, improves the accuracy of speech recognition, and in turn improves the user experience.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, commodity, or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit the present application. For those skilled in the art, various modifications and variations of the present application are possible. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (10)

1. A speech recognition method based on interaction information, characterized by comprising:
for a voice interaction process between a user and a terminal device, acquiring interaction information corresponding to the voice interaction process;
establishing a speech recognition corpus according to the interaction information;
upon receiving first voice information input by the user based on the interaction information, performing speech recognition on the first voice information by means of the speech recognition corpus to determine a first recognition result.
2. The speech recognition method according to claim 1, characterized in that acquiring, for the voice interaction process between the user and the terminal device, the interaction information corresponding to the voice interaction process comprises:
upon receiving second voice information input by the user, performing speech recognition on the second voice information to determine a second recognition result, and taking the second recognition result as the interaction information;
after the second recognition result is determined, the method further comprises:
displaying the second recognition result;
the first voice information input by the user based on the interaction information comprises:
first voice information input by the user based on the second recognition result.
3. The speech recognition method according to claim 2, characterized in that establishing the speech recognition corpus according to the interaction information comprises:
establishing the speech recognition corpus according to the second speech recognition result.
4. The speech recognition method according to claim 2, characterized in that, when the voice interaction process is a voice search interaction process, performing speech recognition on the second voice information to determine the second recognition result comprises:
performing speech recognition on the second voice information to obtain a second speech recognition result;
searching a general search library according to the second speech recognition result, determining the search result corresponding to the second speech recognition result, and taking the corresponding search result as the second recognition result.
5. The speech recognition method according to claim 1, characterized in that performing speech recognition on the first voice information by means of the speech recognition corpus to determine the first recognition result comprises:
performing speech recognition on the first voice information to obtain a first speech recognition result;
searching the speech recognition corpus for the first speech recognition result, obtaining the search result corresponding to the first speech recognition result, and taking the corresponding search result as the first recognition result.
6. The speech recognition method according to claim 4 or 5, characterized in that the first speech recognition result and the second speech recognition result are syllable sequences;
wherein the speech recognition corpus established according to the second speech recognition result is a syllable sequence corpus.
7. A speech recognition apparatus based on interaction information, characterized by comprising:
an acquisition module, configured to acquire, for a voice interaction process between a user and a terminal device, interaction information corresponding to the voice interaction process;
an establishing module, configured to establish a speech recognition corpus according to the interaction information acquired by the acquisition module;
a receiving module, configured to receive first voice information input by the user based on the interaction information acquired by the acquisition module;
an identification module, configured to perform speech recognition on the first voice information received by the receiving module by means of the speech recognition corpus established by the establishing module, to determine a first recognition result.
8. The speech recognition apparatus according to claim 7, characterized in that
the receiving module is specifically configured to receive second voice information input by the user;
the identification module is specifically configured to perform speech recognition on the second voice information received by the receiving module to determine a second recognition result;
the acquisition module is specifically configured to take the second recognition result determined by the identification module as the interaction information;
the apparatus further comprises a display module;
the display module is configured to display the second recognition result;
the first voice information input by the user based on the interaction information comprises:
first voice information input by the user based on the second recognition result.
9. An electronic device, characterized by comprising:
a processor, a memory, a communication interface, and a bus;
wherein the processor, the memory, and the communication interface communicate with one another through the bus;
the communication interface is used for information transfer between the electronic device and the communication devices of related equipment;
the processor is configured to call program instructions in the memory to perform the speech recognition method according to claims 1 to 6.
10. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to perform the speech recognition method according to any one of claims 1 to 6.
CN201811126926.XA 2018-09-26 2018-09-26 Audio recognition method, device, electronic equipment and computer readable storage medium Pending CN109166581A (en)

Publications (1)

Publication Number Publication Date
CN109166581A true CN109166581A (en) 2019-01-08

Family

ID=64880397

Country Status (1)

Country Link
CN (1) CN109166581A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130006604A1 (en) * 2011-06-28 2013-01-03 International Business Machines Corporation Cross-lingual audio search
US20140236572A1 (en) * 2013-02-20 2014-08-21 Jinni Media Ltd. System Apparatus Circuit Method and Associated Computer Executable Code for Natural Language Understanding and Semantic Content Discovery
CN106128453A (en) * 2016-08-30 2016-11-16 深圳市容大数字技术有限公司 The Intelligent Recognition voice auto-answer method of a kind of robot and robot
CN107305768A (en) * 2016-04-20 2017-10-31 上海交通大学 Easy wrongly written character calibration method in interactive voice

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110517675A (en) * 2019-08-08 2019-11-29 出门问问信息科技有限公司 Exchange method, device, storage medium and electronic equipment based on speech recognition
CN110517675B (en) * 2019-08-08 2021-12-03 出门问问信息科技有限公司 Interaction method and device based on voice recognition, storage medium and electronic equipment
CN111199730A (en) * 2020-01-08 2020-05-26 北京松果电子有限公司 Voice recognition method, device, terminal and storage medium
CN112927570A (en) * 2021-02-23 2021-06-08 京东方科技集团股份有限公司 Interaction method, interaction device, computer equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190108