CN107845384A - Speech recognition method - Google Patents
Speech recognition method
- Publication number
- CN107845384A (application CN201711033275.5A)
- Authority
- CN
- China
- Prior art keywords
- information
- user terminal
- service server
- dictation
- sent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Embodiments of the present invention relate to a speech recognition method. The method includes: a user terminal acquires a voice signal input by a user and sends it to a first service server; the first service server parses the voice signal, generates dictation result information and sends it to the user terminal; the user terminal matches the dictation result information against application service information prestored in the user terminal and determines whether application service information matching the dictation result information exists; if it exists, the user terminal jumps to the corresponding application service according to the application service information, for the user to use; if it does not exist, the user terminal sends the dictation result information to a second service server; the second service server parses the dictation result information, generates parsing result information and sends it to the user terminal; the user terminal loads the parsing result information and displays it as a result display page. The present invention improves speech recognition speed and makes electronic devices simpler and more convenient to use.
Description
Technical field
The present invention relates to the field of communication technologies, and in particular to a speech recognition method.
Background art
Speech technology is the most natural and convenient human-machine interface. With the continuous development of computer technology, and in particular the rapid development of computer hardware, it has entered the field of view of researchers and has received widespread attention.
When an electronic product is used, voice input often replaces handwriting or keyboard input for convenience: after the device recognizes the voice, it outputs the text content of the speech or executes the corresponding operation instruction. At present, the voice assistants of the prior art convert speech into text through both server processing and local processing, which is slow and cannot meet the ever-higher requirements of users.
Summary of the invention
The object of the present invention is to address the above problem of the prior art by providing a speech recognition method that improves speech recognition speed and makes an electronic device simpler and more convenient to use.
To achieve the above object, the present invention provides a speech recognition method, the method comprising:
a user terminal acquires a voice signal input by a user, and sends the voice signal to a first service server;
the first service server parses the voice signal, generates dictation result information, and sends it to the user terminal;
the user terminal matches the dictation result information against application service information prestored in the user terminal, and determines whether application service information matching the dictation result information exists;
if it exists, the user terminal jumps to the corresponding application service according to the application service information, for the user to use;
if it does not exist, the user terminal sends the dictation result information to a second service server;
the second service server parses the dictation result information, generates parsing result information, and sends it to the user terminal; the parsing result information includes query content information, related application service information and download link information;
the user terminal loads the parsing result information and displays it as a result display page.
Preferably, the first service server parsing the voice signal, generating the dictation result information and sending it to the user terminal specifically includes:
the first service server receives the voice signal, and extracts characteristic information of the voice signal;
using a speech recognition database of the first service server, the first service server recognizes the characteristic information and generates the dictation result information;
the first service server sends the dictation result information to the user terminal.
Preferably, the second service server parsing the dictation result information, generating the parsing result information and sending it to the user terminal specifically includes:
the second service server receives and parses the dictation result information;
according to the dictation result information, the second service server matches query content information and related application service information in the second service server, and extracts download link information corresponding to the related application service information;
according to the query content information, the related application service information and the corresponding download link information, the second service server generates the parsing result information and sends it to the user terminal.
Further preferably, the second service server generating the parsing result information according to the query content information, the related application service information and the corresponding download link information specifically includes:
according to the degree of correlation between the dictation result information and the query content information, the second service server generates sorting indicator information and loads it into attribute information of the query content information;
according to the degree of correlation between the dictation result information and the related application service information, the second service server generates the sorting indicator information and loads it into attribute information of the related application service information and of the corresponding download link information;
the second service server generates the parsing result information according to the query content information, the related application service information and the corresponding download link information after the attribute information has been added.
Preferably, the user terminal loading the parsing result information and displaying it as a result display page specifically includes:
the user terminal receives the parsing result information;
the user terminal identifies the parsing result information, classifies the query content information, the related application service information and the download link information, and loads and displays them as the result display page.
The speech recognition method provided by the embodiments of the present invention solves the problem that a voice assistant must go through both server processing and local processing to convert speech into text: speech data that satisfies a predetermined condition is identified, and for that part of the data the voice interaction can be completed through local processing alone. This improves speech recognition speed and allows functions of interest to be accessed quickly through voice interaction, so that operating an electronic device becomes simpler and more convenient and the user's needs are met.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the speech recognition method provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
The speech recognition method provided by the embodiments of the present invention improves speech recognition speed, makes operating an electronic device simpler and more convenient, and meets the user's needs.
Fig. 1 is a schematic flowchart of the speech recognition method provided by an embodiment of the present invention. The speech recognition method provided by the embodiment of the present invention is described below with reference to Fig. 1.
Step 101: the user terminal acquires the voice signal input by the user, and sends the voice signal to the first service server.
In a specific implementation, the user terminal is the electronic device used by the user, and the first service server is a speech-to-text server, i.e. a device that converts voice information into text information and responds to and processes requests from the user terminal. When the user interacts by voice, a microphone records the user's voice; the user terminal acquires the voice signal (an audio signal) collected by the microphone and sends it to the first service server, i.e. the speech-to-text server, so that the voice signal can be processed.
Step 102: the first service server parses the voice signal, generates the dictation result information, and sends it to the user terminal.
Specifically, the first service server receives the voice signal and extracts the characteristic information of the voice signal; using the speech recognition database of the first service server, the first service server recognizes the characteristic information and generates the dictation result information; the first service server then sends the dictation result information to the user terminal.
In a specific implementation, the first service server receives the voice signal sent by the user terminal, parses it, and extracts its characteristic information. The first service server recognizes the extracted features against the speech recognition database, thereby converting the voice signal (audio information) into the dictation result information (text information), and sends it to the user terminal.
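As an illustration only, the role of the first service server in steps 101 and 102 might look like the sketch below. The energy-based feature extraction and the in-memory lookup table standing in for the speech recognition database are assumptions made for clarity; the patent does not specify how features are extracted or how the database is organized.

```python
# Hypothetical sketch of the first service server (speech-to-text) in steps 101-102.
# The feature extraction and the in-memory "speech recognition database" are
# placeholders; a real deployment would use an actual ASR engine.
import array

def extract_features(audio_bytes: bytes, frame_size: int = 320) -> list[float]:
    """Extract simple per-frame energy values as the 'characteristic information'."""
    samples = array.array("h", audio_bytes[: len(audio_bytes) // 2 * 2])
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    return [sum(s * s for s in f) / max(len(f), 1) for f in frames]

# Stand-in for the speech recognition database of the first service server.
SPEECH_RECOGNITION_DB = {
    "pattern_open_map": "open the map application",
    "pattern_nearby_sights": "tourist attractions nearby",
}

def recognize(features: list[float]) -> str:
    """Map extracted features to dictation result text (placeholder lookup)."""
    key = "pattern_nearby_sights" if features and max(features) > 1e6 else "pattern_open_map"
    return SPEECH_RECOGNITION_DB[key]

def handle_voice_signal(audio_bytes: bytes) -> dict:
    """Step 102: parse the voice signal and return dictation result information."""
    features = extract_features(audio_bytes)
    return {"dictation_result": recognize(features)}
```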
Step 103: the user terminal matches the dictation result information against the application service information prestored in the user terminal, and determines whether application service information matching the dictation result information exists.
If application service information matching the dictation result information exists, the method proceeds to step 104; otherwise, it proceeds to step 105.
In a specific implementation, the user terminal receives the dictation result information sent by the first service server and parses it; at the same time, it obtains the prestored application service information and matches the dictation result information against that application service information, to determine whether application service information matching the dictation result information exists. If it exists, no further server processing is needed; if it does not exist, further server processing is required.
When application service information matching the dictation result information exists:
Step 104: the user terminal jumps to the corresponding application service according to the application service information, for the user to use.
If application service information matching the dictation result information exists, the application service the user is interested in is already present locally in the user terminal, so only local processing is needed: the user terminal jumps directly to the corresponding application service according to the matched application service information, for the user to use, and the current voice interaction ends.
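The local matching of steps 103 and 104 can be pictured with the following minimal sketch. The prestored table, the keyword-based match and the `jump_to` helper are hypothetical stand-ins; the patent does not define the format of the application service information.

```python
# Hypothetical sketch of steps 103-104: match the dictation result against the
# application service information prestored in the user terminal and, on a hit,
# jump directly to that application service (local processing only).
PRESTORED_APP_SERVICES = {
    "weather": "app://weather/home",
    "music": "app://music/player",
    "health consultation": "app://health/consult",
}

def jump_to(service_uri: str) -> None:
    # Placeholder for the terminal's page/app navigation.
    print(f"jumping to {service_uri}")

def handle_dictation_result(dictation_result: str) -> bool:
    """Return True if the interaction was completed locally (step 104)."""
    text = dictation_result.lower()
    for keyword, service_uri in PRESTORED_APP_SERVICES.items():
        if keyword in text:          # matching application service information exists
            jump_to(service_uri)     # step 104: jump, end this voice interaction
            return True
    return False                     # step 105 will send the text to the second server
```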
When no application service information matching the dictation result information exists:
Step 105: the user terminal sends the dictation result information to the second service server.
The second service server referred to here is a semantic parsing server. If no application service information matching the dictation result information exists, the content the user is interested in is not present locally in the user terminal, and a query through the semantic parsing server is needed. The user terminal therefore first sends the dictation result information to the second service server, and processing continues as follows.
Step 106: the second service server parses the dictation result information, generates the parsing result information, and sends it to the user terminal.
The parsing result information includes the query content information, the related application service information and the download link information. Specifically, the second service server receives and parses the dictation result information; according to the dictation result information, the second service server matches the query content information and the related application service information in the second service server, and extracts the download link information corresponding to the related application service information. Preferably, according to the degree of correlation between the dictation result information and the query content information, the second service server generates sorting indicator information and loads it into the attribute information of the query content information; according to the degree of correlation between the dictation result information and the related application service information, the second service server generates the sorting indicator information and loads it into the attribute information of the related application service information and of the corresponding download link information; the second service server then generates the parsing result information according to the query content information, the related application service information and the corresponding download link information after the attribute information has been added.
To make browsing easier for the user, the multiple pieces of query content information and related application service information found by the query need to be sorted. According to the degree of correlation between the dictation result information and the query content information, the second service server generates the sorting indicator information and adds it to the attribute information of the query content information; when the results are displayed, the query content information can be sorted from high to low correlation according to this attribute information. Similarly, the second service server can sort the related application service information and its corresponding download link information in the same way, so that the user can find the content of interest more quickly.
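A minimal sketch of how the second service server might attach sorting indicator information to the attribute information and order results by correlation is shown below. The word-overlap correlation measure is an assumption made for illustration, since the patent does not specify how the degree of correlation is computed.

```python
# Hypothetical sketch of step 106: attach a sorting indicator to each result's
# attribute information according to its degree of correlation with the
# dictation result, then sort from high to low correlation.
def correlation(dictation_result: str, text: str) -> float:
    """Toy degree-of-correlation: fraction of shared words (illustrative only)."""
    a, b = set(dictation_result.lower().split()), set(text.lower().split())
    return len(a & b) / max(len(a), 1)

def build_parsing_result(dictation_result, query_contents, related_app_services):
    """query_contents: list of str; related_app_services: list of (name, download_link)."""
    contents = [
        {"content": c, "attributes": {"sort_indicator": correlation(dictation_result, c)}}
        for c in query_contents
    ]
    apps = [
        {"app_service": name, "download_link": link,
         "attributes": {"sort_indicator": correlation(dictation_result, name)}}
        for name, link in related_app_services
    ]
    contents.sort(key=lambda x: x["attributes"]["sort_indicator"], reverse=True)
    apps.sort(key=lambda x: x["attributes"]["sort_indicator"], reverse=True)
    return {"query_content": contents, "related_app_services": apps}
```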
Step 107: the user terminal loads the parsing result information and displays it as a result display page.
Specifically, the user terminal receives the parsing result information; the user terminal identifies the parsing result information, classifies the query content information, the related application service information and the download link information, and loads and displays them as the result display page.
In a specific implementation, the user terminal receives the sorted parsing result information and displays the query content information, the related application service information and the download link information in separate categories. The query content information is information related to the voice signal input by the user; for example, if the user wants to look up nearby tourist attractions, the query content information will be the introductions of nearby attractions. The related application service information is information about the web version of the application service the user wants to query, and the download link information is the download link of that application service.
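The classified display of step 107 could then look roughly like the sketch below, which consumes the structure produced by the previous sketch. The section names and plain-text rendering are illustrative only; the patent does not define the layout of the result display page.

```python
# Hypothetical sketch of step 107: the user terminal classifies the parsing
# result information and renders it as a simple text "result display page".
def render_result_page(parsing_result: dict) -> str:
    lines = ["== Query content =="]
    lines += [item["content"] for item in parsing_result["query_content"]]
    lines.append("== Related application services ==")
    lines += [
        f'{item["app_service"]} (download: {item["download_link"]})'
        for item in parsing_result["related_app_services"]
    ]
    return "\n".join(lines)
```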
To better understand the above process, a specific example is given.
The user terminal also provides certain value-added services, such as a health consultation service; the first service server is a speech-to-text server, and the second service server is a semantic parsing server.
The user opens the user terminal and long-presses the microphone icon on the application's main screen to activate the voice input function. The application's microphone is then in recording state and records the user's voice; when recording is complete, the speech is converted into text using speech recognition technology. The user terminal first performs matching on the converted text. If the text relates to opening an application, the terminal determines whether the application the user wants is installed on the electronic device; if so, it opens the application and ends the current voice interaction. If not, it determines whether the user terminal provides a related specific service; if so, it opens the specified page for the user to use. If not, the recognized content is handed over to the semantic parsing server, which returns its processing result to the user terminal; the user terminal then displays content or jumps to a page according to the type of the returned result, and ends the current voice interaction. After the voice interaction ends, the user can use the corresponding function or obtain the related information according to the returned result.
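Putting the example together, the terminal-side decision flow might be sketched as follows. The helpers `speech_to_text`, `installed_apps`, `local_services`, `semantic_parse` and `render` are hypothetical stand-ins for the speech-to-text server, the locally installed applications, the terminal's value-added services and the semantic parsing server.

```python
# Hypothetical end-to-end sketch of the example above: local matching first,
# then fall back to the semantic parsing server (second service server).
def voice_interaction(audio_bytes, speech_to_text, installed_apps,
                      local_services, semantic_parse, render):
    text = speech_to_text(audio_bytes)            # steps 101-102
    if text in installed_apps:                    # matching application installed
        installed_apps[text]()                    # open it, end the interaction
        return
    if text in local_services:                    # related value-added service
        local_services[text]()                    # open the specified page
        return
    parsing_result = semantic_parse(text)         # steps 105-106
    render(parsing_result)                        # step 107: result display page
```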
The speech recognition method provided by the embodiments of the present invention solves the problem that a voice assistant must go through both server processing and local processing to convert speech into text: part of the speech data can complete the voice interaction through local processing alone. This improves speech recognition speed and allows functions of interest to be accessed quickly through voice interaction, so that operating an electronic device becomes simpler and more convenient and the user's needs are met.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above embodiments further describe the objects, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing is only an embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (5)
1. A speech recognition method, characterized in that the method comprises:
a user terminal acquires a voice signal input by a user, and sends the voice signal to a first service server;
the first service server parses the voice signal, generates dictation result information and sends it to the user terminal;
the user terminal matches the dictation result information against application service information prestored in the user terminal, and determines whether application service information matching the dictation result information exists;
if it exists, the user terminal jumps to the corresponding application service according to the application service information, for the user to use;
if it does not exist, the user terminal sends the dictation result information to a second service server;
the second service server parses the dictation result information, generates parsing result information and sends it to the user terminal, wherein the parsing result information comprises query content information, related application service information and download link information;
the user terminal loads the parsing result information and displays it as a result display page.
2. The speech recognition method according to claim 1, characterized in that the first service server parsing the voice signal, generating the dictation result information and sending it to the user terminal specifically comprises:
the first service server receives the voice signal, and extracts characteristic information of the voice signal;
using a speech recognition database of the first service server, the first service server recognizes the characteristic information and generates the dictation result information;
the first service server sends the dictation result information to the user terminal.
3. The speech recognition method according to claim 1, characterized in that the second service server parsing the dictation result information, generating the parsing result information and sending it to the user terminal specifically comprises:
the second service server receives and parses the dictation result information;
according to the dictation result information, the second service server matches the query content information and the related application service information in the second service server, and extracts the download link information corresponding to the related application service information;
according to the query content information, the related application service information and the corresponding download link information, the second service server generates the parsing result information and sends it to the user terminal.
4. The speech recognition method according to claim 3, characterized in that the second service server generating the parsing result information according to the query content information, the related application service information and the corresponding download link information specifically comprises:
according to a degree of correlation between the dictation result information and the query content information, the second service server generates sorting indicator information and loads it into attribute information of the query content information;
according to a degree of correlation between the dictation result information and the related application service information, the second service server generates the sorting indicator information and loads it into attribute information of the related application service information and of the corresponding download link information;
the second service server generates the parsing result information according to the query content information, the related application service information and the corresponding download link information after the attribute information has been added.
5. The speech recognition method according to claim 1, characterized in that the user terminal loading the parsing result information and displaying it as a result display page specifically comprises:
the user terminal receives the parsing result information;
the user terminal identifies the parsing result information, classifies the query content information, the related application service information and the download link information, and loads and displays them as the result display page.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711033275.5A CN107845384A (en) | 2017-10-30 | 2017-10-30 | A kind of audio recognition method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711033275.5A CN107845384A (en) | 2017-10-30 | 2017-10-30 | A kind of audio recognition method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107845384A true CN107845384A (en) | 2018-03-27 |
Family
ID=61680953
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711033275.5A Pending CN107845384A (en) | 2017-10-30 | 2017-10-30 | A kind of audio recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107845384A (en) |
- 2017-10-30: Application CN201711033275.5A filed in China; published as CN107845384A (status: Pending)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120179463A1 (en) * | 2011-01-07 | 2012-07-12 | Nuance Communications, Inc. | Configurable speech recognition system using multiple recognizers |
CN103177104A (en) * | 2013-03-26 | 2013-06-26 | 北京小米科技有限责任公司 | Searching method and device of application program |
CN104462262A (en) * | 2014-11-21 | 2015-03-25 | 北京奇虎科技有限公司 | Method and device for achieving voice search and browser client side |
CN106101789A (en) * | 2016-07-06 | 2016-11-09 | 深圳Tcl数字技术有限公司 | The voice interactive method of terminal and device |
CN106446265A (en) * | 2016-10-18 | 2017-02-22 | 江西博瑞彤芸科技有限公司 | Question inquiry display method for intelligent terminal |
CN107153965A (en) * | 2017-04-05 | 2017-09-12 | 芜湖恒天易开软件科技股份有限公司 | A kind of intelligent customer service solution of multiple terminals |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109545214A (en) * | 2018-12-26 | 2019-03-29 | 苏州思必驰信息科技有限公司 | Message distributing method and device based on voice interactive system |
CN109918591A (en) * | 2019-03-01 | 2019-06-21 | 北京猎户星空科技有限公司 | Using adding method, device, electronic equipment and storage medium |
CN110058916A (en) * | 2019-04-23 | 2019-07-26 | 深圳创维数字技术有限公司 | A kind of phonetic function jump method, device, equipment and computer storage medium |
CN112786022A (en) * | 2019-11-11 | 2021-05-11 | 青岛海信移动通信技术股份有限公司 | Terminal, first voice server, second voice server and voice recognition method |
CN112786022B (en) * | 2019-11-11 | 2023-04-07 | 青岛海信移动通信技术股份有限公司 | Terminal, first voice server, second voice server and voice recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180327 |