CN109584865A - Application program control method and apparatus, readable storage medium, and terminal device

Info

Publication number
CN109584865A
Authority
CN
China
Prior art keywords
control instruction
keyword
word
voice information
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811210044.1A
Other languages
Chinese (zh)
Inventor
董亚荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811210044.1A
Publication of CN109584865A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention belongs to the field of computer technology, and in particular relates to an application program control method and apparatus, a computer-readable storage medium, and a terminal device. After receiving a voice collection instruction, the method collects the voice information input by a user, performs speech recognition on the collected voice information to obtain the corresponding text information, determines the target control instruction for the application program by calculating matching degrees, and controls the application program to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue control instructions to an application program by voice and the application program executes the corresponding operation automatically, which simplifies operation, greatly improves efficiency, and gives the user a better experience.

Description

Application program control method and apparatus, readable storage medium, and terminal device
Technical field
The invention belongs to the field of computer technology, and in particular relates to an application program control method and apparatus, a computer-readable storage medium, and a terminal device.
Background
As technology develops, more and more enterprises are adopting electronic office systems. Users can submit leave requests, business-trip requests, reimbursement requests, out-of-office requests and the like directly in an office application, which greatly improves efficiency compared with traditional paper forms. However, existing office applications are still cumbersome to operate: the user has to click and search repeatedly before the desired function option opens, which is time-consuming and laborious and makes for a poor user experience.
Summary of the invention
In view of this, embodiments of the invention provide an application program control method and apparatus, a computer-readable storage medium, and a terminal device, to solve the problem that existing office applications are cumbersome to operate and give a poor user experience.
A first aspect of the embodiments of the invention provides an application program control method, which may include:
after receiving a voice collection instruction, collecting the voice information input by a user, the voice information containing a control instruction for an application program;
performing speech recognition on the collected voice information to obtain text information corresponding to the voice information;
calculating the matching degree between the text information and each control instruction in a preset control instruction set;
selecting, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program, and controlling the application program to execute the operation corresponding to the target control instruction.
A second aspect of the embodiments of the invention provides an application program control apparatus, which may include:
a voice information collection module, configured to collect the voice information input by a user after receiving a voice collection instruction, the voice information containing a control instruction for an application program;
a speech recognition module, configured to perform speech recognition on the collected voice information to obtain text information corresponding to the voice information;
a matching degree calculation module, configured to calculate the matching degree between the text information and each control instruction in a preset control instruction set;
a target control instruction selection module, configured to select, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program;
an operation execution module, configured to control the application program to execute the operation corresponding to the target control instruction.
A third aspect of the embodiments of the invention provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps:
after receiving a voice collection instruction, collecting the voice information input by a user, the voice information containing a control instruction for an application program;
performing speech recognition on the collected voice information to obtain text information corresponding to the voice information;
calculating the matching degree between the text information and each control instruction in a preset control instruction set;
selecting, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program, and controlling the application program to execute the operation corresponding to the target control instruction.
A fourth aspect of the embodiments of the invention provides a terminal device comprising a memory, a processor, and computer-readable instructions that are stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
after receiving a voice collection instruction, collecting the voice information input by a user, the voice information containing a control instruction for an application program;
performing speech recognition on the collected voice information to obtain text information corresponding to the voice information;
calculating the matching degree between the text information and each control instruction in a preset control instruction set;
selecting, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program, and controlling the application program to execute the operation corresponding to the target control instruction.
Compared with the prior art, the embodiments of the invention have the following beneficial effects: after receiving a voice collection instruction, an embodiment collects the voice information input by the user, performs speech recognition on the collected voice information to obtain the corresponding text information, determines the target control instruction for the application program by calculating matching degrees, and controls the application program to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue control instructions to an application program by voice and the application program executes the corresponding operation automatically, which simplifies operation, greatly improves efficiency, and gives the user a better experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of an application program control method in the embodiments of the invention;
Fig. 2 is a schematic flowchart of calculating the matching degree between the text information and each control instruction in the preset control instruction set;
Fig. 3 is a schematic flowchart of calculating the voiceprint feature vector of the voice information;
Fig. 4 is a structural diagram of one embodiment of an application program control apparatus in the embodiments of the invention;
Fig. 5 is a schematic block diagram of a terminal device in the embodiments of the invention.
Detailed description of the embodiments
To make the purpose, features, and advantages of the invention clearer and easier to understand, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. Evidently, the embodiments described below are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, one embodiment of an application program control method in the embodiments of the invention may include:
Step S101: after receiving a voice collection instruction, collect the voice information input by the user.
The voice information contains a control instruction for an application program. A voice input button is provided at a designated position in the application program. When the user long-presses the voice input button, a voice collection instruction is issued to the application program, and the application program starts calling the terminal device's built-in microphone to collect the voice information input by the user. When the user releases the button, a voice collection stop instruction is issued to the application program, and the application program stops collecting voice information.
Step S102: perform speech recognition on the collected voice information to obtain text information corresponding to the voice information.
Speech recognition converts a segment of voice information into the corresponding text information. It mainly comprises feature extraction, an acoustic model, a language model, and decoding. In addition, to extract features more effectively, the collected voice information usually needs audio preprocessing such as filtering and framing, so that the audio signal to be analyzed is properly extracted from the original signal.
Feature extraction transforms the voice information from the time domain to the frequency domain and provides suitable feature vectors for the acoustic model.
The acoustic model then computes, from the acoustic characteristics, a score for each feature vector. This embodiment preferably uses hidden Markov model (HMM) acoustic modeling. A Markov model is a discrete-time finite-state automaton; "hidden" means that the internal state of the model is invisible to the outside, which can only observe the output value at each moment. For a speech recognition system, the output values are usually the acoustic features computed from each frame. Modeling voice information with an HMM requires two assumptions: first, that a transition of the internal state depends only on the previous state; second, that the output value depends only on the current state (or the current state transition). These two assumptions greatly reduce the complexity of the model. In speech recognition, recognition units are usually modeled with HMMs that have a left-to-right, unidirectional topology with self-loops and skips: a phoneme is an HMM with three to five states, a word is the HMM formed by concatenating the HMMs of the phonemes that make up the word, and the whole model for continuous speech recognition is an HMM that combines words and silence.
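To make the left-to-right topology concrete, the following is a minimal sketch of scoring a short frame sequence against a three-state HMM with the forward algorithm. The transition matrix, the per-frame emission likelihoods, and the function name `forward_likelihood` are illustrative assumptions and are not taken from the patent.

```python
def forward_likelihood(obs_probs, trans, initial):
    """Forward algorithm: total likelihood of a frame sequence under an HMM.

    obs_probs[t][s] is the likelihood that state s emits the frame at time t,
    trans[i][j] is the probability of moving from state i to state j,
    initial[s] is the probability of starting in state s.
    """
    n_states = len(initial)
    alpha = [initial[s] * obs_probs[0][s] for s in range(n_states)]
    for t in range(1, len(obs_probs)):
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * obs_probs[t][j]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical phoneme model: left-to-right, with self-loops (diagonal)
# and one skip transition (state 0 -> state 2). No backward transitions.
TRANS = [
    [0.6, 0.3, 0.1],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
INITIAL = [1.0, 0.0, 0.0]

# Made-up emission likelihoods for four frames (rows) and three states (columns).
OBS = [
    [0.9, 0.1, 0.1],
    [0.5, 0.6, 0.1],
    [0.1, 0.7, 0.4],
    [0.1, 0.2, 0.8],
]

likelihood = forward_likelihood(OBS, TRANS, INITIAL)
```

The zeros below the diagonal of `TRANS` are what make the model unidirectional: once the model leaves a state it cannot return to it.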
The language model then computes, based on linguistic theory, the probability of each possible word sequence corresponding to the voice information. This embodiment preferably uses an N-gram language model, which assumes that the occurrence of the N-th word depends only on the preceding N-1 words and on no other words, so that the probability of a whole sentence is the product of the occurrence probabilities of its words. These probabilities can be obtained by counting directly in a corpus how often N words occur together; the most commonly used models are the bigram (N = 2) and the trigram (N = 3). Language model performance is usually measured with cross entropy and perplexity. Cross entropy expresses how difficult it is to recognize text with the model or, from a compression point of view, how many bits on average are needed to encode each word. Perplexity expresses the average number of branches the model assigns to the text; its reciprocal can be regarded as the average probability of each word. Smoothing means assigning a probability to N-gram combinations that were not observed, so that every word sequence obtains a probability from the language model.
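As an illustration of counting, smoothing, cross entropy, and perplexity, here is a small bigram sketch. The toy corpus, the sentence boundary markers, and the choice of add-one smoothing are assumptions made for the example; the patent does not say which smoothing method is used.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"


def train_bigram(sentences):
    """Count contexts and bigrams over sentences given as lists of words."""
    contexts, bigrams, vocab = Counter(), Counter(), {END}
    for words in sentences:
        padded = [START] + words + [END]
        vocab.update(words)
        contexts.update(padded[:-1])             # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))
    return contexts, bigrams, vocab


def bigram_prob(prev, word, model):
    """P(word | prev) with add-one smoothing, so unseen pairs get a nonzero value."""
    contexts, bigrams, vocab = model
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


def perplexity(words, model):
    """2 ** cross entropy, where cross entropy is the average bits per token."""
    padded = [START] + words + [END]
    log2_prob = sum(
        math.log2(bigram_prob(p, w, model)) for p, w in zip(padded, padded[1:])
    )
    cross_entropy = -log2_prob / (len(padded) - 1)
    return 2 ** cross_entropy


# Toy corpus (illustrative only).
corpus = [["i", "want", "leave"], ["i", "want", "reimbursement"]]
model = train_bigram(corpus)
seen = perplexity(["i", "want", "leave"], model)
unseen = perplexity(["leave", "want", "i"], model)
```

A word order that appears in the corpus gets a lower perplexity than a scrambled one, which is the property the decoder relies on when ranking candidate word sequences.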
Finally, the word sequence is decoded according to an existing dictionary to obtain the text information corresponding to the voice information.
Step S103: calculate the matching degree between the text information and each control instruction in the preset control instruction set.
The control instruction set may include, but is not limited to, control instructions such as creating a leave request, creating a business-trip request, creating a reimbursement request, and creating an out-of-office request.
As shown in Fig. 2, step S103 may specifically include the following process:
Step S1031: determine the keyword set corresponding to each control instruction, and calculate the classification discrimination degree of each keyword in each keyword set.
First, perform word segmentation on each corpus entry in a preset corpus to obtain the individual words.
The corpus contains a sub-corpus corresponding to each control instruction, and each sub-corpus can be obtained by statistics over large-scale user data. Specifically, the sentences that users habitually use when issuing a given control instruction are collected and added to the sub-corpus corresponding to that control instruction. For example, if the statistics show that user A habitually says "I am going on a business trip" to issue the control instruction for creating a business-trip request, while user B habitually says "Please help me create a business-trip request" to issue the same control instruction, then both sentences are added, as corpus entries, to the sub-corpus corresponding to the control instruction for creating a business-trip request.
Word segmentation means cutting a corpus entry into individual words. In this embodiment, the corpus entries can be segmented according to a general-purpose dictionary, which ensures that the resulting words are all normal vocabulary; a word that is not in the dictionary is split into single characters. When a character can form a word with either its preceding or its following neighbor, as in "要求神", the split is decided by word frequency: if "要求" ("require") has the higher frequency, the result is "要求/神"; if "求神" ("pray to the gods") has the higher frequency, the result is "要/求神".
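The dictionary-based segmentation with a frequency tie-break can be sketched as follows. The use of forward and backward maximum matching is an assumption chosen for illustration (the patent only says that ambiguous splits are decided by word frequency), and Latin letters stand in for characters so that the example is self-contained.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right matching; unknown characters are emitted singly."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len=4):
    """Greedy right-to-left matching; unknown characters are emitted singly."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]


def segment(text, word_freq):
    """Segment text; when the two directions disagree, prefer the split whose
    multi-character words have the higher total frequency."""
    fwd = forward_max_match(text, word_freq)
    bwd = backward_max_match(text, word_freq)
    if fwd == bwd:
        return fwd

    def score(tokens):
        return sum(word_freq.get(t, 0) for t in tokens if len(t) > 1)

    return fwd if score(fwd) >= score(bwd) else bwd


# "b" can join either neighbor; the word frequencies decide the split.
split_left = segment("abc", {"ab": 50, "bc": 10})    # ['ab', 'c']
split_right = segment("abc", {"ab": 10, "bc": 50})   # ['a', 'bc']
```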
Then, count the frequency with which each word occurs in each sub-corpus, and calculate the classification discrimination degree of each word according to the following formula:
where $w$ is the index of a word, $1 \le w \le WordNum$, $WordNum$ is the total number of words, $FreqSeq_w$ is the sequence of frequencies with which the $w$-th word occurs in each sub-corpus, with $FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, \ldots, Freq_{w,c}, \ldots, Freq_{w,ClassNum}]$, $Freq_{w,c}$ is the frequency with which the $w$-th word occurs in the sub-corpus corresponding to the $c$-th control instruction, $FreqSeq'_w$ is the sequence that remains after the maximum value is removed from $FreqSeq_w$, that is, $FreqSeq'_w = FreqSeq_w - MAX(FreqSeq_w)$, $MAX$ is the maximum function, and $ClassDeg_w$ is the classification discrimination degree of the $w$-th word.
Next, select the words whose classification discrimination degree is greater than a preset discrimination threshold as keywords; each keyword corresponds to the control instruction at which $FreqSeq_w$ attains its maximum.
The discrimination threshold can be set according to the actual situation, for example to 5, 10, 20, or another value.
The control instruction corresponding to each keyword can be determined according to the following formula:
$TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, \ldots, Freq_{w,c}, \ldots, Freq_{w,ClassNum})$, where $TgtKwSet_w$ is the index of the control instruction corresponding to the $w$-th keyword.
For example, suppose the word "sick" occurs 1000 times in the sub-corpus corresponding to creating a leave request, 20 times in the sub-corpus corresponding to creating a reimbursement request, and once in the sub-corpus corresponding to creating a business-trip request. Its classification discrimination degree is then:
This classification discrimination degree is greater than the discrimination threshold, so "sick" is determined to be a keyword. Since it occurs most frequently in the sub-corpus corresponding to creating a leave request, it is a keyword for the control instruction of creating a leave request.
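The formula for $ClassDeg_w$ is not reproduced in this text, so the following sketch assumes one plausible form that is consistent with the surrounding definitions: the ratio of the largest frequency to the largest remaining frequency, $MAX(FreqSeq_w) / MAX(FreqSeq'_w)$. That form, the handling of a zero denominator, and the function names are assumptions, not the patent's stated method.

```python
def class_degree(freq_seq):
    """ASSUMED form of ClassDeg_w: MAX(FreqSeq_w) / MAX(FreqSeq'_w),
    where FreqSeq'_w is FreqSeq_w with its maximum removed."""
    ordered = sorted(freq_seq, reverse=True)
    top, runner_up = ordered[0], ordered[1]
    if runner_up == 0:
        # The word occurs in a single sub-corpus only (assumed handling).
        return float("inf")
    return top / runner_up


def select_keywords(freq_table, threshold):
    """freq_table maps word -> FreqSeq_w (one frequency per control instruction).

    Returns word -> (0-based index of its control instruction, ClassDeg_w)
    for every word whose ClassDeg_w exceeds the threshold.
    """
    keywords = {}
    for word, freq_seq in freq_table.items():
        degree = class_degree(freq_seq)
        if degree > threshold:
            target = max(range(len(freq_seq)), key=freq_seq.__getitem__)  # argmax
            keywords[word] = (target, degree)
    return keywords


# Columns: leave request, reimbursement request, business-trip request.
# The "sick" row uses the frequencies from the example; "request" is made up.
freq_table = {
    "sick": [1000, 20, 1],
    "request": [300, 280, 310],
}
keywords = select_keywords(freq_table, threshold=10)
```

Under this assumed ratio, "sick" scores 1000 / 20 = 50 and is kept as a keyword of the first instruction, while a word spread evenly across sub-corpora scores close to 1 and is discarded.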
Finally, the keywords corresponding to the c-th control instruction are assembled into the keyword set corresponding to the c-th control instruction, as shown in the following table:

Control instruction      Keyword set
Control instruction 1    Set 1 = {keyword 1, keyword 2, keyword 3}
Control instruction 2    Set 2 = {keyword 4, keyword 5, keyword 6}
Control instruction 3    Set 3 = {keyword 7, keyword 8}
……                       ……
……                       ……

For example, the keyword set of the control instruction for creating a leave request may include keywords such as "leave", "marriage leave", "sick", and "paternity leave".
Step S1032: count the frequency with which each keyword occurs in the text information.
Step S1033: calculate the matching degree between the text information and each control instruction.
Preferably, the matching degree between the text information and each control instruction can be calculated according to the following formula:
where $c$ is the index of a control instruction, $1 \le c \le ClassNum$, $ClassNum$ is the total number of control instructions, $kn$ is the index of a keyword, $1 \le kn \le KwNum_c$, $KwNum_c$ is the total number of keywords in the keyword set corresponding to the $c$-th control instruction, $MsgKWNum_{c,kn}$ is the frequency with which the $kn$-th keyword in the keyword set corresponding to the $c$-th control instruction occurs in the text information, $ClassDeg_{c,kn}$ is the classification discrimination degree of the $kn$-th keyword in the keyword set corresponding to the $c$-th control instruction, and $MatchDeg_c$ is the matching degree between the text information and the $c$-th control instruction.
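The matching-degree formula is likewise not reproduced in this text. The sketch below assumes the simplest combination of the two quantities that the definitions name, a weighted sum $MatchDeg_c = \sum_{kn} MsgKWNum_{c,kn} \cdot ClassDeg_{c,kn}$. The formula, the keyword sets, and the $ClassDeg$ values are assumptions for illustration only.

```python
from collections import Counter


def match_degrees(tokens, keyword_sets):
    """ASSUMED form: MatchDeg_c = sum over keywords of MsgKWNum * ClassDeg.

    tokens is the segmented text information; keyword_sets[c] maps each
    keyword of control instruction c to its ClassDeg value.
    """
    counts = Counter(tokens)  # MsgKWNum for every word in the text
    return [
        sum(counts[keyword] * degree for keyword, degree in keywords.items())
        for keywords in keyword_sets
    ]


# Hypothetical keyword sets with made-up ClassDeg values.
keyword_sets = [
    {"sick": 50.0, "leave": 30.0},   # create leave request
    {"invoice": 40.0},               # create reimbursement request
    {"trip": 25.0},                  # create business-trip request
]
degrees = match_degrees(["i", "am", "sick", "need", "leave"], keyword_sets)
```

Weighting each hit by its $ClassDeg$ value means that a single highly distinctive keyword can outweigh several hits on weakly distinctive ones.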
Step S104: select, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program.
The target control instruction for the application program can be determined according to the following formula:
$TargetCmd = argmax(MatchDegSeq) = argmax(MatchDeg_1, MatchDeg_2, \ldots, MatchDeg_c, \ldots, MatchDeg_{ClassNum})$
where $MatchDegSeq = (MatchDeg_1, MatchDeg_2, \ldots, MatchDeg_c, \ldots, MatchDeg_{ClassNum})$ is the sequence of matching degrees between the text information and each control instruction, and $TargetCmd$ is the index of the finally determined target control instruction for the application program.
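The argmax selection can be written in a few lines. The function name and the sample values are illustrative; the 1-based result follows the index range $1 \le c \le ClassNum$ used in the text.

```python
def choose_target(match_deg_seq):
    """Return TargetCmd, the 1-based index of the largest matching degree.

    On ties, the lowest index wins (a property of max with a key function).
    """
    best = max(range(len(match_deg_seq)), key=lambda c: match_deg_seq[c])
    return best + 1


target_cmd = choose_target([12.0, 80.0, 0.0])
```

Note that this always returns some instruction, even when every matching degree is zero; whether such input should instead be rejected is a design decision the text does not address.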
Step S105: control the application program to execute the operation corresponding to the target control instruction.
Once the operation the user wants to perform has been determined, the operating steps can be executed automatically. For example, if the voice information issued by the user is "I am sick" and speech recognition and matching degree calculation determine that the user wants to create a leave request, the application program is controlled to open the corresponding operation interface automatically and create a new leave request for the user.
Preferably, to ensure security, the voice information can also be authenticated after it is collected and before speech recognition is performed on it, to prevent other users from impersonating the current user to perform operations.
First, calculate the voiceprint feature vector of the voice information.
As shown in Fig. 3, the calculation of the voiceprint feature vector of the voice information may include:
Step S301: divide the voice information into M voice sub-segments.
Here M is an integer greater than 1 whose value can be set according to the actual situation, for example to 3, 5, 10, or another value.
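Step S301 can be sketched as below. The text does not say how the segment boundaries are chosen, so the near-equal split used here is an assumption, as is the function name.

```python
def split_segments(samples, m):
    """Split a sample sequence into m contiguous, near-equal sub-segments."""
    if m <= 1:
        raise ValueError("M must be an integer greater than 1")
    if len(samples) < m:
        raise ValueError("not enough samples for M sub-segments")
    bounds = [round(k * len(samples) / m) for k in range(m + 1)]
    return [samples[bounds[k]:bounds[k + 1]] for k in range(m)]


# Ten placeholder samples split into M = 3 sub-segments.
segments = split_segments(list(range(10)), 3)
```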
Step S302: calculate the Mel-frequency cepstral coefficient (MFCC) vector of each voice sub-segment.
Preferably, the MFCC vector of each voice sub-segment can be calculated according to the following formula:
$MelVec_m = MFCCFuc(SubVoice_m)$
where $m$ is the index of a voice sub-segment, $1 \le m \le M$, $SubVoice_m$ is the $m$-th voice sub-segment, $MFCCFuc$ is a preset MFCC calculation function, and $MelVec_m$ is the MFCC vector of the $m$-th voice sub-segment, with $MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, \ldots, MelCoe_{m,n}, \ldots, MelCoe_{m,N})$, where $MelCoe_{m,n}$ is the $n$-th MFCC of the $m$-th voice sub-segment.
Step S303: calculate the weight coefficient of each voice sub-segment.
Preferably, the weight coefficient of each voice sub-segment can be calculated according to the following formula:
where $Weight_m$ is the weight coefficient of the $m$-th voice sub-segment.
Step S304: construct the voiceprint feature vector of the voice information.
Preferably, the voiceprint feature vector of the voice information can be constructed according to the following formula:
$VoPrintVec = (VpElem_1, VpElem_2, \ldots, VpElem_n, \ldots, VpElem_N)$
where $VoPrintVec$ is the voiceprint feature vector of the voice information.
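Neither the weight formula nor the definition of $VpElem_n$ is reproduced in this text. The sketch below therefore assumes that each element is a weighted sum over the sub-segments, $VpElem_n = \sum_m Weight_m \cdot MelCoe_{m,n}$, and takes the weights as given inputs. Both the formula and the uniform weights in the example are assumptions.

```python
def voiceprint_vector(mel_vecs, weights):
    """ASSUMED form: VpElem_n = sum over m of Weight_m * MelCoe_{m,n}.

    mel_vecs[m] is MelVec_m (length N); weights[m] is Weight_m.
    """
    if len(mel_vecs) != len(weights):
        raise ValueError("need exactly one weight per voice sub-segment")
    n_coeffs = len(mel_vecs[0])
    return [
        sum(weight * vec[n] for weight, vec in zip(weights, mel_vecs))
        for n in range(n_coeffs)
    ]


# Two sub-segments with two made-up coefficients each, weighted uniformly.
vo_print_vec = voiceprint_vector([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
```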
Then, query a preset database for the reference feature vector corresponding to the user.
The reference feature vector is a voiceprint feature vector extracted in advance from the voice of the user to whom the currently logged-in account belongs. Its calculation is similar to the process described above and is not repeated here.
Next, calculate the similarity between the voiceprint feature vector of the voice information and the reference feature vector according to the following formula:
where $n$ is the index of an element of the voiceprint feature vector of the voice information, $1 \le n \le N$, $N$ is the total number of elements in the voiceprint feature vector of the voice information, $VpElem_n$ is the $n$-th element of the voiceprint feature vector of the voice information, $StVpElem_n$ is the $n$-th element of the reference feature vector, and $SimDeg$ is the similarity between the voiceprint feature vector of the voice information and the reference feature vector.
If the similarity between the voiceprint feature vector of the voice information and the reference feature vector is greater than a preset similarity threshold, the voice information was issued by the user to whom the currently logged-in account belongs, and the step of performing speech recognition on the collected voice information and its subsequent steps are executed. If the similarity is less than or equal to the similarity threshold, the voice information was not issued by the user to whom the currently logged-in account belongs; the voice information should then be ignored, and the step of performing speech recognition on the collected voice information and its subsequent steps are not executed.
The similarity threshold can be set according to the actual situation, for example to 70%, 80%, 90%, or another value.
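The similarity formula is not reproduced in this text. As a stand-in, the sketch below uses cosine similarity, a common choice for comparing feature vectors; it is a substituted technique, not necessarily the measure the patent defines. The function names and the 80% default threshold are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_account_owner(vo_print_vec, reference_vec, threshold=0.8):
    """Proceed to speech recognition only if the similarity is strictly
    greater than the threshold; otherwise the voice information is ignored."""
    return cosine_similarity(vo_print_vec, reference_vec) > threshold


accepted = is_account_owner([1.0, 2.0, 3.0], [1.0, 2.0, 3.1])
rejected = is_account_owner([1.0, 0.0], [0.0, 1.0])
```

The strict comparison mirrors the text: a similarity exactly equal to the threshold is treated as a failed match.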
Preferably, in this embodiment, different operation permissions can also be set for different users, so that each user can only control the application program to execute the operations covered by that user's permissions, as shown in the following table:

User      Operation permissions
User 1    Operation permission set 1 = {operation 1, operation 2, operation 3, operation 4}
User 2    Operation permission set 2 = {operation 1, operation 3, operation 4}
User 3    Operation permission set 3 = {operation 1, operation 2}
……        ……
……        ……

After the operation corresponding to the target control instruction has been determined, it is necessary to check whether the user has the corresponding operation permission. If the user does not have the permission, the operation is not executed and the user is prompted accordingly; if the user has the permission, the step of controlling the application program to execute the operation corresponding to the target control instruction is executed.
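The permission check can be sketched with a lookup table mirroring the example above. The user and operation identifiers, the handler mapping, and the wording of the prompt are hypothetical.

```python
# Permission table mirroring the example rows (identifiers are hypothetical).
OPERATION_RIGHTS = {
    "user1": {"operation1", "operation2", "operation3", "operation4"},
    "user2": {"operation1", "operation3", "operation4"},
    "user3": {"operation1", "operation2"},
}


def run_if_permitted(user, operation, handlers, rights=OPERATION_RIGHTS):
    """Execute the handler for the operation only if the user holds the permission;
    otherwise return a prompt and execute nothing."""
    if operation not in rights.get(user, set()):
        return f"{user} is not permitted to perform {operation}"
    return handlers[operation]()


# Hypothetical handler standing in for the application's real operation.
handlers = {"operation2": lambda: "operation2 executed"}

allowed = run_if_permitted("user3", "operation2", handlers)
denied = run_if_permitted("user2", "operation2", handlers)
```

Checking the permission before looking up the handler ensures that an unauthorized request never reaches the application's operation code.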
In summary, after receiving a voice collection instruction, an embodiment of the invention collects the voice information input by the user, performs speech recognition on the collected voice information to obtain the corresponding text information, determines the target control instruction for the application program by calculating matching degrees, and controls the application program to execute the operation corresponding to the target control instruction. With the embodiments of the invention, a user can issue control instructions to an application program by voice and the application program executes the corresponding operation automatically, which simplifies operation, greatly improves efficiency, and gives the user a better experience.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the invention in any way.
Corresponding to the application program control method described in the foregoing embodiments, Fig. 4 shows a structural diagram of one embodiment of an application program control apparatus provided by the embodiments of the invention.
In this embodiment, an application program control apparatus may include:
a voice information collection module 401, configured to collect the voice information input by a user after receiving a voice collection instruction, the voice information containing a control instruction for an application program;
a speech recognition module 402, configured to perform speech recognition on the collected voice information to obtain text information corresponding to the voice information;
a matching degree calculation module 403, configured to calculate the matching degree between the text information and each control instruction in a preset control instruction set;
a target control instruction selection module 404, configured to select, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program;
an operation execution module 405, configured to control the application program to execute the operation corresponding to the target control instruction.
Further, the matching degree calculation module may include:
A keyword set determination unit, configured to determine the keyword set corresponding to each control instruction, and to calculate the classification discrimination degree of each keyword in each keyword set;
A frequency statistics unit, configured to count the frequency with which each keyword appears in the text information;
A matching degree calculation unit, configured to calculate the matching degree between the text information and each control instruction according to the following formula:
Where c is the serial number of a control instruction, 1 ≤ c ≤ ClassNum, ClassNum is the total number of control instructions, kn is the serial number of a keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the frequency with which the kn-th keyword in the keyword set corresponding to the c-th control instruction appears in the text information, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
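A minimal sketch of this scoring step is given below. The formula itself is not reproduced in this text, so the sketch assumes one natural reading of the symbol definitions: MatchDeg_c is the sum, over the keywords of the c-th instruction, of MsgKWNum_{c,kn} multiplied by ClassDeg_{c,kn}. That weighted sum is an assumption, and the names `match_degrees` and `choose_target` are hypothetical.

```python
from typing import Dict, List


def match_degrees(
    tokens: List[str],
    keyword_sets: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Score each instruction against the recognized text.

    tokens: the recognized text, already split into words.
    keyword_sets: instruction -> {keyword: ClassDeg of that keyword}.
    """
    degrees = {}
    for command, keywords in keyword_sets.items():
        # Assumed form: sum of (frequency in text) * (discrimination degree).
        degrees[command] = sum(
            tokens.count(keyword) * class_deg
            for keyword, class_deg in keywords.items()
        )
    return degrees


def choose_target(degrees: Dict[str, float]) -> str:
    """Return the instruction with the highest matching degree."""
    return max(degrees, key=degrees.get)
```

Under this reading, a keyword shared by several instructions contributes to each of them, so instructions are separated mainly by their high-discrimination keywords.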
Further, the keyword set determination unit may include:
A word segmentation subunit, configured to perform word segmentation on each corpus entry in a preset corpus to obtain individual words, the corpus containing a sub-corpus corresponding to each control instruction;
A frequency statistics subunit, configured to count the frequency with which each word appears in each sub-corpus;
A classification discrimination degree calculation subunit, configured to calculate the classification discrimination degree of each word according to the following formula:
Where w is the serial number of a word, 1 ≤ w ≤ WordNum, WordNum is the total number of words, FreqSeq_w is the sequence of frequencies with which the w-th word appears in each sub-corpus, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency with which the w-th word appears in the sub-corpus corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, that is, FreqSeq'_w = FreqSeq_w - MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word;
A keyword determination subunit, configured to select the words whose classification discrimination degree is greater than a preset discrimination threshold as keywords, and to determine the control instruction corresponding to each keyword according to the following formula:
TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})
Where TgtKwSet_w is the serial number of the control instruction corresponding to the w-th keyword;
A keyword set construction subunit, configured to assemble the keywords corresponding to the c-th control instruction into the keyword set corresponding to the c-th control instruction.
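The keyword-set construction can be sketched as below. Counting per sub-corpus, thresholding on ClassDeg_w and assigning each keyword by argmax over FreqSeq_w follow the text. The ClassDeg formula itself is not reproduced in this text, so `stand_in_class_deg` is a purely hypothetical placeholder (the largest frequency divided by one plus the largest remaining frequency) and should be replaced by the actual formula. The sub-corpora are assumed to be already segmented into words.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def stand_in_class_deg(freq_seq: Sequence[int]) -> float:
    """Hypothetical placeholder for ClassDeg_w, not the patent's formula."""
    rest = list(freq_seq)
    rest.remove(max(rest))  # FreqSeq'_w: the sequence without its maximum
    return max(freq_seq) / (max(rest, default=0) + 1)


def build_keyword_sets(
    sub_corpora: Dict[str, List[List[str]]],
    threshold: float,
    class_deg: Callable[[Sequence[int]], float] = stand_in_class_deg,
) -> Dict[str, Dict[str, float]]:
    """Return instruction -> {keyword: ClassDeg} from segmented sub-corpora."""
    commands = list(sub_corpora)
    # Frequency of each word in each instruction's sub-corpus.
    counts = {
        c: Counter(word for sentence in sub_corpora[c] for word in sentence)
        for c in commands
    }
    vocabulary = set().union(*counts.values())
    keyword_sets: Dict[str, Dict[str, float]] = {c: {} for c in commands}
    for word in vocabulary:
        freq_seq = [counts[c][word] for c in commands]  # FreqSeq_w
        score = class_deg(freq_seq)
        if score > threshold:
            # TgtKwSet_w = argmax(FreqSeq_w)
            target = commands[freq_seq.index(max(freq_seq))]
            keyword_sets[target][word] = score
    return keyword_sets
```

With any score of this general shape, words that are equally frequent in every sub-corpus fall below the threshold and are not used as keywords.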
Further, the application program control device may also include:
A voiceprint feature vector calculation module, configured to calculate the voiceprint feature vector of the voice information;
A reference feature vector query module, configured to query a preset database for the reference feature vector corresponding to the user;
A similarity calculation module, configured to calculate the similarity between the voiceprint feature vector of the voice information and the reference feature vector according to the following formula:
Where n is the serial number of an element of the voiceprint feature vector of the voice information, 1 ≤ n ≤ N, N is the total number of elements of the voiceprint feature vector of the voice information, VpElem_n is the n-th element of the voiceprint feature vector of the voice information, StVpElem_n is the n-th element of the reference feature vector, and SimDeg is the similarity between the voiceprint feature vector of the voice information and the reference feature vector.
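As an illustration of how such a similarity might gate speech recognition (claim 4 proceeds only when SimDeg exceeds a preset similarity threshold), the sketch below uses cosine similarity between VpElem and StVpElem. The SimDeg formula is not reproduced in this text, so cosine similarity is a stand-in chosen for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(vp: Sequence[float], st_vp: Sequence[float]) -> float:
    """Stand-in for SimDeg: cosine of the angle between the two vectors."""
    if len(vp) != len(st_vp):
        raise ValueError("both vectors must have N elements")
    dot = sum(a * b for a, b in zip(vp, st_vp))
    norm = math.sqrt(sum(a * a for a in vp)) * math.sqrt(sum(b * b for b in st_vp))
    return dot / norm if norm else 0.0


def speaker_accepted(
    vp: Sequence[float], st_vp: Sequence[float], threshold: float
) -> bool:
    """True when the similarity exceeds the preset similarity threshold."""
    return cosine_similarity(vp, st_vp) > threshold
```

Only when `speaker_accepted` returns true would the collected voice information be passed on to speech recognition.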
Further, the voiceprint feature vector calculation module may include:
A voice sub-segment division unit, configured to divide the voice information into M voice sub-segments, where M is an integer greater than 1;
A Mel-frequency cepstral coefficient vector calculation unit, configured to calculate the Mel-frequency cepstral coefficient vector of each voice sub-segment according to the following formula:
MelVec_m = MFCCFuc(SubVoice_m)
Where m is the serial number of a voice sub-segment, 1 ≤ m ≤ M, SubVoice_m is the m-th voice sub-segment, MFCCFuc is a preset Mel-frequency cepstral coefficient (MFCC) calculation function, MelVec_m is the Mel-frequency cepstral coefficient vector of the m-th voice sub-segment, MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, ..., MelCoe_{m,n}, ..., MelCoe_{m,N}), and MelCoe_{m,n} is the n-th Mel-frequency cepstral coefficient of the m-th voice sub-segment;
A weight coefficient calculation unit, configured to calculate the weight coefficient of each voice sub-segment according to the following formula:
Where Weight_m is the weight coefficient of the m-th voice sub-segment;
A voiceprint feature vector construction unit, configured to construct the voiceprint feature vector of the voice information according to the following formula:
VoPrintVec = (VpElem_1, VpElem_2, ..., VpElem_n, ..., VpElem_N)
Where VoPrintVec is the voiceprint feature vector of the voice information.
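The construction of VoPrintVec can be sketched as follows, under clearly stated assumptions. Splitting into M sub-segments and applying MFCCFuc to each follows the text. The formulas for Weight_m and VpElem_n are not reproduced in this text, so the sketch assumes that each VpElem_n is the weighted sum of MelCoe_{m,n} over the sub-segments and that the weights default to 1/M. The MFCC function is passed in by the caller and no real audio library is assumed.

```python
from typing import Callable, List, Optional, Sequence


def split_segments(samples: Sequence[float], m: int) -> List[Sequence[float]]:
    """Divide the samples into M contiguous sub-segments of near-equal length."""
    if m <= 1:
        raise ValueError("M must be an integer greater than 1")
    bounds = [i * len(samples) // m for i in range(m + 1)]
    return [samples[bounds[i]:bounds[i + 1]] for i in range(m)]


def voiceprint_vector(
    samples: Sequence[float],
    m: int,
    mfcc_func: Callable[[Sequence[float]], List[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-segment MFCC vectors into one N-element vector."""
    segments = split_segments(samples, m)
    mel_vecs = [mfcc_func(segment) for segment in segments]  # MelVec_m
    if weights is None:
        weights = [1.0 / m] * m  # assumption: uniform Weight_m
    n_coeffs = len(mel_vecs[0])
    # Assumed form: VpElem_n = sum over m of Weight_m * MelCoe_{m,n}.
    return [
        sum(w * vec[n] for w, vec in zip(weights, mel_vecs))
        for n in range(n_coeffs)
    ]
```

Any MFCC implementation that returns a fixed-length coefficient vector per sub-segment can be supplied as `mfcc_func`.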
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
Fig. 5 shows a schematic block diagram of a terminal device provided by an embodiment of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
In this embodiment, the terminal device 5 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device 5 may include a processor 50, a memory 51, and computer-readable instructions 52 stored in the memory 51 and executable on the processor 50, for example computer-readable instructions for executing the above application program control method. When executing the computer-readable instructions 52, the processor 50 implements the steps in the above embodiments of the application program control method, for example steps S101 to S105 shown in Fig. 1. Alternatively, when executing the computer-readable instructions 52, the processor 50 implements the functions of the modules/units in the above device embodiments, for example the functions of modules 401 to 405 shown in Fig. 4.
Illustratively, the computer-readable instructions 52 may be divided into one or more modules/units, which are stored in the memory 51 and executed by the processor 50 to carry out the present invention. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, the instruction segments being used to describe the execution of the computer-readable instructions 52 in the terminal device 5.
The processor 50 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 51 may be an internal storage unit of the terminal device 5, such as a hard disk or internal memory of the terminal device 5. The memory 51 may also be an external storage device of the terminal device 5, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) fitted to the terminal device 5. Further, the memory 51 may include both an internal storage unit of the terminal device 5 and an external storage device. The memory 51 is used to store the computer-readable instructions and other instructions and data required by the terminal device 5. The memory 51 may also be used to temporarily store data that has been output or is about to be output.
The functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several computer-readable instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing computer-readable instructions, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An application program control method, characterized by comprising:
After a voice collection instruction is received, collecting the voice information input by the user, the voice information containing a control instruction for an application program;
Performing speech recognition on the collected voice information to obtain text information corresponding to the voice information;
Calculating the matching degree between the text information and each control instruction in a preset control instruction set;
Selecting, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program, and controlling the application program to execute the operation corresponding to the target control instruction.
2. The application program control method according to claim 1, characterized in that calculating the matching degree between the text information and each control instruction in the preset control instruction set comprises:
Determining the keyword set corresponding to each control instruction, and calculating the classification discrimination degree of each keyword in each keyword set;
Counting the frequency with which each keyword appears in the text information;
Calculating the matching degree between the text information and each control instruction according to the following formula:
Where c is the serial number of a control instruction, 1 ≤ c ≤ ClassNum, ClassNum is the total number of control instructions, kn is the serial number of a keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the frequency with which the kn-th keyword in the keyword set corresponding to the c-th control instruction appears in the text information, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
3. The application program control method according to claim 2, characterized in that determining the keyword set corresponding to each control instruction and calculating the classification discrimination degree of each keyword in each keyword set comprises:
Performing word segmentation on each corpus entry in a preset corpus to obtain individual words, the corpus containing a sub-corpus corresponding to each control instruction;
Counting the frequency with which each word appears in each sub-corpus;
Calculating the classification discrimination degree of each word according to the following formula:
Where w is the serial number of a word, 1 ≤ w ≤ WordNum, WordNum is the total number of words, FreqSeq_w is the sequence of frequencies with which the w-th word appears in each sub-corpus, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency with which the w-th word appears in the sub-corpus corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, that is, FreqSeq'_w = FreqSeq_w - MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word;
Selecting the words whose classification discrimination degree is greater than a preset discrimination threshold as keywords, and determining the control instruction corresponding to each keyword according to the following formula:
TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})
Where TgtKwSet_w is the serial number of the control instruction corresponding to the w-th keyword;
Assembling the keywords corresponding to the c-th control instruction into the keyword set corresponding to the c-th control instruction.
4. The application program control method according to any one of claims 1 to 3, characterized in that, before performing speech recognition on the collected voice information, the method further comprises:
Calculating the voiceprint feature vector of the voice information, and querying a preset database for the reference feature vector corresponding to the user;
Calculating the similarity between the voiceprint feature vector of the voice information and the reference feature vector according to the following formula:
Where n is the serial number of an element of the voiceprint feature vector of the voice information, 1 ≤ n ≤ N, N is the total number of elements of the voiceprint feature vector of the voice information, VpElem_n is the n-th element of the voiceprint feature vector of the voice information, StVpElem_n is the n-th element of the reference feature vector, and SimDeg is the similarity between the voiceprint feature vector of the voice information and the reference feature vector;
If the similarity between the voiceprint feature vector of the voice information and the reference feature vector is greater than a preset similarity threshold, performing the step of performing speech recognition on the collected voice information and its subsequent steps.
5. The application program control method according to claim 4, characterized in that calculating the voiceprint feature vector of the voice information comprises:
Dividing the voice information into M voice sub-segments, where M is an integer greater than 1;
Calculating the Mel-frequency cepstral coefficient vector of each voice sub-segment according to the following formula:
MelVec_m = MFCCFuc(SubVoice_m)
Where m is the serial number of a voice sub-segment, 1 ≤ m ≤ M, SubVoice_m is the m-th voice sub-segment, MFCCFuc is a preset Mel-frequency cepstral coefficient (MFCC) calculation function, MelVec_m is the Mel-frequency cepstral coefficient vector of the m-th voice sub-segment, MelVec_m = (MelCoe_{m,1}, MelCoe_{m,2}, ..., MelCoe_{m,n}, ..., MelCoe_{m,N}), and MelCoe_{m,n} is the n-th Mel-frequency cepstral coefficient of the m-th voice sub-segment;
Calculating the weight coefficient of each voice sub-segment according to the following formula:
Where Weight_m is the weight coefficient of the m-th voice sub-segment;
Constructing the voiceprint feature vector of the voice information according to the following formula:
VoPrintVec = (VpElem_1, VpElem_2, ..., VpElem_n, ..., VpElem_N)
Where VoPrintVec is the voiceprint feature vector of the voice information.
6. An application program control device, characterized by comprising:
A voice information collection module, configured to collect, after a voice collection instruction is received, the voice information input by the user, the voice information containing a control instruction for an application program;
A speech recognition module, configured to perform speech recognition on the collected voice information to obtain text information corresponding to the voice information;
A matching degree calculation module, configured to calculate the matching degree between the text information and each control instruction in a preset control instruction set;
A target control instruction selection module, configured to select, from the control instruction set, the control instruction with the highest matching degree with the text information as the target control instruction for the application program;
An operation execution module, configured to control the application program to execute the operation corresponding to the target control instruction.
7. The application program control device according to claim 6, characterized in that the matching degree calculation module comprises:
A keyword set determination unit, configured to determine the keyword set corresponding to each control instruction, and to calculate the classification discrimination degree of each keyword in each keyword set;
A frequency statistics unit, configured to count the frequency with which each keyword appears in the text information;
A matching degree calculation unit, configured to calculate the matching degree between the text information and each control instruction according to the following formula:
Where c is the serial number of a control instruction, 1 ≤ c ≤ ClassNum, ClassNum is the total number of control instructions, kn is the serial number of a keyword, 1 ≤ kn ≤ KwNum_c, KwNum_c is the total number of keywords in the keyword set corresponding to the c-th control instruction, MsgKWNum_{c,kn} is the frequency with which the kn-th keyword in the keyword set corresponding to the c-th control instruction appears in the text information, ClassDeg_{c,kn} is the classification discrimination degree of the kn-th keyword in the keyword set corresponding to the c-th control instruction, and MatchDeg_c is the matching degree between the text information and the c-th control instruction.
8. The application program control device according to claim 7, characterized in that the keyword set determination unit comprises:
A word segmentation subunit, configured to perform word segmentation on each corpus entry in a preset corpus to obtain individual words, the corpus containing a sub-corpus corresponding to each control instruction;
A frequency statistics subunit, configured to count the frequency with which each word appears in each sub-corpus;
A classification discrimination degree calculation subunit, configured to calculate the classification discrimination degree of each word according to the following formula:
Where w is the serial number of a word, 1 ≤ w ≤ WordNum, WordNum is the total number of words, FreqSeq_w is the sequence of frequencies with which the w-th word appears in each sub-corpus, FreqSeq_w = [Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum}], Freq_{w,c} is the frequency with which the w-th word appears in the sub-corpus corresponding to the c-th control instruction, FreqSeq'_w is the sequence remaining after the maximum value is removed from FreqSeq_w, that is, FreqSeq'_w = FreqSeq_w - MAX(FreqSeq_w), MAX is the maximum function, and ClassDeg_w is the classification discrimination degree of the w-th word;
A keyword determination subunit, configured to select the words whose classification discrimination degree is greater than a preset discrimination threshold as keywords, and to determine the control instruction corresponding to each keyword according to the following formula:
TgtKwSet_w = argmax(FreqSeq_w) = argmax(Freq_{w,1}, Freq_{w,2}, ..., Freq_{w,c}, ..., Freq_{w,ClassNum})
Where TgtKwSet_w is the serial number of the control instruction corresponding to the w-th keyword;
A keyword set construction subunit, configured to assemble the keywords corresponding to the c-th control instruction into the keyword set corresponding to the c-th control instruction.
9. A computer-readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the steps of the application program control method according to any one of claims 1 to 5.
10. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor, when executing the computer-readable instructions, implements the steps of the application program control method according to any one of claims 1 to 5.
CN201811210044.1A 2018-10-17 2018-10-17 A kind of application control method, device, readable storage medium storing program for executing and terminal device Pending CN109584865A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811210044.1A CN109584865A (en) 2018-10-17 2018-10-17 A kind of application control method, device, readable storage medium storing program for executing and terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811210044.1A CN109584865A (en) 2018-10-17 2018-10-17 A kind of application control method, device, readable storage medium storing program for executing and terminal device

Publications (1)

Publication Number Publication Date
CN109584865A true CN109584865A (en) 2019-04-05

Family

ID=65920096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811210044.1A Pending CN109584865A (en) 2018-10-17 2018-10-17 A kind of application control method, device, readable storage medium storing program for executing and terminal device

Country Status (1)

Country Link
CN (1) CN109584865A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06161488A (en) * 1992-11-17 1994-06-07 Ricoh Co Ltd Speech recognizing device
CN101123428A (en) * 2006-08-09 2008-02-13 马昊 Intelligent electronic remote control switch for voice recognition capable of dynamic setting
CN101447185A (en) * 2008-12-08 2009-06-03 深圳市北科瑞声科技有限公司 Audio frequency rapid classification method based on content
CN201514761U (en) * 2009-09-23 2010-06-23 上海大屯能源股份有限公司 Household voice controller
US20100179811A1 (en) * 2009-01-13 2010-07-15 Crim Identifying keyword occurrences in audio data
CN107329843A (en) * 2017-06-30 2017-11-07 百度在线网络技术(北京)有限公司 Application program sound control method, device, equipment and storage medium
CN107492374A (en) * 2017-10-11 2017-12-19 深圳市汉普电子技术开发有限公司 A kind of sound control method, smart machine and storage medium
CN108182937A (en) * 2018-01-17 2018-06-19 出门问问信息科技有限公司 Keyword recognition method, device, equipment and storage medium
CN108597512A (en) * 2018-04-27 2018-09-28 努比亚技术有限公司 Method for controlling mobile terminal, mobile terminal and computer readable storage medium

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147216A (en) * 2019-04-16 2019-08-20 深圳壹账通智能科技有限公司 Page switching method, device, computer equipment and the storage medium of application program
CN110109365A (en) * 2019-04-24 2019-08-09 平安科技(深圳)有限公司 Speaker control method, device and computer readable storage medium
CN110171005A (en) * 2019-06-10 2019-08-27 杭州任你说智能科技有限公司 A kind of tourism robot system based on intelligent sound box
CN111292742A (en) * 2020-01-14 2020-06-16 京东数字科技控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN112825030A (en) * 2020-02-28 2021-05-21 腾讯科技(深圳)有限公司 Application program control method, device, equipment and storage medium
CN112825030B (en) * 2020-02-28 2023-09-19 腾讯科技(深圳)有限公司 Application program control method, device, equipment and storage medium
CN112599125A (en) * 2020-12-02 2021-04-02 一汽资本控股有限公司 Voice office processing method and device, terminal and storage medium
CN112581957A (en) * 2020-12-04 2021-03-30 浪潮电子信息产业股份有限公司 Computer voice control method, system and related device
WO2023093121A1 (en) * 2021-11-29 2023-06-01 中兴通讯股份有限公司 Voice control method, terminal device, server, and storage medium

Similar Documents

Publication Publication Date Title
CN109584865A (en) A kind of application control method, device, readable storage medium storing program for executing and terminal device
WO2021093449A1 (en) Wakeup word detection method and apparatus employing artificial intelligence, device, and medium
WO2023273170A1 (en) Welcoming robot conversation method
US11823678B2 (en) Proactive command framework
Wu et al. Emotion recognition from text using semantic labels and separable mixture models
US7542901B2 (en) Methods and apparatus for generating dialog state conditioned language models
CN107315766A (en) A kind of voice response method and its device for gathering intelligence and artificial question and answer
CN109509470A (en) Voice interactive method, device, computer readable storage medium and terminal device
CN110277088B (en) Intelligent voice recognition method, intelligent voice recognition device and computer readable storage medium
CN109767787A (en) Emotion identification method, equipment and readable storage medium storing program for executing
CN103971675A (en) Automatic voice recognizing method and system
US11735190B2 (en) Attentive adversarial domain-invariant training
CN106847279A (en) Man-machine interaction method based on robot operating system ROS
CN112071310A (en) Speech recognition method and apparatus, electronic device, and storage medium
KR101677859B1 (en) Method for generating system response using knowledgy base and apparatus for performing the method
WO2023279691A1 (en) Speech classification method and apparatus, model training method and apparatus, device, medium, and program
Liu et al. Speech emotion recognition using an enhanced co-training algorithm
CN112199498A (en) Man-machine conversation method, device, medium and electronic equipment for endowment service
US20220277732A1 (en) Method and apparatus for training speech recognition model, electronic device and storage medium
US10706086B1 (en) Collaborative-filtering based user simulation for dialog systems
Sohail et al. Text classification in an under-resourced language via lexical normalization and feature pooling
Zajíc et al. First insight into the processing of the language consulting center data
Matsubara et al. Example-based speech intention understanding and its application to in-car spoken dialogue system
CN108804411B (en) A kind of semantic role analysis method, computer readable storage medium and terminal device
CN112632234A (en) Human-computer interaction method and device, intelligent robot and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination