CN109448727A - Voice interactive method and device - Google Patents
Voice interaction method and device
- Publication number
- CN109448727A (application CN201811098577.5A)
- Authority
- CN
- China
- Prior art keywords
- voice
- control
- short identifier
- voice information
- target control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/451—Execution arrangements for user interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
The present invention relates to a voice interaction method, a device, an electronic device, and a storage medium. The method comprises: querying control information of the controls contained in the current application window, and obtaining, based on the control information, the controls on which a preset operation can be performed as target controls; assigning a short identifier to each target control according to a preset rule, and presenting the corresponding short identifier at the position of each target control; receiving voice information and recognizing it to obtain the short identifier to be responded to that is contained in the voice information; determining the position of the short identifier to be responded to in the current application window as the target position, and simulating the preset operation at the target position to trigger the corresponding target control. The invention can broaden the scope of application of voice interaction.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a voice interaction method, a device, an electronic device, and a computer-readable storage medium.
Background
Voice is the mode of communication that humans are most accustomed to. It is not only more natural than other means of communication, but also has advantages such as low cognitive load, low resource usage, and high interaction efficiency. As a powerful, general-purpose control entry point, voice is already widely used in electronic devices such as personal computers and communication terminals: by entering speech, users can perform operations such as querying, searching, and making phone calls on the device, which is convenient.
Existing voice interaction approaches usually require the applications in the electronic device that support voice operation to be customized. The interaction process comprises the following steps: after a customized application displays a window that supports voice operation, it registers the set of voice commands supported by that window with the voice service provided by the operating system; when the voice service receives voice information input by the user and detects that it matches a voice command in the set registered by the application, it converts the voice information into a corresponding control instruction and sends it to the corresponding window of the application, and the application responds through code customized in advance.
However, on the one hand, developing voice interaction functionality separately for each window of each application greatly increases the developers' workload; on the other hand, the many applications for which no such functionality has been custom-developed cannot be used with voice interaction at all, which hinders the adoption of voice interaction on electronic devices.
Around the topic of voice interaction, several prior-art patent applications have made useful attempts, for example:
Patent application No. CN201410634017.2 discloses a software operation method and system based on voice interaction. The software and the voice assistant run independently; the voice assistant obtains the execution item information of the software operation, matches the speech recognition result against the obtained execution item information, and, for the matched execution item information, has the software perform the operation according to the execution item element information, the execution item state information, and the voice information. The method and system operate the software according to its real-time execution item information, which makes voice software genuinely more intelligent; moreover, because the voice assistant runs separately from the software, one voice assistant can work with multiple pieces of software, which greatly saves system resources. However, each application contains a large number of different operation instructions, and the operation instructions of different applications differ widely, so the demands on the voice assistant's intelligence are very high.
Patent application No. CN201110081146.X discloses a speech recognition and interaction system that can be widely used in terminal devices such as PCs, mobile phones, and household appliances. The system comprises five parts: an interaction designer, an interaction executor, a platform abstraction library, interaction plug-ins, and a platform API core library. The interaction designer introduces a new interaction design method in which the design of an entire interaction can be completed through intuitive operations; the interaction executor interprets and executes interaction scripts; the interaction plug-ins extend the functions of the existing interaction platform abstraction library and add special applications; the platform abstraction library provides multi-platform portability and independence from any specific platform; and the platform API core library encapsulates the operating system API of a specific platform so that the platform abstraction library can call it conveniently. However, this scheme can implement only a small number of operation instructions and is difficult to apply to applications with many operations.
Patent application No. CN201610736268.0 discloses a control method and system based on voice interaction. In this method, a wake-up signal starts the voice interaction system, which listens for voice information in real time, converts the voice information it hears into text information, and analyzes the converted text. By comparison with function parameters pre-stored in the system, it judges whether the function parameters of the text information corresponding to the voice information are complete. If they are complete, the corresponding operation is executed; if not, the user is prompted, according to the missing function parameters, to supply the operation to be performed, so that the system can be operated in real time by voice. With this control method and system, different functions, or different execution parameters of the same function, can be selected at any time to meet different user needs. However, this scheme has problems similar to those of CN201410634017.2.
It is therefore desirable to provide a voice interaction method that is more adaptable and recognizes speech faster and more accurately, and that can solve at least one or more of the above technical problems.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The object of the present invention is to provide a voice interaction method, a device, an electronic device, and a computer-readable storage medium, thereby overcoming, at least to some extent, one or more problems caused by the limitations and defects of the related art.
According to one aspect of the present invention, a voice interaction method is provided, the method comprising:
querying control information of the controls contained in the current application window, and obtaining, based on the control information, the controls on which a preset operation can be performed as target controls;
assigning a short identifier to each target control according to a preset rule, and presenting the corresponding short identifier at the position of each target control;
receiving voice information and recognizing the voice information to obtain the short identifier to be responded to that is contained in the voice information;
determining the position of the short identifier to be responded to in the current application window as the target position, and simulating the preset operation at the target position to trigger the corresponding target control.
In an exemplary embodiment of the present invention, obtaining, based on the control information, the controls on which the preset operation can be performed as target controls comprises:
for each control in the current application window, obtaining the trigger operation type contained in the control information of the control, and judging whether the trigger operation type is consistent with the preset operation;
if the trigger operation type is consistent with the preset operation, taking the corresponding control as a target control.
In an exemplary embodiment of the present invention, the preset operation is a touch click operation and/or a mouse click operation.
In an exemplary embodiment of the present invention, assigning a short identifier to each target control according to a preset rule comprises:
assigning, in a preset order, a numeric identifier, a letter identifier, or a custom identifier to each target control in turn.
In an exemplary embodiment of the present invention, recognizing the voice information to obtain the short identifier to be responded to that is contained in the voice information comprises:
performing speech recognition on the voice information to convert the voice information into text information;
performing a matching operation on the text information to obtain the short identifier to be responded to that it contains.
In an exemplary embodiment of the present invention, performing speech recognition on the voice information comprises:
performing speech recognition on the voice information with one or more of a deep neural network model, a hidden Markov model, and a Gaussian mixture model.
In an exemplary embodiment of the present invention, simulating the preset operation at the target position comprises:
performing the preset operation at the target position by simulating the action of a manual input device, the manual input device including a touch screen and/or a mouse.
According to one aspect of the present invention, a voice interaction device is provided, the device comprising:
a target control detection module, configured to query control information of the controls contained in the current application window and to obtain, based on the control information, the controls on which a preset operation can be performed as target controls;
a short identifier assignment module, configured to assign a short identifier to each target control according to a preset rule and to present the corresponding short identifier at the position of each target control;
a short identifier recognition module, configured to receive voice information and recognize the voice information to obtain the short identifier to be responded to that is contained in the voice information;
an operation simulation module, configured to determine the position of the short identifier to be responded to in the current application window as the target position, and to simulate the preset operation at the target position to trigger the corresponding target control.
According to one aspect of the present invention, an electronic device is provided, comprising:
a processor; and
a memory storing computer-readable instructions which, when executed by the processor, implement the method according to any one of the above.
According to one aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the method according to any one of the above.
In the voice interaction method of the exemplary embodiments of the present invention, short identifiers are first assigned to the target controls; next, the short identifier to be responded to that is contained in the received voice information is recognized; finally, the position of that short identifier in the current application window is determined as the target position, and the preset operation is simulated at the target position to trigger the corresponding target control. On the one hand, voice support is not provided for a particular application or a particular window of an application, but at the system level; and because the preset operation is simulated, the application can respond to it with its ordinary processing logic. For these two reasons, developers need not do any voice-related adaptation of their program code, so developer workload is reduced while voice interaction is fully supported. On the other hand, the invention uniformly assigns distinct preset short identifiers to the target controls, so the voice information a user will utter can be anticipated; this both greatly reduces the feature library needed for speech recognition and ensures recognition accuracy. The invention can therefore further promote the adoption of voice interaction on electronic devices.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The above and other features and advantages of the present invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a voice interaction method according to an exemplary embodiment of the present invention;
Fig. 2 shows a schematic diagram of short identifier positions according to an exemplary embodiment of the present invention;
Fig. 3 shows a schematic block diagram of a voice interaction device according to an exemplary embodiment of the present invention;
Fig. 4 schematically shows a block diagram of an electronic device according to an exemplary embodiment of the present invention; and
Fig. 5 schematically shows a computer-readable storage medium according to an exemplary embodiment of the present invention.
Detailed description of the embodiments
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. Identical reference numerals in the drawings denote identical or similar parts, so their repeated description will be omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a thorough understanding of the embodiments of the present invention. However, those skilled in the art will appreciate that the technical solution of the present invention may be practiced without one or more of these specific details, or with other methods, components, materials, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the present invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or in different networks and/or processor devices and/or microcontroller devices.
In this exemplary embodiment, a voice interaction method is first provided, which can be applied to electronic devices such as computers or mobile terminals. Referring to Fig. 1, the voice interaction method may comprise the following steps:
Step S110: query control information of the controls contained in the current application window, and obtain, based on the control information, the controls on which a preset operation can be performed as target controls;
Step S120: assign a short identifier to each target control according to a preset rule, and present the corresponding short identifier at the position of each target control;
Step S130: receive voice information and recognize the voice information to obtain the short identifier to be responded to that is contained in the voice information;
Step S140: determine the position of the short identifier to be responded to in the current application window as the target position, and simulate the preset operation at the target position to trigger the corresponding target control.
With the voice interaction method of this example embodiment, on the one hand, voice support is not provided for a particular application or a particular window of an application, but at the system level; and because the preset operation is simulated, the application can respond to it with its ordinary processing logic. For these two reasons, developers need not do any voice-related adaptation of their program code, so developer workload is reduced while voice interaction is fully supported. On the other hand, the invention uniformly assigns distinct preset short identifiers to the target controls, so the voice information a user will utter can be anticipated; this both greatly reduces the feature library needed for speech recognition and ensures recognition accuracy. The invention can therefore further promote the adoption of voice interaction on electronic devices.
The voice interaction method of this example embodiment is described in more detail below.
In step S110, control information of the controls contained in the current application window is queried, and the controls on which a preset operation can be performed are obtained, based on the control information, as target controls.
In this example embodiment, the current application window may be the window of the application running in the foreground. Taking the Android operating system as an example, when an application starts, a task stack holding multiple elements is initialized to store the current page component (activity) and the history page components (activities). The current page component is a component of the application running in the foreground, or a component of another application that was activated by a component of the foreground application; the page components other than the current one are components of applications running in the background. Of course, in other exemplary embodiments of the invention, the current application window may also include system windows, such as the system desktop window; this exemplary embodiment places no particular limitation on this.
Taking a mobile terminal device as an example, the controls contained in the current application window may include, for example, buttons, hyperlinks, lists, and text input boxes. Moreover, the operating system of a mobile terminal device usually provides a window management service, with which applications and system programs interact so that window content is displayed on the screen. Therefore, when implementing this technical solution, the window management service can be monitored to obtain the controls contained in the current application window and the control information of those controls. Specifically, when an application wants a window to be presented in the foreground, it sends a request to the window management service, which triggers the service to call the corresponding window display processing function; monitoring code can therefore be added to that function to obtain the controls contained in the window the application wants presented in the foreground, together with their control information.
For each control in the current application window, the corresponding control information may include the control type, the control name, the control's trigger operation type, the control's layout information, the control ID, and so on. In this example embodiment, the trigger operation type can be obtained from the control information of each control, and it can then be judged whether the trigger operation type is consistent with the preset operation. In this example embodiment, the preset operation may be a touch click operation and/or a mouse click operation. For example, suppose that on a mobile terminal device the current application window contains a refresh button, a forward button, hyperlink URL 1, and a display control. For the refresh button, the forward button, and hyperlink URL 1, the corresponding trigger operation type is a touch click operation; it can therefore be judged that the trigger operation types of these three controls are consistent with the preset operation, and the refresh button, the forward button, and hyperlink URL 1 can be taken as target controls.
Of course, in other exemplary embodiments of the invention, the preset operation may also include other operations such as a slide operation, a long-press operation, or a press operation; correspondingly, the target controls may also include other types of controls. This exemplary embodiment places no particular limitation on this.
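The selection of target controls in step S110 can be sketched as a simple filter over the queried control information. This is a minimal illustration only: the `ControlInfo` record, its field names, and the operation labels such as `"touch_click"` are assumptions made for the example and are not defined by the patent or by any platform API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ControlInfo:
    """Hypothetical record for the control information queried from a window."""
    control_id: str
    control_type: str
    trigger_ops: frozenset   # trigger operation types the control responds to
    bounds: tuple            # (left, top, right, bottom) in window coordinates

# Assumed labels for the preset operations (touch click and/or mouse click).
PRESET_OPS = frozenset({"touch_click", "mouse_click"})

def select_target_controls(controls, preset_ops=PRESET_OPS):
    """Keep only controls whose trigger operation type matches a preset operation."""
    return [c for c in controls if c.trigger_ops & preset_ops]

# The example window from the text: three clickable controls and one display control.
window_controls = [
    ControlInfo("refresh", "button", frozenset({"touch_click"}), (0, 0, 100, 50)),
    ControlInfo("forward", "button", frozenset({"touch_click"}), (110, 0, 210, 50)),
    ControlInfo("url1", "hyperlink", frozenset({"touch_click"}), (0, 60, 300, 90)),
    ControlInfo("picture", "display", frozenset(), (0, 100, 300, 400)),
]
targets = select_target_controls(window_controls)
print([c.control_id for c in targets])  # ['refresh', 'forward', 'url1']
```

Supporting further preset operations such as long-press would, under these assumptions, only require adding the corresponding label to `PRESET_OPS`.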
In step S120, a short identifier is assigned to each target control according to a preset rule, and the corresponding short identifier is presented at the position of each target control.
Referring to Fig. 2, the application window shown in Fig. 2 contains target controls 201 to 213. Taking three-digit short identifiers as an example, in this example embodiment the short identifiers 001 to 013 can be assigned to target controls 201 to 213 respectively, in ascending order. In other exemplary embodiments of the present invention, the short identifier may also be of another type, such as letters, letter combinations, or combinations of letters and digits; alternatively, it may be a user-defined identifier. This example embodiment also places no particular limitation on the length of a short identifier, although it should not be too long, for example preferably no more than 5 characters.
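As a rough illustration of the sequential numbering described for Fig. 2, the sketch below maps zero-padded three-digit identifiers to target controls in list order. The function name and the placeholder control names are hypothetical.

```python
def assign_short_ids(target_ids, width=3, start=1):
    """Map zero-padded numeric short identifiers to target controls in list order."""
    if len(target_ids) + start - 1 >= 10 ** width:
        raise ValueError("too many target controls for the chosen identifier width")
    return {f"{i:0{width}d}": cid for i, cid in enumerate(target_ids, start=start)}

# Stand-ins for target controls 201 to 213 of Fig. 2.
targets = [f"control_{n}" for n in range(201, 214)]
short_ids = assign_short_ids(targets)
print(short_ids["001"], short_ids["013"])  # control_201 control_213
```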
It should be noted that in Fig. 2 short identifiers are assigned according to the layout order of the target controls in the application window and the numerical order of the short identifiers, but the present invention is not limited to this. For example, common controls such as a forward button, a back button, a submit button, or a refresh button can be given fixed short identifiers, to reduce the user's learning cost. As another example, for any application window, after the system has automatically assigned a short identifier to each target control, the user can manually adjust the order of the short identifiers and save it to a configuration file; the next time the application window is loaded, the short identifiers can then be assigned according to the configuration file. Those skilled in the art can therefore adapt the above preset rule to specific needs, and this exemplary embodiment places no particular limitation on this.
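One possible way to combine these variants of the preset rule is sketched below: a saved per-window configuration takes priority, then fixed identifiers for common controls, then automatic numbering. The priority order, the `FIXED_IDS` table, the notion of a control "role", and the shape of the saved configuration are all assumptions made for illustration; the patent does not prescribe them.

```python
# Hypothetical fixed identifiers for common controls.
FIXED_IDS = {"forward": "001", "back": "002", "submit": "003", "refresh": "004"}

def assign_with_overrides(targets, saved=None, fixed=FIXED_IDS, width=3):
    """Assign short identifiers to (control_id, role) pairs.

    Priority: user-saved configuration, then fixed identifier for the role,
    then the next free number that is neither reserved nor already used.
    """
    saved = saved or {}
    reserved = set(fixed.values()) | set(saved.values())
    assigned, pending = {}, []
    for control_id, role in targets:
        short_id = saved.get(control_id) or fixed.get(role)
        if short_id and short_id not in assigned:
            assigned[short_id] = control_id
        else:
            pending.append(control_id)
    counter = 1
    for control_id in pending:
        while f"{counter:0{width}d}" in reserved or f"{counter:0{width}d}" in assigned:
            counter += 1
        assigned[f"{counter:0{width}d}"] = control_id
    return assigned

targets = [("btn_refresh", "refresh"), ("link_news", None),
           ("link_sport", None), ("btn_back", "back")]
saved_config = {"link_sport": "010"}   # e.g. loaded from the user's configuration file
print(assign_with_overrides(targets, saved_config))
```

Reserving the fixed identifiers even when the corresponding control is absent keeps, for instance, "004" meaning "refresh" in every window, which matches the stated goal of reducing the user's learning cost.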
The control information obtained in step S110 usually includes the layout information of a control, such as the control's display position on the screen and its size. Therefore, after a short identifier has been assigned to each target control, the corresponding short identifier can be presented at the position of each target control according to that control's layout information, for example uniformly at the center, the upper-left corner, or the upper-right corner of the target control. In this way, the user can see at a glance which short identifier corresponds to each target control. Of course, once the user is familiar with and has memorized the short identifiers, some or all of the short identifiers displayed in the window can also be hidden according to user settings, for reasons such as the appearance of the page; this likewise falls within the scope of protection of the present invention.
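Deriving the display position of a short identifier from a control's layout information can be sketched as follows. The `(left, top, right, bottom)` bounds format and the placement names are assumptions for this example.

```python
def label_anchor(bounds, where="top_left"):
    """Return the (x, y) point at which to draw a short identifier for a control.

    bounds: (left, top, right, bottom) of the control, in window coordinates.
    where:  "center", "top_left", or "top_right".
    """
    left, top, right, bottom = bounds
    if where == "center":
        return ((left + right) // 2, (top + bottom) // 2)
    if where == "top_left":
        return (left, top)
    if where == "top_right":
        return (right, top)
    raise ValueError(f"unknown placement: {where}")

print(label_anchor((10, 20, 110, 60), "center"))     # (60, 40)
print(label_anchor((10, 20, 110, 60), "top_right"))  # (110, 20)
```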
In step S130, voice information is received and recognized to obtain the short identifier to be responded to that is contained in the voice information.
In this example embodiment, the user can input voice information through a voice input device such as a microphone, and the mobile terminal device then receives the corresponding voice information. After the voice information is received, speech recognition can be performed on it to convert it into text information.
In this example embodiment, speech recognition can be performed on the voice information with one or more of a deep neural network model, a hidden Markov model, and a Gaussian mixture model, to obtain the corresponding text information. For example, temporal information can be modeled with a hidden Markov model; given a state of the hidden Markov model, the probability distribution of the speech feature vectors belonging to that state is modeled with a Gaussian mixture model, using a method such as the expectation-maximization algorithm. Once the models have been built, speech recognition can be performed on the voice information to obtain the corresponding text information. Of course, in other exemplary embodiments of the invention, speech recognition can also be performed in combination with context information (context dependent) or in other ways, and this exemplary embodiment places no particular limitation on this.
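For reference, the per-state modeling described above is commonly written as follows in the standard GMM-HMM formulation. The notation is the conventional one and is not taken from the patent: $\mathbf{o}_t$ is the speech feature vector at frame $t$, $j$ indexes a hidden Markov model state, and $M$ is the number of mixture components.

```latex
% Emission density of HMM state j, modeled by a Gaussian mixture with M components
b_j(\mathbf{o}_t) = \sum_{m=1}^{M} c_{jm}\,
  \mathcal{N}\!\left(\mathbf{o}_t;\ \boldsymbol{\mu}_{jm},\ \boldsymbol{\Sigma}_{jm}\right),
\qquad \sum_{m=1}^{M} c_{jm} = 1,\quad c_{jm} \ge 0
```

The mixture weights $c_{jm}$, means $\boldsymbol{\mu}_{jm}$, and covariances $\boldsymbol{\Sigma}_{jm}$ are the parameters that the expectation-maximization algorithm estimates for each state.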
After the text information is obtained, a matching operation can be performed on it to obtain the short mark to be responded to that it contains. For example, in this example embodiment, the user reads the sentence "007" aloud into a microphone; after speech recognition is performed on the received voice information, the text information "007" can be recognized, and matching "007" against the pre-stored short mark record table yields the short mark "007". As another example, the user reads the sentence "execute the 007 plan" aloud into a microphone; after speech recognition is performed on the received voice information, the text information "execute the 007 plan" can be recognized, and matching it against the pre-stored short mark record table yields the short mark "007". Correspondingly, if no short mark is matched in the text information based on the pre-stored short mark record table, the user can be prompted to input the voice information again. In addition, in other exemplary embodiments of the present invention, the user may also be required to add a wake-up word at the beginning of the voice information to reduce misoperation, which is not specifically limited in this exemplary embodiment.
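The matching operation described above can be sketched in Python as follows. This is only a minimal illustration under assumptions: the table contents, the control names and the function name `match_short_mark` are hypothetical, and the rule that a mark must not be embedded in a longer run of letters or digits is a design choice of this sketch rather than something the embodiment specifies.

```python
import re

# Hypothetical pre-stored short mark record table: mark -> labeled control.
SHORT_MARK_TABLE = {"007": "play_button", "008": "pause_button"}

def match_short_mark(text, table=SHORT_MARK_TABLE):
    """Return a short mark from the table that occurs in the text, or None."""
    # Try longer marks first so a longer mark is preferred over its prefix.
    for mark in sorted(table, key=len, reverse=True):
        # The mark must not be part of a longer run of letters or digits.
        pattern = r"(?<![0-9A-Za-z])" + re.escape(mark) + r"(?![0-9A-Za-z])"
        if re.search(pattern, text):
            return mark
    return None
```

When the function returns None, the caller would prompt the user to input the voice information again, as described above.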
Furthermore, because the present invention uniformly assigns different preset short marks to the target controls, the voice information a user may utter can be anticipated during speech recognition. As a result, not only can the feature library needed for speech recognition be greatly reduced, but the accuracy of speech recognition can also be ensured.
In step S140, the position of the short mark to be responded to in the current application window is determined as the target position, and the preset operation is simulated at the target position to trigger the corresponding target control.
In this example embodiment, the preset operation can be performed at the target position by simulating the action of a manual input device; the manual input device includes a touch screen and/or a mouse. Taking the Android operating system as an example, the sendevent command provided by Android can be used to send touch events to the device node corresponding to the touch screen. For example, the following instructions can be sent to the device node: first, a sendevent instruction that specifies the touch position on the screen and carries the coordinate values of the target position; then the sendevent commands corresponding to the click in a touch operation (that is, the down action and the up action as they are usually called). Sending these instructions fully simulates one touch click at the touch position. After the operating system distributes the received touch event to the window management service, the window management service converts the received click, according to its target position, into a touch event for the target control and sends that touch event to the current application window, triggering the window to perform the corresponding processing. Of course, different types of operation can be simulated in different ways, which is not specifically limited in this exemplary embodiment.
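The sendevent sequence described above can be sketched as follows. The Python function only builds the command strings (position first, then down and up) and does not execute them. The numeric constants are the standard Linux input event codes; the device node path `/dev/input/event2` and the function name `tap_commands` are assumptions for illustration, since the touch screen node differs between devices, and many real touch drivers expect additional multi-touch events (for example tracking IDs) that are omitted here.

```python
# Linux input event types and codes (see linux/input-event-codes.h).
EV_SYN, EV_KEY, EV_ABS = 0, 1, 3
SYN_REPORT = 0
BTN_TOUCH = 330            # 0x14a
ABS_MT_POSITION_X = 53     # 0x35
ABS_MT_POSITION_Y = 54     # 0x36

def tap_commands(x, y, device="/dev/input/event2"):
    """Build sendevent commands that simulate one touch click at (x, y).

    The device path is a hypothetical default; the real touch screen
    node must be looked up on the target device.
    """
    events = [
        (EV_ABS, ABS_MT_POSITION_X, x),   # touch position, X coordinate
        (EV_ABS, ABS_MT_POSITION_Y, y),   # touch position, Y coordinate
        (EV_KEY, BTN_TOUCH, 1),           # down action
        (EV_SYN, SYN_REPORT, 0),          # end of the "down" report
        (EV_KEY, BTN_TOUCH, 0),           # up action
        (EV_SYN, SYN_REPORT, 0),          # end of the "up" report
    ]
    return [f"sendevent {device} {etype} {code} {value}" for etype, code, value in events]
```

On a device, each returned string would typically be run in a shell with sufficient privileges; the exact sequence a given touch driver accepts should be confirmed by observing real touches first.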
Because the present invention simulates the preset operation, the application can respond to it with its regular processing logic. Developers therefore do not need to do any voice-specific adaptation of the program code; full support for voice interaction is provided while the developers' workload is reduced, which can promote the adoption of voice interaction on electronic devices.
It should be noted that although the steps of the method of the present invention are described in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps shown must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
In addition, this example embodiment also provides a voice interaction device. Referring to Fig. 3, the voice interaction device 300 may include a target control detection module 310, a short mark assignment module 320, a short mark recognition module 330 and an operation simulation module 340. Wherein:
The target control detection module 310 can be used to query the control information of the controls contained in the current application window and, based on the control information, obtain the controls on which a preset operation can be performed as target controls;
The short mark assignment module 320 can be used to assign a short mark to each target control according to a preset rule and to present the corresponding short mark at the position of each target control;
The short mark recognition module 330 can be used to receive voice information and recognize it to obtain the short mark to be responded to contained in the voice information;
The operation simulation module 340 can be used to determine the position of the short mark to be responded to in the current application window as the target position, and to simulate the preset operation at the target position to trigger the corresponding target control.
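The division of work among the modules above can be illustrated with the following Python sketch. It is a simplified model under assumptions, not the device itself: the `Control` record, the trigger type strings, the three-digit numbering rule and all function names are hypothetical, and speech recognition and the actual input simulation are left out.

```python
from dataclasses import dataclass

@dataclass
class Control:
    name: str
    trigger_type: str   # hypothetical values such as "click" or "scroll"
    x: int              # position of the control in the window
    y: int

def detect_target_controls(controls, preset_operation="click"):
    """Detection: keep controls whose trigger type matches the preset operation."""
    return [c for c in controls if c.trigger_type == preset_operation]

def assign_short_marks(targets):
    """Assignment: give each target control a numeric mark in list order."""
    return {f"{i:03d}": control for i, control in enumerate(targets, start=1)}

def target_position(marks, mark):
    """Simulation input: the position where the preset operation is simulated."""
    control = marks[mark]
    return (control.x, control.y)
```

For example, given a play button and a stop button that accept clicks and a list that only scrolls, the two buttons receive the marks "001" and "002", and a recognized "002" resolves to the position of the stop button.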
The specific details of each module of the voice interaction device have already been described in detail in the corresponding voice interaction method above, and are therefore not repeated here.
It should be noted that although several modules or units of the voice interaction device 300 are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
In addition, an exemplary embodiment of the present invention also provides an electronic device capable of implementing the above method.
Those of ordinary skill in the art will understand that aspects of the present invention can be implemented as a system, a method or a program product. Accordingly, aspects of the present invention may take the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
The electronic device 400 according to this embodiment of the present invention is described below with reference to Fig. 4. The electronic device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 4, the electronic device 400 takes the form of a general-purpose computing device. The components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting the different system components (including the storage unit 420 and the processing unit 410), and a display unit 440.
The storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification. For example, the processing unit 410 can perform steps S110 to S140 as shown in Fig. 1.
The storage unit 420 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 4201 and/or a cache memory unit 4202, and may further include a read-only memory (ROM) unit 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205. Such program modules 4205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 400 may also communicate with one or more external devices 470 (such as a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (such as a router, a modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 450. In addition, the electronic device 400 can communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 460. As shown, the network adapter 460 communicates with the other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the drawings, other hardware and/or software modules can be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to embodiments of the present invention can be embodied in the form of a software product, which can be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a terminal device or a network device) to perform the method according to embodiments of the present invention.
An exemplary embodiment of the present invention also provides a computer-readable storage medium storing a program product capable of implementing the method described above in this specification. In some possible embodiments, aspects of the present invention can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification.
Referring to Fig. 5, a program product 500 for implementing the above method according to an embodiment of the present invention is described. It can take the form of a portable compact disc read-only memory (CD-ROM) containing program code and can run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium can be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus or device.
The program product can use any combination of one or more readable media. A readable medium can be a readable signal medium or a readable storage medium. A readable storage medium can be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal can take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium can also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device.
The program code contained on a readable medium can be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
The program code for performing the operations of the present invention can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are only schematic illustrations of the processing included in the method according to exemplary embodiments of the present invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses or adaptations of the present invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed by the present invention. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present invention indicated by the claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (10)
1. A voice interaction method, characterized in that the method comprises:
Querying the control information of the controls contained in a current application window, and obtaining, based on the control information, the controls on which a preset operation can be performed as target controls;
Assigning a short mark to each target control according to a preset rule, and presenting the corresponding short mark at the position of each target control;
Receiving voice information and recognizing the voice information to obtain the short mark to be responded to contained in the voice information;
Determining the position of the short mark to be responded to in the current application window as a target position, and simulating the preset operation at the target position to trigger the corresponding target control.
2. The voice interaction method according to claim 1, characterized in that obtaining, based on the control information, the controls on which a preset operation can be performed as target controls comprises:
For each control in the current application window, obtaining the trigger operation type contained in the control information of the control and judging whether the trigger operation type is consistent with the preset operation;
If the trigger operation type is consistent with the preset operation, taking the corresponding control as a target control.
3. The voice interaction method according to claim 2, characterized in that the preset operation is a touch click operation and/or a mouse click operation.
4. The voice interaction method according to claim 1, characterized in that assigning a short mark to each target control according to a preset rule comprises:
Assigning a numeric mark, a letter mark or a user-defined mark to each target control in turn according to a preset order.
5. The voice interaction method according to claim 1, characterized in that recognizing the voice information to obtain the short mark to be responded to contained in the voice information comprises:
Performing speech recognition on the voice information to convert the voice information into text information;
Performing a matching operation on the text information to obtain the short mark to be responded to that it contains.
6. The voice interaction method according to claim 5, characterized in that performing speech recognition on the voice information comprises:
Performing speech recognition on the voice information by one or more of a deep neural network model, a hidden Markov model and a Gaussian mixture model.
7. The voice interaction method according to claim 1, characterized in that simulating the preset operation at the target position comprises:
Performing the preset operation at the target position by simulating the action of a manual input device; the manual input device includes a touch screen and/or a mouse.
8. A voice interaction device, characterized in that the device comprises:
A target control detection module, configured to query the control information of the controls contained in a current application window and to obtain, based on the control information, the controls on which a preset operation can be performed as target controls;
A short mark assignment module, configured to assign a short mark to each target control according to a preset rule and to present the corresponding short mark at the position of each target control;
A short mark recognition module, configured to receive voice information and recognize the voice information to obtain the short mark to be responded to contained in the voice information;
An operation simulation module, configured to determine the position of the short mark to be responded to in the current application window as a target position and to simulate the preset operation at the target position to trigger the corresponding target control.
9. An electronic device, characterized by comprising:
A processor; and
A memory storing computer-readable instructions which, when executed by the processor, implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811098577.5A CN109448727A (en) | 2018-09-20 | 2018-09-20 | Voice interactive method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109448727A true CN109448727A (en) | 2019-03-08 |
Family
ID=65533136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811098577.5A Pending CN109448727A (en) | 2018-09-20 | 2018-09-20 | Voice interactive method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109448727A (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030125956A1 (en) * | 1999-07-13 | 2003-07-03 | James R. Lewis | Speech enabling labeless controls in an existing graphical user interface |
US20040230637A1 (en) * | 2003-04-29 | 2004-11-18 | Microsoft Corporation | Application controls for speech enabled recognition |
CN107608586A (en) * | 2012-06-05 | 2018-01-19 | 苹果公司 | Phonetic order during navigation |
CN104076916A (en) * | 2013-03-29 | 2014-10-01 | 联想(北京)有限公司 | Information processing method and electronic device |
CN104182124A (en) * | 2014-08-25 | 2014-12-03 | 广东欧珀移动通信有限公司 | Operating method and device of mobile terminal |
CN105161106A (en) * | 2015-08-20 | 2015-12-16 | 深圳Tcl数字技术有限公司 | Voice control method of intelligent terminal, voice control device and television system |
CN108279839A (en) * | 2017-01-05 | 2018-07-13 | 阿里巴巴集团控股有限公司 | Voice-based exchange method, device, electronic equipment and operating system |
US20190317725A1 (en) * | 2017-01-05 | 2019-10-17 | Alibaba Group Holding Limited | Speech-based interaction with a display window |
CN107147776A (en) * | 2017-04-14 | 2017-09-08 | 努比亚技术有限公司 | The implementation method and mobile terminal of a kind of Voice command |
CN107507615A (en) * | 2017-08-29 | 2017-12-22 | 百度在线网络技术(北京)有限公司 | Interface intelligent interaction control method, device, system and storage medium |
CN107656674A (en) * | 2017-09-26 | 2018-02-02 | 网易(杭州)网络有限公司 | Information interacting method, device, electronic equipment and storage medium |
CN107948698A (en) * | 2017-12-14 | 2018-04-20 | 深圳市雷鸟信息科技有限公司 | Sound control method, system and the smart television of smart television |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112346695A (en) * | 2019-08-09 | 2021-02-09 | 华为技术有限公司 | Method for controlling equipment through voice and electronic equipment |
CN115145529A (en) * | 2019-08-09 | 2022-10-04 | 华为技术有限公司 | Method for controlling equipment through voice and electronic equipment |
CN112309388A (en) * | 2020-03-02 | 2021-02-02 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing information |
CN112102832A (en) * | 2020-09-18 | 2020-12-18 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
CN112102832B (en) * | 2020-09-18 | 2021-12-28 | 广州小鹏汽车科技有限公司 | Speech recognition method, speech recognition device, server and computer-readable storage medium |
CN112634896A (en) * | 2020-12-30 | 2021-04-09 | 智道网联科技(北京)有限公司 | Operation method of application program on intelligent terminal and intelligent terminal |
CN115048161A (en) * | 2021-02-26 | 2022-09-13 | 华为技术有限公司 | Application control method, electronic device, apparatus, and medium |
CN113742223A (en) * | 2021-08-23 | 2021-12-03 | 北京鲸鲮信息系统技术有限公司 | Method and device for identifying control, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||