CN109448727A - Voice interactive method and device - Google Patents

Voice interactive method and device

Info

Publication number
CN109448727A
CN109448727A CN201811098577.5A CN201811098577A CN109448727A CN 109448727 A CN109448727 A CN 109448727A CN 201811098577 A CN201811098577 A CN 201811098577A CN 109448727 A CN109448727 A CN 109448727A
Authority
CN
China
Prior art keywords
voice
control
short mark
voice messaging
target widget
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811098577.5A
Other languages
Chinese (zh)
Inventor
李庆湧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201811098577.5A
Publication of CN109448727A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The present invention relates to a voice interaction method and device, electronic equipment, and a storage medium. The method comprises: querying the control information of the controls contained in the current application window and, based on the control information, taking the controls on which a preset operation can be performed as target controls; assigning a short mark to each target control according to a preset rule, and presenting the corresponding short mark at the position of each target control; receiving voice information and recognizing it to obtain the short mark to be responded to that is contained in the voice information; and determining the position of that short mark in the current application window as the target position, and simulating the preset operation at the target position to trigger the corresponding target control. The invention can broaden the range of applications in which voice interaction can be used.

Description

Voice interactive method and device
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a voice interaction method and device, electronic equipment, and a computer-readable storage medium.
Background
Voice is the mode of communication that humans are most accustomed to. Compared with other means of communication, it is not only more natural but also has advantages such as low cognitive load, low resource usage, and high interaction efficiency. As a powerful control entry point that can be used anywhere, voice is already widely applied in electronic devices such as personal computers and communication terminals. By inputting voice, a user can perform operations on the electronic device such as querying, searching, and making phone calls, which is convenient for the user.
Existing voice interaction approaches usually require each application that supports voice operation on the electronic device to be customized. The voice interaction process includes the following steps: after a customized application displays a window that supports voice operation, it registers the set of voice commands supported by that window with the voice service provided by the operating system; when the voice service receives voice information input by the user and detects that it matches a voice command in the command set registered by the application, it converts the voice information into a corresponding control instruction and sends it to the corresponding window of the application, and the application responds through pre-customized code.
However, on the one hand, developing a customized voice interaction function for every window of every application would greatly increase the workload of developers; on the other hand, the many applications for which no voice interaction function has been custom-developed cannot be operated by voice at all, which hinders the adoption of voice interaction on electronic devices.
On the subject of voice interaction, several patent applications in the prior art have made worthwhile attempts, for example:
The patent application with application number CN201410634017.2 discloses a software operation method and system based on voice interaction. The software and a voice assistant run independently; the voice assistant obtains the execution item information of the software's operations and matches the speech recognition result against the obtained execution item information; for the matched execution item information, the software then executes the operation according to the execution item element information, the execution item status information, and the voice information. This method and system operate on the software according to its real-time execution item information, which makes voice software genuinely more intelligent. In addition, because the voice assistant runs separately from the software, one voice assistant can be used with multiple pieces of software, which greatly saves system resources. However, each application contains a large number of different operation instructions, and the operation instructions of different applications differ even more widely, so the demands placed on the intelligence of the voice assistant are very high.
The patent application with application number CN201110081146.X discloses a speech recognition and interaction system that can be widely used in terminal devices such as PCs, mobile phones, and household appliances. The system comprises five parts: an interaction designer, an interaction executor, a platform abstraction library, interaction plug-ins, and a platform API core library. The interaction designer introduces a new interaction design method in which an entire interaction can be designed through intuitive operations; the interaction executor interprets and executes interaction scripts; the interaction plug-ins extend the functions of the existing interaction platform abstraction library and add some special applications; the platform abstraction library provides multi-platform portability and independence from any specific platform; and the platform API core library wraps the operating system API of a specific platform so that it can be conveniently called by the platform abstraction library. However, this solution can implement only a small number of operation instructions and is difficult to apply to applications with many operations.
The patent application with application number CN201610736268.0 discloses a control method and system based on voice interaction. The method starts the voice interaction system with a wake-up signal; the voice interaction system listens for voice information in real time, converts the voice information it hears into text information, and analyzes the converted text. By comparison with the function parameters pre-stored in the system, it judges whether the function parameters of the text information corresponding to the voice information are complete. If they are complete, the corresponding operation is executed; if they are incomplete, the user is prompted, according to the missing function parameters, to reply with the operation to be executed, so that the system can be operated in real time by voice. With this control method and system, different functions, or different execution parameters of the same function, can be selected at any time to meet different user needs. However, this solution has a problem similar to that of CN201410634017.2.
Accordingly, there is a need for a voice interaction method that is more adaptable and recognizes speech more quickly and accurately, and that can solve at least one or more of the above technical problems.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present invention is to provide a voice interaction method and device, electronic equipment, and a computer-readable storage medium that overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art.
According to one aspect of the present invention, a voice interaction method is provided, the method comprising:
querying the control information of the controls contained in the current application window and, based on the control information, taking the controls on which a preset operation can be performed as target controls;
assigning a short mark to each target control according to a preset rule, and presenting the corresponding short mark at the position of each target control;
receiving voice information and recognizing the voice information to obtain the short mark to be responded to that is contained in the voice information; and
determining the position of the short mark to be responded to in the current application window as the target position, and simulating the preset operation at the target position to trigger the corresponding target control.
In an exemplary embodiment of the present invention, taking the controls on which a preset operation can be performed as target controls based on the control information comprises:
for each control in the current application window, obtaining the trigger operation type contained in the control information of the control and judging whether the trigger operation type is consistent with the preset operation; and
if the trigger operation type is consistent with the preset operation, taking the corresponding control as a target control.
In an exemplary embodiment of the present invention, the preset operation is a touch tap operation and/or a mouse click operation.
In an exemplary embodiment of the present invention, assigning a short mark to each target control according to a preset rule comprises:
assigning, in a preset order, a numeric mark, a letter mark, or a custom mark to each target control in turn.
In an exemplary embodiment of the present invention, recognizing the voice information to obtain the short mark to be responded to that is contained in the voice information comprises:
performing speech recognition on the voice information to convert the voice information into text information; and
performing a matching operation on the text information to obtain the short mark to be responded to that it contains.
In an exemplary embodiment of the present invention, performing speech recognition on the voice information comprises:
performing speech recognition on the voice information using one or more of a deep neural network model, a hidden Markov model, and a Gaussian mixture model.
In an exemplary embodiment of the present invention, simulating the preset operation at the target position comprises:
executing the preset operation at the target position by simulating the action of a manual input device, the manual input device comprising a touch screen and/or a mouse.
According to one aspect of the present invention, a voice interaction device is provided, the device comprising:
a target control detection module, configured to query the control information of the controls contained in the current application window and, based on the control information, take the controls on which a preset operation can be performed as target controls;
a short mark assignment module, configured to assign a short mark to each target control according to a preset rule and to present the corresponding short mark at the position of each target control;
a short mark recognition module, configured to receive voice information and recognize the voice information to obtain the short mark to be responded to that is contained in the voice information; and
an operation simulation module, configured to determine the position of the short mark to be responded to in the current application window as the target position, and to simulate the preset operation at the target position to trigger the corresponding target control.
According to one aspect of the present invention, electronic equipment is provided, comprising:
a processor; and
a memory storing computer-readable instructions that, when executed by the processor, implement the method according to any one of the above.
According to one aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the method according to any one of the above.
In the voice interaction method of the exemplary embodiments of the present invention, short marks are first assigned to the target controls; next, the short mark to be responded to that is contained in the received voice information is recognized; finally, the position of that short mark in the current application window is determined as the target position, and the preset operation is simulated at the target position to trigger the corresponding target control. On the one hand, voice support is provided at the system level rather than for a particular application or a particular window of an application; and because the preset operation is executed by simulation, the application can respond to it using its normal processing logic. Based on these two points, developers do not need to make any voice-related adaptations to program code, which provides full support for voice interaction while reducing developers' workload. On the other hand, the invention uniformly assigns distinct preset short marks to the target controls, so the voice information a user may utter can be anticipated; this not only greatly reduces the feature library required for speech recognition but also helps ensure recognition accuracy. The invention can therefore further promote the adoption of voice interaction on electronic devices.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The above and other features and advantages of the present invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a voice interaction method according to an exemplary embodiment of the present invention;
Fig. 2 shows a schematic diagram of short mark positions according to an exemplary embodiment of the present invention;
Fig. 3 shows a schematic block diagram of a voice interaction device according to an exemplary embodiment of the present invention;
Fig. 4 schematically shows a block diagram of electronic equipment according to an exemplary embodiment of the present invention; and
Fig. 5 schematically shows a computer-readable storage medium according to an exemplary embodiment of the present invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in a variety of forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. Identical reference numerals in the drawings denote identical or similar parts, and their repeated description is therefore omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a thorough understanding of the embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of these specific details, or with other methods, components, materials, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or in different networks and/or processor devices and/or microcontroller devices.
This exemplary embodiment first provides a voice interaction method, which can be applied to electronic devices such as computers or mobile terminals. Referring to Fig. 1, the voice interaction method may include the following steps:
Step S110: query the control information of the controls contained in the current application window and, based on the control information, take the controls on which a preset operation can be performed as target controls;
Step S120: assign a short mark to each target control according to a preset rule, and present the corresponding short mark at the position of each target control;
Step S130: receive voice information and recognize the voice information to obtain the short mark to be responded to that is contained in the voice information;
Step S140: determine the position of the short mark to be responded to in the current application window as the target position, and simulate the preset operation at the target position to trigger the corresponding target control.
With the voice interaction method of this example embodiment, on the one hand, voice support is provided at the system level rather than for a particular application or a particular window of an application; and because the preset operation is executed by simulation, the application can respond to it using its normal processing logic. Based on these two points, developers do not need to make any voice-related adaptations to program code, which provides full support for voice interaction while reducing developers' workload. On the other hand, the invention uniformly assigns distinct preset short marks to the target controls, so the voice information a user may utter can be anticipated; this not only greatly reduces the feature library required for speech recognition but also helps ensure recognition accuracy. The invention can therefore further promote the adoption of voice interaction on electronic devices.
The voice interaction method of this example embodiment is described in further detail below.
In step S110, the control information of the controls contained in the current application window is queried and, based on the control information, the controls on which a preset operation can be performed are taken as target controls.
In this example embodiment, the current application window may be the window of the application running in the foreground. Taking the Android operating system as an example, when an application starts, a task stack holding multiple elements can be initialized to store the current page component (activity) and historical page components (activities). The current page component is a component of the application running in the foreground, or a component of another application activated by a component of the foreground application; page components other than the current one are components of applications running in the background. Of course, in other exemplary embodiments of the invention, the current application window may also include system windows, such as the system desktop window; this exemplary embodiment places no particular limitation on this.
Taking a mobile terminal device as an example, the controls contained in the current application window may include, for example, buttons, hyperlinks, lists, and text input boxes. Moreover, the operating system of a mobile terminal device usually provides a window management service, with which applications and system programs interact in order to display window content on the screen. Therefore, when implementing this technical solution, the window management service can be monitored to obtain the controls contained in the current application window and the control information of those controls. Specifically, when an application wants a window to be presented in the foreground, it sends a request to the window management service, which triggers the window management service to call the corresponding window display processing function. Monitoring code can therefore be added to that function to obtain the controls contained in the window that the application wants presented in the foreground, together with their control information.
For each control in the current application window, the corresponding control information may include the control type, the control name, the trigger operation type of the control, the layout information of the control, the control ID, and so on. In this example embodiment, the trigger operation type in the control information of each control can be obtained, and it can then be judged whether the trigger operation type is consistent with the preset operation. In this example embodiment, the preset operation may be a touch tap operation and/or a mouse click operation. For example, suppose that on a mobile terminal device the current application window contains a refresh button, a forwarding button, hyperlink URL 1, and a display control. The trigger operation type of the refresh button, the forwarding button, and hyperlink URL 1 is a touch tap operation, so the trigger operation type of these three controls is judged to be consistent with the preset operation, and the refresh button, the forwarding button, and hyperlink URL 1 are taken as target controls.
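The filtering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the `ControlInfo` record, its field names, and the operation labels `"tap"` and `"mouse_click"` are hypothetical stand-ins for whatever the platform's window management service actually reports.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical record of one control's information; the field names are
# illustrative and not taken from any real UI framework.
@dataclass(frozen=True)
class ControlInfo:
    control_id: str
    control_type: str
    trigger_operation: str             # e.g. "tap", "swipe", "none"
    bounds: Tuple[int, int, int, int]  # (left, top, width, height) in pixels


# Assumed labels for the preset operations (touch tap and/or mouse click).
PRESET_OPERATIONS = frozenset({"tap", "mouse_click"})


def select_target_controls(controls: List[ControlInfo],
                           preset_operations=PRESET_OPERATIONS) -> List[ControlInfo]:
    """Keep only the controls whose trigger operation is a preset operation."""
    return [c for c in controls if c.trigger_operation in preset_operations]


# The window from the example: three tappable controls and one display control.
window_controls = [
    ControlInfo("refresh", "button", "tap", (0, 0, 80, 40)),
    ControlInfo("forwarding", "button", "tap", (90, 0, 80, 40)),
    ControlInfo("url1", "hyperlink", "tap", (0, 60, 200, 20)),
    ControlInfo("display", "display", "none", (0, 100, 300, 200)),
]
targets = select_target_controls(window_controls)
print([c.control_id for c in targets])  # ['refresh', 'forwarding', 'url1']
```

Passing a different `preset_operations` set (for instance one that contains a long-press label) would widen or narrow the set of target controls without changing the rest of the flow.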
Of course, in other exemplary embodiments of the invention, the preset operation may also include other operations such as a slide operation, a long-press operation, or a press operation; correspondingly, the target controls may also include other types of controls. This exemplary embodiment places no particular limitation on this.
In step S120, a short mark is assigned to each target control according to a preset rule, and the corresponding short mark is presented at the position of each target control.
Referring to Fig. 2, the application window shown in Fig. 2 contains target control 201 through target control 213. Taking three-digit short marks as an example, in this example embodiment short marks 001 to 013 can be assigned to target controls 201 through 213, respectively, in ascending order. In other exemplary embodiments of the invention, the short marks may also be of other types, such as letters, letter combinations, or letter-digit combinations; alternatively, the short marks may be user-defined. This example embodiment places no particular limit on the specific length of a short mark, but it should not be too long, preferably no more than 5 characters, for example.
It should be noted that in Fig. 2 the short marks are assigned according to the layout order of the target controls in the application window and the numerical order of the short marks, but the invention is not limited to this. For example, common controls such as the forward button, back button, submit button, and refresh button can be assigned fixed short marks to reduce the user's learning cost. As another example, after the system has automatically assigned short marks to the target controls of an application window, the user can manually adjust the order of the short marks and save it to a configuration file, so that the next time the application window is loaded the short marks are assigned according to the configuration file. Those skilled in the art can therefore adapt the above preset rule to specific requirements; this exemplary embodiment places no particular limitation on this.
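One possible way to combine the three assignment rules mentioned above (user-saved marks, fixed marks for common controls, and sequential numbering in layout order) is sketched below. The function name, the priority order between the rules, and the dictionary-based configuration are assumptions made for illustration; the source leaves the preset rule open.

```python
def assign_short_marks(target_ids, fixed_marks=None, saved_marks=None, width=3):
    """Map each target control id to a unique short mark.

    Assumed priority: a user-saved mark first, then a fixed mark for a
    common control, then the next unused zero-padded number in layout order.
    """
    fixed_marks = fixed_marks or {}
    saved_marks = saved_marks or {}
    assigned = {}
    used = set()

    # First pass: honor saved and fixed marks, skipping any duplicates.
    for cid in target_ids:
        mark = saved_marks.get(cid) or fixed_marks.get(cid)
        if mark is not None and mark not in used:
            assigned[cid] = mark
            used.add(mark)

    # Second pass: number the remaining controls sequentially.
    counter = 1
    for cid in target_ids:
        if cid in assigned:
            continue
        while f"{counter:0{width}d}" in used:
            counter += 1
        mark = f"{counter:0{width}d}"
        assigned[cid] = mark
        used.add(mark)
    return assigned


# Thirteen target controls, as in Fig. 2, receive marks 001 to 013.
fig2_marks = assign_short_marks([f"control_{n}" for n in range(201, 214)])
print(fig2_marks["control_201"], fig2_marks["control_213"])  # 001 013

# A common control can keep a fixed mark across windows.
mixed = assign_short_marks(["back", "forward", "refresh", "url1"],
                           fixed_marks={"refresh": "000"})
print(mixed)
```

In a fuller version, `saved_marks` would be loaded from the per-window configuration file that the text describes; the file format is not specified in the source.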
The control information obtained in step S110 usually includes the layout information of each control, such as the control's display position on the screen and its size. Therefore, after a short mark has been assigned to each target control, the corresponding short mark can be presented at the position of each target control according to its layout information, for example uniformly at the center, upper-left corner, or upper-right corner of the target control. In this way, the user can see at a glance which short mark corresponds to each target control. Of course, once the user has memorized the short marks of the target controls, some or all of the short marks displayed in the window can be hidden according to user settings, for reasons such as page appearance; this also falls within the scope of protection of the present invention.
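As a small illustration of deriving a display position from layout information, the sketch below computes the point at which a short mark would be drawn. It assumes the layout is given as a `(left, top, width, height)` rectangle in screen pixels; the function name and the anchor labels are hypothetical.

```python
def mark_position(bounds, anchor="center"):
    """Return the (x, y) point at which to draw a control's short mark.

    bounds is assumed to be (left, top, width, height) in screen pixels.
    """
    left, top, width, height = bounds
    if anchor == "center":
        return (left + width // 2, top + height // 2)
    if anchor == "top_left":
        return (left, top)
    if anchor == "top_right":
        return (left + width, top)
    raise ValueError(f"unknown anchor: {anchor}")


print(mark_position((10, 20, 100, 40)))               # (60, 40)
print(mark_position((10, 20, 100, 40), "top_right"))  # (110, 20)
```

Using the same anchor for every control keeps the marks visually uniform, and the returned point can later serve as the target position when the mark is spoken.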
In step S130, voice information is received and recognized to obtain the short mark to be responded to that is contained in the voice information.
In this example embodiment, the user can input voice information through a voice input device such as a microphone, and the mobile terminal device then receives the corresponding voice information. After the voice information is received, speech recognition can be performed on it to convert the voice information into text information.
In this example embodiment, speech recognition can be performed on the voice information using one or more of a deep neural network model, a hidden Markov model, and a Gaussian mixture model to obtain the corresponding text information. For example, temporal information can be modeled with a hidden Markov model; given a state of the hidden Markov model, the probability distribution of the speech feature vectors belonging to that state can be modeled with a Gaussian mixture model, using methods such as the expectation-maximization algorithm. Once the models have been built, speech recognition can be performed on the voice information to obtain the corresponding text information. Of course, in other exemplary embodiments of the invention, speech recognition can also be performed by incorporating contextual information (Context Dependent) or in other ways; this exemplary embodiment places no particular limitation on this.
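For reference, the per-state Gaussian mixture mentioned above is conventionally written as follows. This is the standard textbook form of the emission density in an HMM-GMM acoustic model, not a formula given in the source; the symbols are notation introduced here: state index j, feature vector o_t at frame t, M mixture components with weights c_jm, means mu_jm, and covariances Sigma_jm.

```latex
b_j(\mathbf{o}_t) \;=\; \sum_{m=1}^{M} c_{jm}\,
  \mathcal{N}\!\left(\mathbf{o}_t;\ \boldsymbol{\mu}_{jm},\ \boldsymbol{\Sigma}_{jm}\right),
\qquad
\sum_{m=1}^{M} c_{jm} = 1, \quad c_{jm} \ge 0 .
```

The expectation-maximization algorithm is the usual way to estimate the weights, means, and covariances from training data.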
After obtaining above-mentioned text information, matching operation can be carried out to the text information, obtain include wait ring The short mark answered.For example, user is read aloud following sentence " 007 " by microphone, to reception in this example embodiment After the voice messaging arrived carries out speech recognition, text information " 007 " can be identified, based on the short identification record table pair prestored " 007 " is matched, then available short mark " 007 ".In another example user, which reads aloud following sentence by microphone, " executes 007 Plan " can identify text information in this example embodiment after carrying out speech recognition to the voice messaging received " executing 007 plan " matches " executing 007 plan " based on the short identification record table prestored, then available short mark "007".Correspondingly, if not carrying out being matched to short mark in the text information based on the short identification record table prestored, it can To prompt user to re-enter voice messaging.In addition, may also require that user exists in other exemplary embodiment of the present invention The beginning of voice messaging increases wake-up word and does not do particular determination in the present exemplary embodiment to this to reduce maloperation etc..
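The matching operation can be sketched as follows. This is only one possible approach, assuming the short mark record table is available as a collection of strings and that a plain substring search over the recognized text is sufficient; the function name find_short_mark is hypothetical.

```python
import re
from typing import Iterable, Optional


def find_short_mark(text: str, known_marks: Iterable[str]) -> Optional[str]:
    """Return the leftmost known short mark found in the recognized text."""
    # Longer marks are listed first so that, when both "007" and "0071"
    # are known, the text "0071" is matched as "0071" rather than "007".
    marks = sorted(set(known_marks), key=len, reverse=True)
    if not marks:
        return None
    pattern = re.compile("|".join(re.escape(m) for m in marks))
    match = pattern.search(text)
    # None signals that the user should be prompted to speak again.
    return match.group(0) if match else None
```

A result of None corresponds to the case in which no short mark is matched and the user is prompted to input the voice information again.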
Furthermore, because the present invention uniformly assigns distinct preset short marks to the target widgets, the voice information that the user will utter can be anticipated during speech recognition. As a result, not only can the feature library required for speech recognition be greatly reduced, but the accuracy of speech recognition can also be ensured.
In step S140, the position of the short mark to be responded to in the current application window is determined as the target position, and the preset operation is simulated at the target position to trigger the corresponding target widget.
In this example embodiment, the preset operation can be executed at the target position by simulating the action of a manual input device; the manual input device includes a touch screen and/or a mouse. Taking the Android operating system as an example, the sendevent command provided by Android can be used to send touch events to the device node corresponding to the touch screen. For example, the following commands can be sent to the device node: first, a sendevent command that specifies the touch position on the screen and carries the coordinate values of the target position; then the sendevent commands corresponding to a click in a touch operation (that is, the usual down action followed by an up action). Sending these commands fully simulates one touch click at the touch position. After the operating system distributes the received touch event to the window management service, the window management service converts the received click into a touch event for the target widget according to the target position of the click, and sends that touch event to the current application window, thereby triggering the window to perform the corresponding processing. Of course, different types of operation can be simulated in different ways, which is not specifically limited in this exemplary embodiment.
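The command sequence described above might look like the following sketch, which only builds the command strings and does not execute them. The numeric event types and codes are the standard Linux input event constants. The device node path /dev/input/event1 and the helper name tap_commands are assumptions for illustration; the actual node, the coordinate range, and any additional events a given touch screen driver expects (for example tracking IDs) vary between devices.

```python
from typing import List

# Linux input event types and codes (decimal, as sendevent expects).
EV_SYN, EV_KEY, EV_ABS = 0, 1, 3
SYN_REPORT = 0
BTN_TOUCH = 330           # 0x14a
ABS_MT_POSITION_X = 53    # 0x35
ABS_MT_POSITION_Y = 54    # 0x36


def tap_commands(x: int, y: int, device: str = "/dev/input/event1") -> List[str]:
    """Build sendevent commands for one simulated click at (x, y).

    The device path is a placeholder; it differs from device to device.
    """
    events = [
        (EV_ABS, ABS_MT_POSITION_X, x),   # touch position, X coordinate
        (EV_ABS, ABS_MT_POSITION_Y, y),   # touch position, Y coordinate
        (EV_KEY, BTN_TOUCH, 1),           # "down" action
        (EV_SYN, SYN_REPORT, 0),          # flush the down event
        (EV_KEY, BTN_TOUCH, 0),           # "up" action
        (EV_SYN, SYN_REPORT, 0),          # flush the up event
    ]
    return [f"sendevent {device} {t} {c} {v}" for t, c, v in events]
```

Each returned string would be run in a shell on the device in order; the position events come first, followed by the down and up actions, matching the order given in the paragraph above.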
Because the present invention simulates execution of the preset operation, the application can respond with its usual processing logic for the preset operation. Developers therefore do not need to do any voice-specific adaptation of the program code. This provides complete support for the voice interaction mode while reducing developers' workload, and can promote the adoption of voice interaction on electronic devices.
It should be noted that although the steps of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps shown must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
In addition, this example embodiment further provides a voice interaction device. Referring to Fig. 3, the voice interaction device 300 may include a target widget detection module 310, a short mark allocation module 320, a short mark recognition module 330, and an operation simulation module 340. Specifically:
The target widget detection module 310 can be used to query the control information of the controls included in the current application window, and to obtain, based on the control information, the controls on which the preset operation can be performed as target widgets;
The short mark allocation module 320 can be used to assign a short mark to each target widget according to a preset rule, and to present the corresponding short mark at the position of each target widget;
The short mark recognition module 330 can be used to receive voice information and recognize the voice information to obtain the short mark to be responded to that is contained in the voice information;
The operation simulation module 340 can be used to determine the position of the short mark to be responded to in the current application window as the target position, and to simulate execution of the preset operation at the target position to trigger the corresponding target widget.
The specific details of each module of the voice interaction device have already been described in detail in the corresponding voice interaction method above, and are therefore not repeated here.
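How the four modules cooperate can be sketched as below. This is an illustrative outline rather than the disclosed implementation: it assumes that speech recognition has already produced text, that short marks are sequential numbers, and that each short mark is presented at the center of its control. The names Control, detect_targets, assign_short_marks, recognize_short_mark, and handle_utterance are hypothetical, and tap stands in for whatever mechanism simulates the preset operation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Control:
    name: str
    trigger: str               # trigger operation type, e.g. "click"
    center: Tuple[int, int]    # where the short mark is presented


def detect_targets(controls: List[Control], preset: str = "click") -> List[Control]:
    """Module 310: keep controls whose trigger type matches the preset operation."""
    return [c for c in controls if c.trigger == preset]


def assign_short_marks(targets: List[Control]) -> Dict[str, Control]:
    """Module 320: assign numeric short marks "1", "2", ... in list order."""
    return {str(i): c for i, c in enumerate(targets, start=1)}


def recognize_short_mark(text: str, marks: Dict[str, Control]) -> Optional[str]:
    """Module 330: find a known short mark in the recognized text."""
    for mark in sorted(marks, key=len, reverse=True):  # longest marks first
        if mark in text:
            return mark
    return None


def handle_utterance(text: str, controls: List[Control],
                     tap: Callable[[int, int], None]) -> bool:
    """Module 340: simulate the preset operation at the short mark's position."""
    marks = assign_short_marks(detect_targets(controls))
    mark = recognize_short_mark(text, marks)
    if mark is None:
        return False           # caller may prompt the user to speak again
    tap(*marks[mark].center)
    return True
```

Passing tap in as a parameter keeps the sketch independent of any particular platform, since the description leaves the simulation mechanism open.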
It should be noted that although several modules or units of the voice interaction device 300 are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by multiple modules or units.
In addition, in an exemplary embodiment of the present invention, an electronic device capable of implementing the above method is further provided.
Those of ordinary skill in the art will understand that aspects of the present invention may be implemented as a system, a method, or a program product. Accordingly, aspects of the present invention may take the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
An electronic device 400 according to this embodiment of the present invention is described below with reference to Fig. 4. The electronic device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 4, the electronic device 400 takes the form of a general-purpose computing device. Its components may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting the different system components (including the storage unit 420 and the processing unit 410), and a display unit 440.
The storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Methods" section above. For example, the processing unit 410 can perform steps S110 to S140 as shown in Fig. 1.
The storage unit 420 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 4201 and/or a cache memory unit 4202, and may further include a read-only memory unit (ROM) 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205. Such program modules 4205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 400 may also communicate with one or more external devices 470 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (such as a router or a modem) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 450. In addition, the electronic device 400 can communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460. As shown in the figure, the network adapter 460 communicates with the other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to the embodiments of the present invention.
In an exemplary embodiment of the present invention, a computer-readable storage medium is further provided, on which is stored a program product capable of implementing the method described above in this specification. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product that includes program code; when the program product is run on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Methods" section above.
Referring to Fig. 5, a program product 500 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium can be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal can take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the method according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed by the present invention. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the invention being indicated by the claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (10)

1. A voice interaction method, characterized in that the method comprises:
querying control information of the controls included in a current application window, and obtaining, based on the control information, the controls on which a preset operation can be performed as target widgets;
assigning a short mark to each target widget according to a preset rule, and presenting the corresponding short mark at the position of each target widget;
receiving voice information and recognizing the voice information to obtain the short mark to be responded to that is contained in the voice information;
determining the position of the short mark to be responded to in the current application window as a target position, and simulating execution of the preset operation at the target position to trigger the corresponding target widget.
2. The voice interaction method according to claim 1, characterized in that obtaining, based on the control information, the controls on which the preset operation can be performed as target widgets comprises:
for each control in the current application window, obtaining the trigger operation type included in the control information of the control, and determining whether the trigger operation type is consistent with the preset operation;
if the trigger operation type is consistent with the preset operation, taking the corresponding control as a target widget.
3. The voice interaction method according to claim 2, characterized in that the preset operation is a touch click operation and/or a mouse click operation.
4. The voice interaction method according to claim 1, characterized in that assigning a short mark to each target widget according to a preset rule comprises:
assigning, in a preset order, a numeric mark, a letter mark, or a user-defined mark to each target widget in turn.
5. The voice interaction method according to claim 1, characterized in that recognizing the voice information to obtain the short mark to be responded to that is contained in the voice information comprises:
performing speech recognition on the voice information to convert the voice information into text information;
performing a matching operation on the text information to obtain the short mark to be responded to that it contains.
6. The voice interaction method according to claim 5, characterized in that performing speech recognition on the voice information comprises:
performing speech recognition on the voice information by one or more of a deep neural network model, a hidden Markov model, and a Gaussian mixture model.
7. The voice interaction method according to claim 1, characterized in that simulating execution of the preset operation at the target position comprises:
executing the preset operation at the target position by simulating the action of a manual input device, the manual input device including a touch screen and/or a mouse.
8. A voice interaction device, characterized in that the device comprises:
a target widget detection module, configured to query control information of the controls included in a current application window, and to obtain, based on the control information, the controls on which a preset operation can be performed as target widgets;
a short mark allocation module, configured to assign a short mark to each target widget according to a preset rule, and to present the corresponding short mark at the position of each target widget;
a short mark recognition module, configured to receive voice information and recognize the voice information to obtain the short mark to be responded to that is contained in the voice information;
an operation simulation module, configured to determine the position of the short mark to be responded to in the current application window as a target position, and to simulate execution of the preset operation at the target position to trigger the corresponding target widget.
9. An electronic device, characterized by comprising:
a processor; and
a memory storing computer-readable instructions which, when executed by the processor, implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN201811098577.5A 2018-09-20 2018-09-20 Voice interactive method and device Pending CN109448727A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811098577.5A CN109448727A (en) 2018-09-20 2018-09-20 Voice interactive method and device

Publications (1)

Publication Number Publication Date
CN109448727A true CN109448727A (en) 2019-03-08

Family

ID=65533136

Country Status (1)

Country Link
CN (1) CN109448727A (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030125956A1 (en) * 1999-07-13 2003-07-03 James R. Lewis Speech enabling labeless controls in an existing graphical user interface
US20040230637A1 (en) * 2003-04-29 2004-11-18 Microsoft Corporation Application controls for speech enabled recognition
CN107608586A (en) * 2012-06-05 2018-01-19 苹果公司 Phonetic order during navigation
CN104076916A (en) * 2013-03-29 2014-10-01 联想(北京)有限公司 Information processing method and electronic device
CN104182124A (en) * 2014-08-25 2014-12-03 广东欧珀移动通信有限公司 Operating method and device of mobile terminal
CN105161106A (en) * 2015-08-20 2015-12-16 深圳Tcl数字技术有限公司 Voice control method of intelligent terminal, voice control device and television system
CN108279839A (en) * 2017-01-05 2018-07-13 阿里巴巴集团控股有限公司 Voice-based exchange method, device, electronic equipment and operating system
US20190317725A1 (en) * 2017-01-05 2019-10-17 Alibaba Group Holding Limited Speech-based interaction with a display window
CN107147776A (en) * 2017-04-14 2017-09-08 努比亚技术有限公司 The implementation method and mobile terminal of a kind of Voice command
CN107507615A (en) * 2017-08-29 2017-12-22 百度在线网络技术(北京)有限公司 Interface intelligent interaction control method, device, system and storage medium
CN107656674A (en) * 2017-09-26 2018-02-02 网易(杭州)网络有限公司 Information interacting method, device, electronic equipment and storage medium
CN107948698A (en) * 2017-12-14 2018-04-20 深圳市雷鸟信息科技有限公司 Sound control method, system and the smart television of smart television

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112346695A (en) * 2019-08-09 2021-02-09 华为技术有限公司 Method for controlling equipment through voice and electronic equipment
CN115145529A (en) * 2019-08-09 2022-10-04 华为技术有限公司 Method for controlling equipment through voice and electronic equipment
CN112309388A (en) * 2020-03-02 2021-02-02 北京字节跳动网络技术有限公司 Method and apparatus for processing information
CN112102832A (en) * 2020-09-18 2020-12-18 广州小鹏汽车科技有限公司 Speech recognition method, speech recognition device, server and computer-readable storage medium
CN112102832B (en) * 2020-09-18 2021-12-28 广州小鹏汽车科技有限公司 Speech recognition method, speech recognition device, server and computer-readable storage medium
CN112634896A (en) * 2020-12-30 2021-04-09 智道网联科技(北京)有限公司 Operation method of application program on intelligent terminal and intelligent terminal
CN115048161A (en) * 2021-02-26 2022-09-13 华为技术有限公司 Application control method, electronic device, apparatus, and medium
CN113742223A (en) * 2021-08-23 2021-12-03 北京鲸鲮信息系统技术有限公司 Method and device for identifying control, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination