CN113053384A - APP voice control method and system and computer equipment - Google Patents


Info

Publication number
CN113053384A
CN113053384A (application CN202110426130.1A)
Authority
CN
China
Prior art keywords: app, sliding, user, current, voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110426130.1A
Other languages
Chinese (zh)
Inventor
瞿辩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
May 8 Home Co ltd
Original Assignee
May 8 Home Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by May 8 Home Co ltd
Priority to CN202110426130.1A
Publication of CN113053384A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention belongs to the technical field of voice control and provides an APP voice control method, an APP voice control system, and computer equipment. The method comprises the following steps: establishing a user operation database for recording historical operation data generated while the user operates the APP by finger; when the user opens the APP at a client, the APP automatically starting a voice input function; receiving the user's voice input through the APP and recognizing it to convert it into a current operation instruction; judging whether the operation to be executed by the current operation instruction is a specific operation; and, when the operation to be executed is judged to be a specific operation, calculating the parameters of the current operation instruction from the historical operation data recorded in the user operation database and automatically executing the corresponding operation. The method makes the APP more intelligent and automated, broadens the population of users the APP can serve, simplifies operation, and optimizes APP voice control.

Description

APP voice control method and system and computer equipment
Technical Field
The invention belongs to the technical field of voice control, is particularly suitable for the field of housekeeping services, and more particularly relates to an APP voice control method, an APP voice control system and computer equipment.
Background
In recent years, with an aging population and the era of the two-child policy, demand for home services keeps growing and the variety of home services keeps expanding; users mainly find home-service staff through offline channels or online home-service systems. Home services include maternity matron (yuesao) services, caregiver services, and the like.
At present, controlling an application program (APP) by voice is commonplace. However, existing home-service APPs usually require the user to complete the relevant home-service tasks by manual operation. In some home-service scenarios the user's hands are not free, so relying on manual operation alone limits the population that can use the APP and thus its applicability.
In addition, in existing voice control methods, the user must first trigger a voice wake-up before the voice control function starts. This wake-up step is cumbersome, and the user may even need to perform manual operations before voice control becomes available, which cannot satisfy certain scenarios or certain users. Moreover, a corresponding text description has to be customized for each control event of the application so that the recognition result of the voice signal can be matched against it, which degrades the user experience. There is therefore still much room to improve the applicability, scenario coverage, and operational complexity of APP voice control.
Therefore, it is necessary to provide an APP voice control method to solve the above problems.
Disclosure of Invention
Technical problem to be solved
The invention aims to solve the technical problems that the existing APP voice control method serves a limited population, has high voice-control complexity, and cannot meet the requirements of certain specific scenarios or specific users.
(II) technical scheme
In order to solve the above technical problem, one aspect of the present invention provides an APP voice control method for controlling a client APP to complete operations by voice, the method comprising: establishing a user operation database for recording historical operation data generated while the user operates the APP by finger; when the user opens the APP at a client, the APP automatically starting a voice input function; receiving the user's voice input through the APP and recognizing it to convert it into a current operation instruction; judging whether the operation to be executed by the current operation instruction is a specific operation; and, when the operation to be executed is judged to be a specific operation, calculating the parameters of the current operation instruction from the historical operation data recorded in the user operation database and automatically executing the corresponding operation according to the current operation instruction and its parameters.
According to a preferred embodiment of the present invention, the operation includes a click, a double click, a slide, an input, and a long press, and the specific operation includes a leftward slide, a rightward slide, an upward slide, and a downward slide.
According to a preferred embodiment of the present invention, the historical operation data comprises historical slide operation data; when the current operation instruction is sliding leftwards, sliding rightwards, sliding upwards or sliding downwards, the parameters of the current operation instruction comprise a sliding starting point position and a sliding end point position.
According to a preferred embodiment of the present invention, when the current operation instruction is a leftward slide, a rightward slide, an upward slide, or a downward slide, an average value of the slide start position and the slide end position of the current operation instruction is calculated from a plurality of pieces of historical slide operation data as a parameter of the current operation instruction.
According to the preferred embodiment of the present invention, when the user operation database does not record the historical operation data of the specific operation, the predetermined default parameter is used as the parameter of the current operation instruction, and the corresponding operation is automatically executed according to the current operation instruction and the parameter thereof.
According to a preferred embodiment of the present invention, when the operation to be performed is a leftward swipe or a rightward swipe, the default parameter is a screen width; when the operation to be performed is a swipe up or a swipe down, the default parameter is a screen height.
According to a preferred embodiment of the present invention, calculating the parameter of the current operation instruction according to the historical operation data recorded by the user operation database comprises: training a machine learning model by using the historical operation data, and calculating parameters of a current operation instruction by using the trained machine learning model, wherein the historical operation data further comprises operation environment data; the machine learning model also performs calculation according to the current operating environment data when calculating the parameters of the current operating instructions.
According to a preferred embodiment of the present invention, the operation environment data includes at least one of time of operation, geographical location, and front-back operation instruction.
According to a preferred embodiment of the present invention, said recognizing the voice input comprises: identifying behavior information and target information from the voice input, wherein the behavior information includes click behavior, double-click behavior, slide behavior, input behavior, and long-press behavior, and the target information includes the buttons, options, and input boxes corresponding to the operation behavior.
According to a preferred embodiment of the present invention, further comprising: and when the operation to be executed by the current operation instruction is judged not to be the specific operation, directly and automatically executing the operation.
According to a preferred embodiment of the present invention, the voice input function is realized in cooperation with the Android accessibility service.
A second aspect of the present invention provides an APP voice control system for controlling a client APP to complete operations by voice, the system comprising: an establishing module for establishing a user operation database, the user operation database recording historical operation data generated while the user operates the APP by finger; a function starting module, by which the APP automatically starts the voice input function when the user opens the client APP; a recognition conversion module that receives the user's voice input through the APP and recognizes it to convert it into a current operation instruction; a judging module for judging whether the operation to be executed by the current operation instruction is a specific operation; and an automatic execution module for calculating, when the operation to be executed is judged to be a specific operation, the parameters of the current operation instruction from the historical operation data recorded in the user operation database and automatically executing the corresponding operation according to the current operation instruction and its parameters.
According to a preferred embodiment of the present invention, the operation includes a click, a double click, a slide, an input, and a long press, and the specific operation includes a leftward slide, a rightward slide, an upward slide, and a downward slide.
According to a preferred embodiment of the present invention, the historical operation data comprises historical slide operation data; when the current operation instruction is sliding leftwards, sliding rightwards, sliding upwards or sliding downwards, the parameters of the current operation instruction comprise a sliding starting point position and a sliding end point position.
According to a preferred embodiment of the present invention, the method further comprises a calculating module, wherein the calculating module is configured to calculate an average value of the sliding start position and the sliding end position of the current operation instruction according to a plurality of historical sliding operation data as a parameter of the current operation instruction, when the current operation instruction is sliding leftward, sliding rightward, sliding upward or sliding downward.
According to the preferred embodiment of the present invention, when the user operation database does not record the historical operation data of the specific operation, the predetermined default parameter is used as the parameter of the current operation instruction, and the corresponding operation is automatically executed according to the current operation instruction and the parameter thereof.
According to a preferred embodiment of the present invention, when the operation to be performed is a leftward swipe or a rightward swipe, the default parameter is a screen width; when the operation to be performed is a swipe up or a swipe down, the default parameter is a screen height.
According to the preferred embodiment of the present invention, the system further comprises a model building module, wherein the model building module is configured to train a machine learning model using the historical operating data, and calculate parameters of a current operating instruction using the trained machine learning model, and the historical operating data further comprises operating environment data; the machine learning model also performs calculation according to the current operating environment data when calculating the parameters of the current operating instructions.
According to a preferred embodiment of the present invention, the operation environment data includes at least one of time of operation, geographical location, and front-back operation instruction.
According to a preferred embodiment of the present invention, said recognizing the voice input comprises: identifying behavior information and target information from the voice input, wherein the behavior information includes click behavior, double-click behavior, slide behavior, input behavior, and long-press behavior, and the target information includes the buttons, options, and input boxes corresponding to the operation behavior.
According to a preferred embodiment of the present invention, further comprising: and when the operation to be executed by the current operation instruction is judged not to be the specific operation, directly and automatically executing the operation.
According to a preferred embodiment of the present invention, the voice input function is implemented in cooperation with the Android accessibility service.
A third aspect of the present invention provides a computer device comprising a processor and a memory, the memory storing a computer executable program, and the processor performing the APP voice control method of the present invention when the computer executable program is executed.
A fourth aspect of the present invention provides a computer program product, storing a computer executable program, where the computer executable program, when executed, implements the APP voice control method of the present invention.
(III) advantageous effects
Compared with the prior art, the invention simulates manual operation and calculates the parameters needed for automatic execution, so that the APP, or each function within it, can be controlled and executed automatically. The APP thus becomes more intelligent and automated, the population of users it can serve is broadened, operation is simplified, the complexity of voice control is reduced, and the APP voice control method is optimized.
Drawings
Fig. 1 is a flowchart of an example of an APP voice control method of embodiment 1 of the present invention;
fig. 2 is a flowchart of another example of an APP voice control method of embodiment 1 of the present invention;
fig. 3 is a flowchart of still another example of an APP voice control method of embodiment 1 of the present invention;
fig. 4 is a flowchart of still another example of an APP voice control method of embodiment 1 of the present invention;
fig. 5 is a schematic diagram of an example of an APP voice control system of embodiment 2 of the present invention;
fig. 6 is a schematic diagram of another example of an APP voice control system of embodiment 2 of the present invention;
fig. 7 is a schematic diagram of still another example of an APP voice control system of embodiment 2 of the present invention.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
FIG. 9 is a schematic diagram of a computer program product of an embodiment of the invention.
Detailed Description
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different network and/or processing unit devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and repetitive description of them may therefore be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms; the terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.
In view of the above problems, the present invention provides an APP voice control method that simulates manual operation and calculates the parameters needed for automatic execution, so as to automatically control the APP or automatically execute each function in it. The APP thereby becomes more intelligent and automated, serves a wider population of users, simplifies operation, reduces the complexity of voice control, and optimizes APP voice control.
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
Fig. 1 is a flowchart of an example of an APP voice control method of embodiment 1 of the present invention.
As shown in fig. 1, the APP voice control method includes the following steps:
step S101, establishing a user operation database, wherein the user operation database is used for recording historical operation data during the period that the user uses the APP through finger operation.
Step S102, when a user opens the client APP, the APP automatically starts a voice input function.
And step S103, receiving voice input of a user through the APP, and recognizing the input voice to convert the input voice into a current operation instruction.
Step S104, determining whether the operation to be executed by the current operation instruction is a specific operation.
And step S105, when the operation to be executed is judged to be a specific operation, calculating the parameter of the current operation instruction according to the historical operation data recorded by the user operation database, and automatically executing the corresponding operation according to the current operation instruction and the parameter thereof.
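The decision logic of steps S103 to S105 can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function and variable names are hypothetical, and the history format (per-operation lists of ((start), (end)) coordinate pairs) is an assumed representation.

```python
# Hypothetical sketch of steps S104-S105: dispatch a recognized operation
# instruction, computing swipe parameters from history when available.
SPECIFIC_OPS = {"slide_left", "slide_right", "slide_up", "slide_down"}

def handle_instruction(instruction, history, defaults):
    """Decide how a recognized operation instruction is executed."""
    op = instruction["behavior"]
    if op not in SPECIFIC_OPS:
        # Clicks, double clicks, inputs and long presses are executed
        # directly, without any parameter calculation (step S104, "no" branch).
        return {"op": op, "params": None}
    records = history.get(op, [])
    if not records:
        # No history recorded for this specific operation:
        # fall back to the predetermined default parameters.
        return {"op": op, "params": defaults[op]}
    # Average the historical start and end positions (step S105).
    n = len(records)
    start = tuple(sum(r[0][i] for r in records) / n for i in (0, 1))
    end = tuple(sum(r[1][i] for r in records) / n for i in (0, 1))
    return {"op": op, "params": (start, end)}
```

A non-specific instruction such as a click is passed through untouched, while a swipe instruction picks up either averaged historical coordinates or the defaults.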
It should be noted that the APP voice control method of the present invention has wide applicability and is particularly suitable for scenes in which, while using a housekeeping APP, manual operation is inconvenient and housekeeping staff complete operations on the APP by voice instead. The specific flow of the method is described below with reference to concrete application scenarios.
First, in step S101, a user operation database is established for recording historical operation data generated while the user operates the APP by finger.
Specifically, historical user operation data related to each home-service APP within a specific time period is obtained, and the user operation database is built from it.
In this example, the specific time period is, for example, seven, ten, or fourteen days, or one, two, three, or six months, and so forth.
For example, for a user who needs or provides maternity matron (yuesao) services, the corresponding specific time period is one month or two months. As another example, for a user who needs a caregiver or provides caregiver services, the corresponding specific time period is six months or one year, etc. The period is not limited to these values; in other examples it is adapted according to the service duration, service type, and so on of the housekeeping business.
It should be noted that, in the present invention, the user refers to various users using various home services APPs, and specifically includes home service staff, users needing home services, third party service staff, and the like.
Further, the manual operation includes clicking, double clicking, sliding, inputting, long pressing, and the like.
Still further, the historical operation data includes historical click data, historical double click data, historical slide data, historical input data, and the like associated with the above-described manual operation.
Thus, by recording each user's historical operation data, the user's behavior can be analysed and used to simulate manual operation of the APP.
It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.
Next, in step S102, when the user opens the client APP, the APP automatically starts the voice input function.
Specifically, for example, when a housekeeping staff member providing maternity matron services opens the client APP, the APP automatically starts the voice input function.
Furthermore, each household service product APP is provided with a voice inlet, and when the fact that the user opens the client APP is detected, the voice inlet is automatically opened to receive voice input of the user.
Preferably, each home service product APP is further configured with a monitoring component and the like for cooperating with the execution of the auxiliary mode.
Furthermore, a voice recognition device corresponding to the voice inlet is also provided for recognizing the voice input of the user.
It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.
Next, in step S103, a voice input of the user is received through the APP, and the input voice is recognized to be converted into a current operation instruction.
Specifically, when the service end of the APP receives the voice input of the user, the voice input is automatically recognized.
Further, recognizing the voice input includes identifying behavior information and target information from the voice input, the behavior information including click behavior, double-click behavior, slide behavior, input behavior, and long-press behavior, and the target information including the button, option, or input box corresponding to the operation behavior.
Further, the identified behavior information and the target information are converted into the current operation instruction.
Preferably, automation software (for example, AutoJS) together with the Android accessibility service simulates manual operation in response to each user's operation instruction, automatically controlling the APP or automatically executing each function within it. This makes the APP more intelligent and automated, broadens the population of users it can serve, simplifies operation, and optimizes the APP voice control method.
It should be noted that, in this example, the voice input function is implemented in cooperation with the Android accessibility service, but this is not limiting: in other examples other automation assistance software may be used.
For example, consider a housekeeping scenario in which user 1 (a maternity matron) is taking care of an infant. While holding or assisting the infant, user 1 cannot use her hands for other operations, yet needs to record the infant's body temperature or other indicators. In this case, after user 1 opens the APP and the voice input function starts automatically, she may say "click the login button"; the APP's server side performs text recognition on this voice input, obtaining the behavior information "click" and the target information the "login" button, converts them into the current operation instruction, and controls the APP to automatically click the "login" button.
For another example, user 1 says "slide left"; the APP's server side performs text recognition on this voice input, obtaining the behavior information "slide" and the target information "left", converts them into the current operation instruction, and controls the APP to automatically perform a leftward slide.
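The recognition step in these examples can be sketched as a simple keyword parser. The patent does not disclose the recognizer's grammar, so the phrase list and function name below are assumptions; it only illustrates splitting recognized text into behavior information and target information.

```python
# Hypothetical sketch: split already-recognized text into behavior
# information and target information (step S103).
def parse_voice_text(text):
    # "double click" must be tested before "click" to avoid a prefix match.
    behaviors = ["double click", "long press", "click", "slide", "input"]
    text = text.strip().lower()
    for behavior in behaviors:
        if text.startswith(behavior):
            target = text[len(behavior):].strip()
            return {"behavior": behavior, "target": target or None}
    return None  # not recognized as an operation instruction
```

For instance, "click the login button" yields the behavior "click" with target "the login button", and "slide left" yields the behavior "slide" with target "left", matching the two examples above.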
It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.
Next, in step S104, it is determined whether the operation to be performed by the current operation instruction is a specific operation.
Specifically, the behavior information of the current operation instruction is compared with the specific operation, and whether the operation to be executed by the current operation instruction is the specific operation is judged.
Further, the specific operation includes a leftward slide, a rightward slide, an upward slide, and a downward slide.
For example, when the current operation instruction of user 1 is "click the login button", the behavior information is a click, and after comparison the operation is determined not to be a specific operation.
For another example, when the current operation instruction of user 1 is "slide left", the behavior information is a slide, and after comparison the operation is determined to be a specific operation.
It should be noted that the above description is only given by way of example, and the present invention is not limited thereto.
Next, in step S105, when it is determined that the operation to be performed is a specific operation, a parameter of a current operation instruction is calculated according to the historical operation data recorded in the user operation database, and a corresponding operation is automatically performed according to the current operation instruction and the parameter thereof.
Specifically, when the current operation instruction is to slide left, right, up or down, the user operation database is called.
Further, when the current operation instruction is a specific operation, the parameters of the current operation instruction include a sliding start position and a sliding end position.
Further, the calling time period is determined according to the housekeeping service time of the user, the type of the housekeeping service and the user information data.
Specifically, the user information data includes a user account, a mobile phone number, an identification number, and the like.
For example, for a user of the maternity matron service, the invocation time period is the last half month, one month, or two months counted back from the current time.
Fig. 2 is a flowchart showing another example of the APP voice control method of embodiment 1 of the present invention.
As shown in fig. 2, a step S201 of determining whether the user operation database records history operation data of the specific operation of the user is further included.
In step S201, it is determined whether the user operation database records historical operation data of a specific operation of a user to call up the relevant historical operation data to calculate relevant parameters for automatic execution.
On one hand, when the historical operation data of the specific operation of the user is judged to be recorded in the user operation database, the historical operation data in the calling time period is called according to the determined calling time period and the user information.
Further, an average value of the slide start position and the slide end position of the current operation instruction is calculated as a parameter of the current operation instruction based on a plurality of pieces of historical slide operation data.
For example, taking a cell phone screen size of 1080 × 1920 as an example, in the historical slide data of the user, the user's history includes a slide operation from a middle position (540, 780) to (100, 720), for example.
Preferably, a specific number n (for example, 20) of start point positions and a specific number n (for example, 20) of end point positions within the calling time period in the history are acquired, and the start point center position and the end point center position are calculated respectively. Therefore, the average value of the sliding starting point position and the sliding end point position of the current operation instruction is obtained by a method of calculating the center point coordinates of a plurality of points.
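The centre-of-points calculation above can be sketched as follows, assuming (as an illustrative representation, not the patent's data format) that each swipe record is stored as a ((start_x, start_y), (end_x, end_y)) pair.

```python
# Sketch of the centre-of-points method: average the most recent n
# historical start points and end points of a specific operation.
def average_swipe(records, n=20):
    recent = records[-n:]          # the n most recent records in the period
    k = len(recent)
    start = (sum(r[0][0] for r in recent) / k,
             sum(r[0][1] for r in recent) / k)
    end = (sum(r[1][0] for r in recent) / k,
           sum(r[1][1] for r in recent) / k)
    return start, end
```

With the example above, a history containing swipes from (540, 780) to (100, 720) and from (500, 760) to (120, 700) averages to a swipe from (520, 770) to (110, 710).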
Fig. 3 is a flowchart showing still another example of the APP voice control method of embodiment 1 of the present invention.
As shown in fig. 3, a step S301 of establishing a machine learning model is further included.
In step S301, a machine learning model is built for the parameters of the current operating instruction.
Specifically, a machine learning model is trained by using the historical operation data, and parameters of the current operation instruction are calculated by using the trained machine learning model, wherein the historical operation data further comprises operation environment data.
Further, the training data set used to train the machine learning model includes historical slide operation data, historical slide times, operating environment parameters, housekeeping service times and types, user information data, and so forth.
Preferably, the operation environment data includes at least one of time, geographical location, and front-back operation instruction of the operation.
Further, the machine learning model calculates parameters of the current operating instruction according to the current operating environment data.
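One way the model-based parameter calculation could work is a nearest-neighbour lookup over historical records, choosing the swipe parameters whose recorded operating environment best matches the current one. The sketch below is a stand-in for a trained model; the feature encoding (e.g. hour of day, location id) and distance function are illustrative assumptions, since the patent does not fix a particular model:

```python
def predict_parameters(history, current_env):
    """Return the swipe parameters of the historical record whose
    operating-environment features are closest to the current ones.

    `history` is a list of (env, params) pairs, where `env` is a numeric
    feature vector and `params` is the recorded (start, end) swipe pair.
    """
    def distance(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best_env, best_params = min(history, key=lambda rec: distance(rec[0], current_env))
    return best_params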
On the other hand, when the user operation database does not record historical operation data of the specific operation, a predetermined default parameter is used as the parameter of the current operation instruction, and the corresponding operation is automatically executed according to the current operation instruction and its parameter.
Specifically, for example, when the operation to be performed is a leftward or rightward slide, the default parameter is the screen width or half the screen width. Similarly, when the operation to be performed is an upward or downward slide, the default parameter is the screen height or half the screen height.
Further, when the operation to be executed by the current operation instruction is judged not to be the specific operation, the operation is directly and automatically executed.
For example, if the current operation instruction of user 1 is "click the login button", the center coordinate of the "login button" is acquired (for example, (100, 1000)) and the click operation is performed directly on that point.
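The direct-execution path can be sketched as resolving the parsed instruction to the target element's center and tapping there. In a real Android implementation the element bounds would come from the accessibility node tree and the tap would be dispatched as a gesture; the `ui_tree` dictionary here is an illustrative stand-in:

```python
def execute_instruction(instruction, ui_tree):
    """Resolve a parsed (action, target) instruction to tap coordinates.

    `ui_tree` maps element names to (left, top, right, bottom) screen
    bounds. Returns the action and the element's center point, which a
    real implementation would pass to the gesture dispatcher.
    """
    action, target = instruction
    left, top, right, bottom = ui_tree[target]
    center = ((left + right) // 2, (top + bottom) // 2)
    return action, center
```

For the patent's example, a "login button" with bounds centered on (100, 1000) yields a click at exactly that point.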
Thus, by simulating manual operation and calculating the parameters required for automatic execution, automatic control of the APP and automatic execution of each function within the APP are realized. The APP thereby becomes more intelligent and automated, its suitability for a wider range of users is improved, operation is simplified, and the APP voice control method is further optimized.
It should be noted that the above procedures are only used to illustrate the present invention, and the order and number of the steps are not particularly limited. In other examples, a step of the method may be split into two (for example, step S105 split into S105 and S401, see fig. 4) or three steps, or several steps may be combined into one, adjusted according to the actual example. The foregoing is illustrative only and is not to be construed as limiting the invention.
Compared with the prior art, by simulating manual operation and calculating the parameters required for automatic execution, the method realizes automatic control of the APP and automatic execution of each function within it. The APP thereby becomes more intelligent and automated, its suitability for a wider range of users is improved, operation is simplified, the complexity of voice control is reduced, and the APP voice control method is optimized.
Example 2
Embodiments of systems of the present invention are described below, which may be used to perform method embodiments of the present invention. Details described in the system embodiments of the invention should be considered supplementary to the above-described method embodiments; reference is made to the above-described method embodiments for details not disclosed in the system embodiments of the invention.
Referring to fig. 5, 6 and 7, an APP voice control system 500 of embodiment 2 of the present invention will be explained.
According to a second aspect of the present invention, there is further provided an APP voice control system 500 for completing operations through voice control of a client APP. The APP voice control system 500 includes: an establishing module 501, configured to establish a user operation database for recording historical operation data generated while the user operates the APP by finger; a function starting module 502, configured to automatically start the voice input function of the APP when the user opens the client APP; a recognition conversion module 503, configured to receive the user's voice input through the APP and recognize the input voice to convert it into a current operation instruction; a judging module 504, configured to judge whether the operation to be executed by the current operation instruction is a specific operation; and an automatic execution module 505, configured to, when the operation to be executed is judged to be a specific operation, calculate the parameter of the current operation instruction according to the historical operation data recorded in the user operation database and automatically execute the corresponding operation according to the current operation instruction and its parameter.
Preferably, the operation includes clicking, double clicking, sliding, inputting and long pressing, and the specific operation includes sliding left, sliding right, sliding up and sliding down.
Preferably, the historical operation data comprises historical slide operation data; when the current operation instruction is sliding leftwards, sliding rightwards, sliding upwards or sliding downwards, the parameters of the current operation instruction comprise a sliding starting point position and a sliding end point position.
As shown in fig. 6, the system further includes a calculating module 601, configured to calculate, when the current operation instruction is sliding left, sliding right, sliding up, or sliding down, the average values of the sliding start position and the sliding end position of the current operation instruction from a plurality of pieces of historical sliding operation data, as the parameters of the current operation instruction.
Preferably, when the user operation database does not record historical operation data of a specific operation, a predetermined default parameter is used as a parameter of a current operation instruction, and a corresponding operation is automatically executed according to the current operation instruction and the parameter thereof.
Specifically, when the operation to be performed is a leftward or rightward slide, the default parameter is the screen width or half the screen width. When the operation to be performed is an upward or downward slide, the default parameter is the screen height or half the screen height.
As shown in fig. 7, the system further includes a model building module 701, where the model building module 701 is configured to train a machine learning model using the historical operation data, and calculate parameters of a current operation instruction using the trained machine learning model, where the historical operation data further includes operation environment data; the machine learning model also performs calculation according to the current operating environment data when calculating the parameters of the current operating instructions.
Preferably, the operation environment data includes at least one of time, geographical location, and front-to-back operation instruction of the operation.
Preferably, recognizing the voice input comprises: identifying behavior information and target information from the voice input, wherein the behavior information includes click behavior, double-click behavior, slide behavior, input behavior and long-press behavior, and the target information includes the buttons, options and input boxes corresponding to the operation behavior.
Preferably, when it is judged that the operation to be executed by the current operation instruction is not the specific operation, the operation is directly and automatically executed.
Preferably, the voice input function is implemented in cooperation with the Android accessibility (barrier-free) service.
In embodiment 2, the same portions as those in embodiment 1 are not described.
Compared with the prior art, by simulating manual operation and calculating the parameters required for automatic execution, the system realizes automatic control of the APP and automatic execution of each function within it. The APP thereby becomes more intelligent and automated, its suitability for a wider range of users is improved, operation is simplified, the complexity of voice control is reduced, and the APP voice control method is optimized.
Example 3
The following describes an embodiment of the computer apparatus of the present invention, which may be considered as a concrete physical implementation of the above-described embodiments of the method and system of the present invention. Details described in relation to the computer device embodiment of the present invention should be considered supplementary to the method or system embodiment described above; for details not disclosed in the computer device embodiments of the invention, reference may be made to the above-described method or system embodiments.
Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention. The computer device includes a processor and a memory storing a computer-executable program; when the program is executed by the processor, the processor performs the method of fig. 1.
As shown in fig. 7, the computer device is in the form of a general purpose computing device. The processor can be one or more and can work together. The invention also does not exclude that distributed processing is performed, i.e. the processors may be distributed over different physical devices. The computer device of the present invention is not limited to a single entity, and may be a sum of a plurality of entity devices.
The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable a computer device to perform the method of the invention, or at least some of the steps of the method.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may also be non-volatile memory, such as read-only memory (ROM).
Optionally, in this embodiment, the computer device further includes an I/O interface used for data exchange between the computer device and external devices. The I/O interface may be attached to a local bus representing one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and an accelerated graphics port, using any of a variety of bus architectures.
It should be understood that the computer device shown in fig. 7 is only one example of the present invention, and elements or components not shown in the above examples may also be included in the computer device of the present invention. For example, some computer devices also include display units such as display screens, and some computer devices also include human-computer interaction elements such as buttons, keyboards, and the like. The computer device can be considered to be covered by the present invention as long as the computer device can execute the computer readable program in the memory to implement the method of the present invention or at least part of the steps of the method.
FIG. 8 is a schematic diagram of a computer program product of an embodiment of the invention. As shown in fig. 8, the computer program product has stored therein a computer executable program, which when executed, implements the above-described method of the present invention. The computer program product may comprise a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. The computer program product may be transmitted, propagated, or transported by a computer to be used by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer program product may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or to an external computing device (e.g., through the internet using an internet service provider).
From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention, and electronic processing units, servers, clients, mobile phones, control units, processors, etc. included in the system. The invention may also be implemented by computer software for performing the method of the invention, e.g. control software executed by a microprocessor, an electronic control unit, a client, a server, etc. It should be noted that the computer software for executing the method of the present invention is not limited to be executed by one or a specific hardware entity, and can also be realized in a distributed manner by non-specific hardware. For computer software, the software product may be stored in a computer readable storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or may be distributed over a network, as long as it enables the computer device to perform the method according to the present invention.
While the foregoing detailed description has set out the objects, aspects and advantages of the present invention, it should be appreciated that the invention is not inherently tied to any particular computer, virtual machine or computer apparatus, as various general-purpose devices may implement it. The invention is not to be limited to the specific embodiments described; all changes and equivalents that come within its spirit and scope are intended to be covered.

Claims (24)

1. An APP voice control method for controlling a client APP to complete an operation through voice, the method comprising:
establishing a user operation database, wherein the user operation database is used for recording historical operation data of a user during using the APP through finger operation;
when a user opens an APP at a client, the APP automatically starts a voice input function;
receiving voice input of a user through an APP, and identifying the input voice to convert the input voice into a current operation instruction;
judging whether the operation to be executed by the current operation instruction is a specific operation or not;
and when the operation to be executed is judged to be a specific operation, calculating the parameters of the current operation instruction according to the historical operation data recorded by the user operation database, and automatically executing the corresponding operation according to the current operation instruction and the parameters thereof.
2. The APP voice control method of claim 1, wherein the operations comprise clicking, double clicking, sliding, inputting and long pressing, and the specific operation comprises sliding left, sliding right, sliding up and sliding down.
3. The APP speech control method of claim 2,
the historical operation data comprises historical sliding operation data;
when the current operation instruction is sliding leftwards, sliding rightwards, sliding upwards or sliding downwards, the parameters of the current operation instruction comprise a sliding starting point position and a sliding end point position.
4. The APP voice control method of claim 3, wherein when the current operation instruction is a leftward swipe, a rightward swipe, an upward swipe or a downward swipe, an average of the swipe start position and the swipe end position of the current operation instruction is calculated from a plurality of historical swipe operation data as a parameter of the current operation instruction.
5. The APP speech control method of claim 1,
and when the user operation database does not record historical operation data of specific operation, taking a preset default parameter as a parameter of the current operation instruction, and automatically executing corresponding operation according to the current operation instruction and the parameter thereof.
6. The APP speech control method of claim 5, wherein:
when the operation to be performed is a leftward swipe or a rightward swipe, the default parameter is a screen width;
when the operation to be performed is a swipe up or a swipe down, the default parameter is a screen height.
7. The APP voice control method of claim 1, wherein calculating parameters of a current operation instruction from historical operation data recorded by the user operation database comprises:
training a machine learning model using the historical operating data, calculating parameters of a current operating instruction using the trained machine learning model, wherein,
the historical operating data further comprises operating environment data;
the machine learning model also performs calculation according to the current operating environment data when calculating the parameters of the current operating instructions.
8. The APP voice control method of claim 7, wherein the operating environment data comprises at least one of time of operation, geographic location, pre-post operation instructions.
9. The APP speech control method of claim 1, wherein the recognizing the speech input comprises:
behavior information and target information are identified from the voice input, the behavior information includes click behavior, double-click behavior, slide behavior, input behavior and long-press behavior, and the target information includes buttons, options and input boxes corresponding to the operation behavior.
10. The APP voice control method of claim 1, further comprising:
and when the operation to be executed by the current operation instruction is judged not to be the specific operation, directly and automatically executing the operation.
11. The APP voice control method of claim 1, further comprising:
the voice input function is realized by matching with the Android barrier-free auxiliary function.
12. An APP voice control system for completing operations by voice control of a client APP, the system comprising:
the system comprises an establishing module, a judging module and a processing module, wherein the establishing module is used for establishing a user operation database, and the user operation database is used for recording historical operation data of a user during the period of using the APP through finger operation;
the function starting module is used for automatically starting the voice input function by the APP when the user opens the client APP;
the recognition conversion module receives the voice input of a user through the APP and recognizes the input voice to convert the input voice into a current operation instruction;
the judging module is used for judging whether the operation to be executed by the current operation instruction is a specific operation or not;
and the automatic execution module is used for calculating the parameter of the current operation instruction according to the historical operation data recorded by the user operation database when the operation to be executed is judged to be the specific operation, and automatically executing the corresponding operation according to the current operation instruction and the parameter thereof.
13. The APP voice control system of claim 12, where the operations comprise clicking, double clicking, sliding, entering, and long pressing, and the particular operation comprises sliding left, sliding right, sliding up, and sliding down.
14. The APP speech control system of claim 13,
the historical operation data comprises historical sliding operation data;
when the current operation instruction is sliding leftwards, sliding rightwards, sliding upwards or sliding downwards, the parameters of the current operation instruction comprise a sliding starting point position and a sliding end point position.
15. The APP voice control system of claim 14, further comprising a calculation module configured to calculate an average value of the sliding start position and the sliding end position of a current operation instruction according to a plurality of historical sliding operation data as a parameter of the current operation instruction when the current operation instruction is sliding left, sliding right, sliding up, or sliding down.
16. The APP speech control system of claim 12,
and when the user operation database does not record historical operation data of specific operation, taking a preset default parameter as a parameter of the current operation instruction, and automatically executing corresponding operation according to the current operation instruction and the parameter thereof.
17. The APP voice control system of claim 16, wherein:
when the operation to be performed is a leftward swipe or a rightward swipe, the default parameter is a screen width;
when the operation to be performed is a swipe up or a swipe down, the default parameter is a screen height.
18. The APP voice control system of claim 12, further comprising a model building module to train a machine learning model using the historical operating data, calculate parameters of a current operating instruction using the trained machine learning model, wherein,
the historical operating data further comprises operating environment data;
the machine learning model also performs calculation according to the current operating environment data when calculating the parameters of the current operating instructions.
19. The APP voice control system of claim 18, where the operating environment data includes at least one of time of operation, geographic location, pre-post operation instructions.
20. The APP speech control system of claim 12, wherein the recognizing the speech input comprises:
behavior information and target information are identified from the voice input, the behavior information includes click behavior, double-click behavior, slide behavior, input behavior and long-press behavior, and the target information includes buttons, options and input boxes corresponding to the operation behavior.
21. The APP voice control system of claim 12, further comprising:
and when the operation to be executed by the current operation instruction is judged not to be the specific operation, directly and automatically executing the operation.
22. The APP voice control system of claim 12, wherein the voice input function is implemented in cooperation with the Android accessibility (barrier-free) service.
23. A computer device comprising a processor and a memory, the memory for storing a computer executable program, characterized in that:
when the computer program is executed by the processor, the processor performs the APP speech control method of any of claims 1-11.
24. A computer program product storing a computer executable program, wherein the computer executable program, when executed, implements the APP voice control method of any one of claims 1-11.
CN202110426130.1A 2021-04-20 2021-04-20 APP voice control method and system and computer equipment Pending CN113053384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110426130.1A CN113053384A (en) 2021-04-20 2021-04-20 APP voice control method and system and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110426130.1A CN113053384A (en) 2021-04-20 2021-04-20 APP voice control method and system and computer equipment

Publications (1)

Publication Number Publication Date
CN113053384A true CN113053384A (en) 2021-06-29

Family

ID=76520715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110426130.1A Pending CN113053384A (en) 2021-04-20 2021-04-20 APP voice control method and system and computer equipment

Country Status (1)

Country Link
CN (1) CN113053384A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1619662A1 (en) * 2004-07-22 2006-01-25 Alcatel Speech recognition system
US20070123223A1 (en) * 2005-11-29 2007-05-31 Gary Letourneau Enhanced analogue of interactive voice response structures and functions for mobile phones and similar handheld communications devices
CN105094807A (en) * 2015-06-25 2015-11-25 三星电子(中国)研发中心 Method and device for implementing voice control
CN105788597A (en) * 2016-05-12 2016-07-20 深圳市联谛信息无障碍有限责任公司 Voice recognition-based screen reading application instruction input method and device
CN107885481A (en) * 2017-10-26 2018-04-06 中国地质大学(武汉) The page sound control method and voice browser of a kind of browser of mobile terminal
CN109257503A (en) * 2018-10-24 2019-01-22 珠海格力电器股份有限公司 A kind of method, apparatus and terminal device of voice control application program
CN109584870A (en) * 2018-12-04 2019-04-05 安徽精英智能科技有限公司 A kind of intelligent sound interactive service method and system
CN109785829A (en) * 2017-11-15 2019-05-21 百度在线网络技术(北京)有限公司 A kind of customer service householder method and system based on voice control
US20190311713A1 (en) * 2018-04-05 2019-10-10 GM Global Technology Operations LLC System and method to fulfill a speech request
CN112053678A (en) * 2019-06-06 2020-12-08 北京快松果科技有限公司 Lock opening and closing method and system based on voice recognition, lock opening and closing body and shared vehicle


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination