CN111506245A - Terminal control method and device - Google Patents

Terminal control method and device

Info

Publication number
CN111506245A
CN111506245A
Authority
CN
China
Prior art keywords
target
control
user
information
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010345593.0A
Other languages
Chinese (zh)
Inventor
方彦彬
王凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202010345593.0A
Publication of CN111506245A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04883Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/451Execution arrangements for user interfaces
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/22Interactive procedures; Man-machine interfaces
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure provides a terminal control method and device. The method includes: identifying control objects contained in a terminal display interface by calling an interface monitoring service provided by the terminal operating system; determining a target object from the control objects according to received user input information; and triggering execution of a target operation for the target object. The disclosed scheme enables accurate identification of the control objects in the current display interface and facilitates corresponding processing of the target object according to the user input information.

Description

Terminal control method and device
Technical Field
The present disclosure relates to the field of terminal control technologies, and in particular, to a terminal control method and apparatus.
Background
When using terminal devices such as smart phones, tablet computers, and wearable devices, a user generally interacts with the device by manipulating control objects displayed in the terminal display interface. The control objects displayed in the terminal display interface can generally be divided into application objects belonging to an application program and system objects belonging to the terminal operating system. To accurately control the application objects and system objects, the control objects displayed in the terminal display interface must first be accurately identified.
Disclosure of Invention
In view of this, the present disclosure provides a terminal control method and device, which can accurately identify a control object in a current display interface, and facilitate corresponding processing on a target object according to user input information.
According to a first aspect of the embodiments of the present disclosure, a terminal control method is provided, where the method includes:
identifying a control object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system;
determining a target object from the control object according to the received user input information;
and triggering execution of a target operation for the target object.
Optionally, when the terminal operating system is an Android operating system, the interface monitoring service includes an AccessibilityService service.
Optionally, the user input information includes a user voice; the determining a target object from the control object according to the received user input information includes:
processing the user voice through an artificial intelligence voice service so as to identify object information indicated by the user voice;
and determining a target object from the control object according to the object information.
Optionally, when the target operation involves a touch action on the target object, the triggering execution of the target operation for the target object includes:
executing a simulated touch action on the target object to complete the target operation.
Optionally, the determining a target object from the control object according to the object information includes:
acquiring corresponding display information of a control object in the terminal display interface;
and determining any one of the control objects as the target object when the object information matches the display information of that control object.
Optionally, the obtaining of the display information of the control object in the terminal display interface includes:
if the control object is a text object, acquiring display information corresponding to the text object by calling an application program interface of the text object;
and if the control object is a graphic object, capturing a screenshot of the area corresponding to the graphic object, and determining the display information corresponding to the graphic object based on the captured image.
Optionally, the method further includes: displaying object identifications corresponding to the control objects in the terminal display interface;
the determining a target object from the control object according to the object information includes:
and determining the control object matched with the object identification as the target object.
Optionally, the user input information includes a user operation gesture, and the determining a target object from the control object according to the received user input information includes:
determining position information corresponding to the user operation gesture;
and determining a control object matched with the position information in the control objects as the target object.
Optionally, the target operation is a predefined default operation; alternatively,
the method further comprises the following steps: determining the target operation for the target object according to the user input information.
Optionally, the triggering execution of the target operation for the target object includes:
in a case that the target operation belongs to a predefined system-level operation of the terminal operating system, executing the target operation for the target object by calling a preset function of the terminal operating system or invoking a preset application program installed on the terminal device;
and in a case that the target operation belongs to a predefined application-level operation of the application program to which the target object belongs, sending a target operation instruction for the target object to the application program, so that the application program executes the target operation according to the target operation instruction.
According to a second aspect of the embodiments of the present disclosure, there is provided a terminal manipulation apparatus, the apparatus including:
the object identification unit is used for identifying an operation object contained in a terminal display interface by calling interface monitoring service provided by a terminal operating system;
the object determining unit is used for determining a target object from the control object according to the received user input information;
and the operation execution unit is used for triggering and executing the target operation aiming at the target object.
Optionally, when the terminal operating system is an Android operating system, the interface monitoring service includes an AccessibilityService service.
Optionally, the user input information includes a user voice; the object determination unit includes:
the information identification subunit is used for processing the user voice through an artificial intelligence voice service so as to identify the object information indicated by the user voice;
and the object determining subunit is used for determining a target object from the control object according to the object information.
Optionally, when the target operation involves a touch action for the target object, the operation execution unit includes:
and the touch simulation subunit is used for executing a simulated touch action aiming at the target object so as to complete the target operation.
Optionally, the object determining subunit includes:
the information acquisition module is used for acquiring corresponding display information of the control object in the terminal display interface;
and the object determining module is used for determining any control object as the target object under the condition that the object information is matched with the display information of any control object.
Optionally, the information obtaining module includes:
the interface calling submodule is used for obtaining display information corresponding to the text object by calling an application program interface of the text object when the control object is the text object;
and the region screen capturing sub-module is used for capturing a corresponding region of the graphic object when the control object is the graphic object, and determining display information corresponding to the graphic object based on an image obtained by screen capturing.
Optionally, the method further includes: the identification display unit is used for displaying object identifications corresponding to the control objects in the terminal display interface;
the object determination subunit further includes:
and the identification determining module is used for determining the control object matched with the object identification as the target object.
Optionally, the user input information includes a user operation gesture, and the object determination unit further includes:
the information determining subunit is used for determining the position information corresponding to the user operation gesture;
and the position determining subunit is used for determining a control object matched with the position information in the control objects as the target object.
Optionally, the target operation is a predefined default operation; alternatively,
the device further comprises: an operation determination unit configured to determine the target operation for the target object according to the user input information.
Optionally, the operation execution unit further includes:
the calling and calling subunit is used for calling a preset function of the terminal operating system or calling a preset application program installed on the terminal equipment to execute the target operation aiming at the target object under the condition that the target operation belongs to predefined system-level operation in the terminal operating system;
and the instruction sending subunit is used for sending a target operation instruction for the target object to the application program if the target operation belongs to a predefined application-level operation of the application program to which the target object belongs, so that the application program executes the target operation according to the target operation instruction.
According to a third aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any one of the above first aspects.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a terminal control device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the executable instructions to implement the steps of the method of any of the first aspects above.
The technical scheme provided by the embodiments of the present disclosure can have the following beneficial effects: the control objects contained in the terminal display interface are identified by calling the interface monitoring service provided by the terminal operating system, so that the control objects in the current display interface are accurately identified, which facilitates corresponding processing of the target object among the control objects after the user input information is received.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating a terminal manipulation method according to an exemplary embodiment of the present disclosure;
fig. 2 is a flow chart illustrating another terminal manipulation method according to an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a terminal display interface shown in accordance with an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram of another terminal display interface shown in accordance with an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating a terminal display interface corresponding to application level operations according to an example embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a terminal display interface corresponding to system level operation shown in the present disclosure in accordance with an exemplary embodiment;
FIG. 7 is a schematic diagram of another terminal display interface corresponding to system level operation shown in the present disclosure in accordance with an exemplary embodiment;
fig. 8 is a flowchart illustrating yet another terminal manipulation method according to an exemplary embodiment of the present disclosure;
FIG. 9 is a schematic diagram illustrating a terminal display interface corresponding to a user-operated gesture according to an exemplary embodiment of the present disclosure;
FIG. 10 is a schematic view of another terminal display interface corresponding to a user-operated gesture shown in the present disclosure according to an exemplary embodiment;
Figs. 11-19 are block diagrams of terminal manipulation apparatuses according to exemplary embodiments of the present disclosure;
fig. 20 is a block diagram illustrating an apparatus for terminal manipulation according to an exemplary embodiment of the present disclosure;
fig. 21 is a block diagram illustrating another apparatus for terminal manipulation according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon" or "when" or "in response to determining", depending on the context.
Fig. 1 is a flowchart illustrating a terminal manipulation method according to an exemplary embodiment of the present disclosure. As shown in fig. 1, the method may include the steps of:
and 102, identifying an operation object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system.
It should be noted that the terminal device referred to in the present disclosure may be a mobile phone, a tablet computer, a digital broadcast terminal, a messaging device, a game console, a medical device, a fitness device, a personal digital assistant, or a wearable device such as a smart watch, smart glasses, a smart bracelet, or smart running shoes. Accordingly, the terminal operating system in the terminal device may take various specific forms, such as an Android operating system, an iOS operating system, a Windows operating system, a Linux operating system, a Unix operating system, or a DOS operating system. The present disclosure does not limit the specific form of the terminal device or its terminal operating system, as long as the terminal operating system provides an interface monitoring service for monitoring the terminal display interface.
In an embodiment, when the terminal operating system is an android operating system, the interface monitoring service provided by the terminal operating system may be an accessibility service.
Generally, no matter what interface is currently displayed, the display objects in the current terminal display interface can be identified by calling the interface monitoring service. The recognized objects may be classified into control objects and non-control objects according to whether they are manipulable. A control object is a triggerable object in the terminal display interface, such as a triggerable control, selectable or editable text, a link that can be followed, or a picture that can be viewed in an enlarged mode, and the user can manipulate it in the current terminal display interface. A non-control object is a non-triggerable object in the terminal display interface, such as a built-in picture in the interface or a boundary background between adjacent controls, and cannot be manipulated by the user in the current terminal display interface.
It can be understood that the state of the same object may differ between different terminal display interfaces: for example, in instant messaging software, the chat background picture in the chat interface cannot be triggered by the user, so in that interface the background picture is a non-control object; in the chat background setting interface, the user can choose to set or replace the background picture, so there the background picture is a control object. In fact, for any terminal display interface, the disclosed scheme only concerns the control objects therein, and the disclosure does not limit the type (control, picture, text, etc.), size, or number of the control objects in any terminal display interface.
Step 104: determining a target object from the control objects according to the received user input information.
After user input information corresponding to a control behavior of the user is received, a target object is determined from the identified control objects according to that information.
In an embodiment, the user input information includes a user voice received through an artificial intelligence voice service. The artificial intelligence voice service may be, for example, Xiao AI, the Lingxi voice assistant, Siri (Speech Interpretation and Recognition Interface), Cortana, or Google Assistant.
The user voice uttered by the user is received and processed through the artificial intelligence voice service, and the target object corresponding to the user voice is determined among the control objects according to the identified object information indicated by the user voice, so that the control objects in the terminal display interface can be manipulated by voice. This changes the conventional touch operation mode and allows the user to control the control objects in the terminal display interface through voice, which frees the user's hands and effectively improves user experience and control efficiency.
In an embodiment, the target operation may relate to a touch action and/or a target function for the target object. As an exemplary embodiment, when the target operation involves a touch action on the target object, triggering execution of the target operation for the target object may be: executing a simulated touch action on the target object to complete the target operation. The simulated operation of the target object can be realized by determining the touch action related to the target operation corresponding to the user voice and then executing that simulated touch action. In this case, the user can intuitively utter an operation voice describing the touch action on the target object, which better matches the user's daily usage habits and reduces the user's learning cost.
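As a concrete illustration only (not the disclosure's own implementation), on Android such a simulated touch could be issued either through the node-level click action or, on API 24 and above, by dispatching a tap gesture from the accessibility service. The sketch below makes these assumptions, and the helper name is illustrative.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.accessibilityservice.GestureDescription
import android.graphics.Path
import android.graphics.Rect
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: simulate a tap on the target object. A long press could be simulated the same
// way with ACTION_LONG_CLICK or a longer stroke duration (e.g. 600 ms).
fun simulateTap(service: AccessibilityService, target: AccessibilityNodeInfo) {
    // Prefer the node-level action when the node itself is clickable.
    if (target.performAction(AccessibilityNodeInfo.ACTION_CLICK)) return

    // Otherwise dispatch a raw tap gesture at the centre of the node's screen bounds (API 24+).
    val bounds = Rect()
    target.getBoundsInScreen(bounds)
    val path = Path().apply { moveTo(bounds.exactCenterX(), bounds.exactCenterY()) }
    val gesture = GestureDescription.Builder()
        .addStroke(GestureDescription.StrokeDescription(path, 0L, 50L))
        .build()
    service.dispatchGesture(gesture, null, null)
}
```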
As another exemplary embodiment, when the target operation relates to a target function for the target object, the target function of the target object is triggered directly, thereby completing the target operation. In this case, the user can utter a user voice containing the target function to be achieved for the target object, so that the target function can still be achieved even when the user does not know which touch action it corresponds to, which to some extent enriches the application scenarios of the disclosed scheme.
In an embodiment, the object information may include complete display information or partial display information of the object. At this time, corresponding display information of the control object in the terminal display interface can be acquired, and then any control object is determined as the target object under the condition that the object information is matched with the display information of the any control object. In this embodiment, any of the above-mentioned control objects may be a text object or a graphic object, and the different types of control objects have different corresponding display information acquisition manners: if any one of the control objects is a text object, display information corresponding to the text object can be acquired by calling an Application Programming Interface (API) of the text object; if any of the above-mentioned control objects is a graphic object, the corresponding region of the graphic object may be captured, and the display information corresponding to the graphic object may be determined based on the image obtained by capturing the screen.
In this way, the target object is determined by extracting the object information of the target object contained in the user input voice and comparing it with the display information of each currently displayed control object. The user input voice therefore only needs to contain all or part of the information currently displayed for the target object, and no special processing needs to be performed on all the objects currently displayed in the terminal display interface, which keeps the determination of the target object simple and improves user experience.
In an embodiment, after the manipulation objects included in the terminal display interface are recognized, object identifiers corresponding to the respective manipulation objects may be shown in the terminal display interface. Therefore, a subsequent user can send out user voice according to the object identifier corresponding to the target object, that is, the object information corresponding to the user voice can include the object identifier, and at this time, the control object matched with the object identifier included in the object information can be determined as the target object. The object identification corresponding to each control object is displayed in the terminal display interface, so that any control object in the terminal display interface has the unique object identification, and the user voice sent by the user can uniquely correspond to a certain control object in the current terminal display interface, thereby effectively avoiding misjudgment possibly occurring under the condition that the terminal display interface comprises a plurality of operation objects with the same display information, and ensuring the accuracy of determining the target object.
In an embodiment, a user may implement control on a control object through touch input, at this time, the user input information is a user operation gesture, and correspondingly, position information corresponding to the user operation gesture may be determined, and then a control object matched with the position information in the control object is determined to serve as a target object. The position information may be position coordinates and/or a touch track corresponding to a user operation gesture. In addition, the target operation corresponding to the user operation gesture can be comprehensively judged according to the position information and the time information corresponding to the user operation gesture. After the user operation gesture made by the user is received, the target object corresponding to the user operation gesture is determined by determining the position information corresponding to the user operation gesture, and the accuracy of determining the target object is guaranteed.
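By way of illustration only, matching a touch position against the recognized control objects can be a simple bounds test; the sketch below assumes the list of control objects collected earlier, and the function name is made up for the example.

```kotlin
import android.graphics.Rect
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: return the control object whose on-screen bounds contain the position
// derived from the user operation gesture, if any.
fun findControlAt(controls: List<AccessibilityNodeInfo>, x: Int, y: Int): AccessibilityNodeInfo? {
    val bounds = Rect()
    return controls.firstOrNull { node ->
        node.getBoundsInScreen(bounds)
        bounds.contains(x, y)
    }
}
```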
Step 106: triggering execution of the target operation for the target object.
In an embodiment, the target operation may be a predefined default operation. In this case, the preset target operation may be executed on the target object directly after the target object is determined, so that the target operation for the target object can be executed faster and the user operation can be responded to more quickly. Alternatively, the target operation for the target object can be determined according to the user input information. In this case, because the target operation is determined from the user input information corresponding to the user's control behavior, it better reflects the user's intention, which further improves user experience.
Whether the target operation for the target object is predefined or determined according to the user input information, it may be a system-level operation or an application-level operation. In an exemplary embodiment, in a case that the target operation belongs to a predefined system-level operation of the terminal operating system, a preset function of the terminal operating system is called or a preset application program installed on the terminal is invoked, so that the preset function or the preset application program performs the target operation on the target object. As another exemplary embodiment, in a case that the target operation belongs to a predefined application-level operation of the application program to which the target object belongs, a target operation instruction for the target object is sent to the application program, so that the application program executes the target operation according to the target operation instruction.
All target operations which can be performed by a user are divided into an application level operation and a system level operation, so that the user can perform the two types of operations aiming at any control object in a terminal display interface, the operation is not limited to predefined operations in an application program, the control modes of the control object are enriched, the control of any control object in various forms is facilitated, the user operation can be simplified to a certain extent, and the user experience is further improved.
According to the embodiments of the present disclosure, the control objects contained in the terminal display interface are identified by calling the interface monitoring service provided by the terminal operating system, so that the control objects in the current display interface are accurately identified, which facilitates corresponding processing of the target object among the control objects after the user input information is received.
For the convenience of understanding the technical solution of the present disclosure, the following description is made with reference to the embodiments of the user voice operation shown in fig. 2 to 7 and the user touch operation shown in fig. 8 to 10, respectively.
Fig. 2 is a flow chart illustrating another terminal manipulation method according to an exemplary embodiment of the present disclosure; the method is applied to the terminal equipment and can comprise the following steps:
step 202, calling an interface monitoring service provided by a terminal operating system, and identifying a control object in a terminal display interface.
As mentioned above, the terminal operating system of the terminal device may take many forms, such as an Android operating system, an iOS operating system, a Windows operating system, a Linux operating system, a Unix operating system, or a DOS operating system, and the interface monitoring service provided by the operating system differs accordingly. For example, when the terminal operating system is the Android operating system, the interface monitoring service may be an AccessibilityService.
For any terminal display interface, all the objects it displays can include control objects and non-control objects. A control object is a triggerable object in the terminal display interface, such as a triggerable control, selectable or editable text, a link that can be followed, or a picture that can be viewed in an enlarged mode, and the user can manipulate it in the current terminal display interface. A non-control object is a non-triggerable object in the terminal display interface, such as a built-in picture in the interface or a boundary background between adjacent controls, and cannot be manipulated by the user in the current terminal display interface. The present disclosure focuses only on the control objects, and does not limit the type (control, picture, text, etc.), size, or number of the control objects in any terminal display interface.
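For the Android case mentioned above, this identification step can be pictured as a traversal of the accessibility node tree. The following Kotlin sketch is only an illustration of the idea, not the implementation of the disclosure; it assumes an accessibility service declared in the manifest with the canRetrieveWindowContent capability, and the class and helper names are invented for the example.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.view.accessibility.AccessibilityEvent
import android.view.accessibility.AccessibilityNodeInfo

// Illustrative interface monitoring service: collects the control objects of the
// currently displayed interface whenever window content changes.
class InterfaceMonitorService : AccessibilityService() {

    override fun onAccessibilityEvent(event: AccessibilityEvent) {
        val root = rootInActiveWindow ?: return
        val controls = mutableListOf<AccessibilityNodeInfo>()
        collectControlObjects(root, controls)
        // `controls` now holds the manipulable objects of the current display interface.
    }

    // A node is treated as a control object if the user can act on it in some way;
    // everything else (backgrounds, dividers, decorative images) is a non-control object.
    private fun collectControlObjects(node: AccessibilityNodeInfo, out: MutableList<AccessibilityNodeInfo>) {
        if (node.isClickable || node.isLongClickable || node.isEditable || node.isScrollable) {
            out.add(node)
        }
        for (i in 0 until node.childCount) {
            node.getChild(i)?.let { collectControlObjects(it, out) }
        }
    }

    override fun onInterrupt() {}
}
```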
For convenience of description, a smart phone is taken as the terminal device in the following. Fig. 3 is a schematic diagram illustrating a terminal display interface according to an exemplary embodiment of the present disclosure. As shown in fig. 3, the text-type "bicycle riding" text, the control-type "travel" control, the picture-type travel promotion picture, and the like that can be manipulated by the user in the terminal display interface all belong to control objects, while the display area boundary lines, the display area dividing lines, and the like that cannot be manipulated by the user belong to non-control objects.
In an embodiment, after the control objects included in the terminal display interface are identified, the object identifiers corresponding to the control objects are shown in the terminal display interface, so that a user can send out a user voice for a certain control object in a targeted manner according to the object identifiers corresponding to the control objects.
FIG. 4 is a schematic diagram of another terminal display interface shown in accordance with an exemplary embodiment of the present disclosure. As shown in fig. 4, the object numbers corresponding to the control objects are respectively displayed at the corresponding positions of the control objects in the terminal display interface. It can be understood that the type of the identification is not limited to a numerical number in practical application, and the identification can be in other forms; moreover, corresponding object identifiers may be shown for each control object according to the display area, the object type, and the like, for example, the control objects in different display areas adopt object identifiers of different colors, the control objects in different types adopt object identifiers of different shapes, and the like.
Showing the object identifier corresponding to each control object in the terminal display interface gives every control object in the interface a unique object identifier, so that the user voice uttered by the user can uniquely correspond to a certain control object in the current terminal display interface. This effectively avoids the misjudgment that may occur when the terminal display interface contains multiple control objects with the same display information, and ensures the accuracy of determining the target object.
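One way such identifiers could be drawn, purely as a sketch and under the assumption that the code runs inside the accessibility service, is an accessibility overlay window per control object; the styling and function name below are illustrative.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.graphics.PixelFormat
import android.graphics.Rect
import android.view.Gravity
import android.view.WindowManager
import android.view.accessibility.AccessibilityNodeInfo
import android.widget.TextView

// Sketch: show a numbered badge near the top-left corner of each recognized control
// object, and return the mapping from identifier to node for later voice lookup.
fun showObjectIdentifiers(
    service: AccessibilityService,
    controls: List<AccessibilityNodeInfo>
): Map<Int, AccessibilityNodeInfo> {
    val windowManager = service.getSystemService(WindowManager::class.java)
    val idToNode = mutableMapOf<Int, AccessibilityNodeInfo>()
    controls.forEachIndexed { index, node ->
        val id = index + 1
        idToNode[id] = node
        val bounds = Rect()
        node.getBoundsInScreen(bounds)
        val badge = TextView(service).apply { text = id.toString() }
        val params = WindowManager.LayoutParams(
            WindowManager.LayoutParams.WRAP_CONTENT,
            WindowManager.LayoutParams.WRAP_CONTENT,
            WindowManager.LayoutParams.TYPE_ACCESSIBILITY_OVERLAY, // overlay type reserved for accessibility services
            WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or WindowManager.LayoutParams.FLAG_NOT_TOUCHABLE,
            PixelFormat.TRANSLUCENT
        ).apply {
            gravity = Gravity.TOP or Gravity.START
            x = bounds.left
            y = bounds.top
        }
        windowManager.addView(badge, params)
    }
    return idToNode
}
```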
Step 204, receiving the user voice sent by the user.
In one embodiment, an artificial intelligence voice service may be invoked to receive the user voice uttered by the user. As an exemplary embodiment, a voice wake-up word detection process may run continuously in the background, and when this process detects that the user voice contains a preset voice wake-up word, it wakes up the artificial intelligence voice service to start receiving the user voice; correspondingly, the artificial intelligence voice service can be turned off when the user utters a preset voice closing word, or when no user voice is received within a preset time period. This allows the user to control the turning on and off of the artificial intelligence voice service entirely by voice without manual operation, further freeing the user's hands.
As another exemplary embodiment, a voice service wake-up control may be displayed in a terminal display interface, and when a user triggers the control, the artificial intelligence voice service is woken up to start receiving user voice sent by the user; accordingly, the artificial intelligence voice service can be turned off when the user triggers the control again. At the moment, the user manually controls the on and off of the artificial intelligence voice service, and the on and off of the artificial intelligence voice service can be controlled more accurately.
And after the artificial intelligence voice service is awakened, receiving user voice input by a user through the artificial intelligence voice service, wherein the user voice contains object information used for indicating a target object in a terminal display interface.
Step 206: identifying the target object corresponding to the user voice among the control objects.
After receiving the user voice, the artificial intelligence voice service can process the user voice to identify the object information indicated by the user voice; further, the target object may be determined from the currently displayed control objects on the terminal display interface according to the object information, or the object information may be sent to the terminal operating system, and the terminal operating system determines the target object from the currently displayed control objects on the terminal display interface according to the object information. The artificial intelligence voice service can also send the user voice to a terminal operating system, the terminal operating system processes the user voice to recognize object information indicated by the user voice, and a target object is determined from a control object currently displayed on a terminal display interface according to the object information.
The target object can be determined from the control objects currently displayed on the terminal display interface in various ways according to the object information corresponding to the user voice. In an embodiment, after each control object in the terminal display interface is determined, the display information corresponding to each control object is further determined; then, when the object information corresponding to the user voice matches the display information of any control object, that control object is determined as the target object.
Taking the terminal display interface shown in fig. 3 as an example, if the user wants to open the "food" control, the object information indicated by the user voice may be "food"; if the user wants to open the link corresponding to the travel promotion picture, the object information indicated by the user voice may be the text "take the family to travel" or "XX grassland" in the picture, or a graphic element in the picture such as "grassland" or "sun". Taking the case where the object information indicated by the user voice is "food", the object information "food" is matched in turn against the display information corresponding to each control object, and when it matches the "food" control in the first row and first column of the control display area, that control is determined as the target object. The process of matching display information in picture form is similar and is not repeated.
In this embodiment, because control objects take different forms, the manner of acquiring their corresponding display information also differs. If the control object is a text object, the display information corresponding to the text object can be acquired by calling the API of the text object; if the control object is a graphic object, a screenshot of the area corresponding to the graphic object can be captured, and the display information corresponding to the graphic object is determined based on the captured image. For the specific API calling manner and screen capture manner, reference may be made to the related art, which is not limited by the present disclosure.
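A hedged sketch of this acquisition step follows; it assumes a full-screen Bitmap screenshot has already been captured elsewhere (for example via AccessibilityService.takeScreenshot on API 30+), and the function and parameter names are illustrative.

```kotlin
import android.graphics.Bitmap
import android.graphics.Rect
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: obtain the display information of a control object. Text objects expose their
// text directly through the node; for graphic objects only the screen region is cropped
// here, and the crop would then be passed to an OCR / image-recognition step to obtain
// comparable display information.
fun displayInfoOf(node: AccessibilityNodeInfo, screenshot: Bitmap?): Any? {
    node.text?.toString()?.takeIf { it.isNotBlank() }?.let { return it }   // text object
    if (screenshot == null) return null
    val bounds = Rect()
    node.getBoundsInScreen(bounds)
    // Clamp to the screenshot size before cropping; a real implementation would also
    // handle partially visible nodes more carefully.
    val left = bounds.left.coerceIn(0, screenshot.width - 1)
    val top = bounds.top.coerceIn(0, screenshot.height - 1)
    val width = bounds.width().coerceAtMost(screenshot.width - left)
    val height = bounds.height().coerceAtMost(screenshot.height - top)
    if (width <= 0 || height <= 0) return null
    return Bitmap.createBitmap(screenshot, left, top, width, height)       // graphic object
}
```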
When the terminal display interface contains a large number of control objects, or the display information corresponding to the control objects is complex, different control objects may have the same display information. In that case, the control object whose display information has the greatest correlation with the object information may be determined as the target object, so that the real target object corresponding to the user operation can still be accurately identified. For example, the display information of the control 17 in fig. 4 is "travel", and the text "take the family to travel, XX grassland welcomes you" contained in the travel promotion picture 24 below also contains "travel". When the user voice contains the word "travel", in order to avoid a possible misjudgment of the target object, the correlation between the user voice and the display information of the control 17, and the correlation between the user voice and the "travel" contained in the travel promotion picture 24, can be calculated respectively, and the control object corresponding to the display information with the greater correlation is determined as the target object.
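As an illustration of this disambiguation (not a prescribed algorithm), the control object can be chosen by scoring how well each object's display text matches the object information recognized from the user voice; the scoring heuristic below is deliberately simple and purely exemplary.

```kotlin
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: pick the control object whose display text correlates best with the spoken
// object information. A real implementation might use a proper text-similarity or
// semantic-relevance measure instead of this containment heuristic.
fun bestMatch(spoken: String, controls: List<AccessibilityNodeInfo>): AccessibilityNodeInfo? {
    fun score(display: String): Int = when {
        display.equals(spoken, ignoreCase = true) -> 3   // exact match, e.g. the "travel" control itself
        display.contains(spoken, ignoreCase = true) -> 2 // spoken phrase contained in longer display text
        spoken.contains(display, ignoreCase = true) -> 1
        else -> 0
    }
    return controls
        .mapNotNull { node -> node.text?.toString()?.let { node to score(it) } }
        .filter { it.second > 0 }
        .maxByOrNull { it.second }
        ?.first
}
```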
In another embodiment, corresponding to the object identifiers shown for the control objects in the terminal display interface, the object information indicated by the user voice may be an object identifier, and accordingly the control object matching that object identifier can be determined as the target object. As shown in the terminal display interface of fig. 4, the user voice uttered by the user may include the object identifier of a control object (i.e., the number shown for that control object in the figure). After receiving the user voice, the artificial intelligence voice service or the terminal operating system can recognize the object identifier contained in the user voice, and then determine, as the target object, the control object whose own object identifier matches the one contained in the user voice. Marking the control objects with different object identifiers gives the object identifier contained in the user voice a definite direction, which improves the accuracy of identifying the target object.
As shown in the terminal display interface shown in fig. 4, the user speech uttered by the user may include "3", and at this time, the search input box control corresponding to the object identifier No. 3 is determined as the target object; similarly, the user voice sent by the user may include "26", and at this time, the "discovery" homepage control corresponding to the object identifier No. 26 is determined as the target object, which is not described again.
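When object identifiers are shown, resolving the target object can be as simple as extracting the number from the recognized text and looking it up in the identifier-to-node map built when the badges were displayed. The sketch below assumes such a map and is only illustrative.

```kotlin
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: resolve "click 3" / "open 26" style commands by their object identifier.
fun resolveByIdentifier(
    recognizedText: String,
    idToNode: Map<Int, AccessibilityNodeInfo>
): AccessibilityNodeInfo? {
    val id = Regex("""\d+""").find(recognizedText)?.value?.toIntOrNull() ?: return null
    return idToNode[id]
}
```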
At step 208, a target operation for the target object is determined.
After the target object corresponding to the user voice is determined, the target operation aiming at the target object needs to be determined. In an embodiment, a predefined default operation may be determined as the target operation. For example, the default operation may be "click" or "open", and accordingly, when the target object corresponding to the user voice is determined, the target object is directly triggered to be clicked or opened.
In another embodiment, a target operation for a target object may be determined from user input information. As an exemplary embodiment, the user input voice as the user input information may include a target touch operation for the target object, and accordingly, the target operation is the target touch operation. As shown in the terminal display interface shown in fig. 3, the user voice uttered by the user may be "click to sweep a sweep" or "long press XX grassland", and the like, and at this time, the target operations for promoting the picture by the "sweep" control and the "XX grassland" are "click" and "long press" touch operations, respectively. As shown in the terminal display interface shown in fig. 4, the user voice uttered by the user may be "double-click 11" or "long-press 16", and the like, and at this time, the target operations for the "take-away" control and the "ticket" control are "double-click" and "long-press" touch operations, respectively.
In addition, the target object is not limited to the single and clear manipulation object, and may be the whole current terminal display interface or a partial region of the terminal display interface. Assuming that the user voice sent by the user for the terminal display interface shown in fig. 4 is "click 13", at this time, the terminal display interface jumps to the order detail interface shown in fig. 5, and fig. 5 is a schematic view of the terminal display interface corresponding to the application-level operation shown in this disclosure according to an exemplary embodiment. Further, for the order detail interface shown in fig. 5, the user may send a user voice including a target touch operation such as "slide up" or "slide down", so as to control the order in the terminal display interface to move up and down for easy viewing. At this time, the target object corresponding to the target touch operation such as "slide up" or "slide down" is not a certain control object (e.g., a certain order) in the current terminal display interface, but is the whole order display area in the terminal display interface.
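A sketch of how such whole-interface commands might be mapped on Android follows; the command strings, the direction convention, and the helper names are assumptions made for the example.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.view.accessibility.AccessibilityNodeInfo

// Sketch: commands that target the interface rather than a single control object can be
// mapped to a scroll action on the scrollable container, or to a global action of the service.
fun performInterfaceCommand(service: AccessibilityService, command: String) {
    val root = service.rootInActiveWindow ?: return
    when (command) {
        "slide up" -> findScrollable(root)?.performAction(AccessibilityNodeInfo.ACTION_SCROLL_FORWARD)
        "slide down" -> findScrollable(root)?.performAction(AccessibilityNodeInfo.ACTION_SCROLL_BACKWARD)
        "return" -> service.performGlobalAction(AccessibilityService.GLOBAL_ACTION_BACK)
    }
}

private fun findScrollable(node: AccessibilityNodeInfo): AccessibilityNodeInfo? {
    if (node.isScrollable) return node
    for (i in 0 until node.childCount) {
        node.getChild(i)?.let { child -> findScrollable(child)?.let { return it } }
    }
    return null
}
```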
The simulated operation of the target object can be realized by determining the touch action related to the target operation corresponding to the user voice and then executing that simulated touch action. In this case, the user can intuitively utter an operation voice describing the touch action on the target object, which better matches the user's daily usage habits (in normal use, the user performs a "click" or "long press" with a finger) and reduces the user's learning cost.
As another exemplary embodiment, the user input voice, as user input information, may include a target function for the target object. As shown in the terminal display interface of fig. 3, the user voice uttered by the user may be "open sweep" or "search XX grassland", in which case the target operations for the "sweep" control and the "XX grassland" travel promotion picture are the "open" and "search" functions, respectively. As shown in the terminal display interface of fig. 4, the user voice uttered by the user may be "open 11" or "map search 14", in which case the target operations for the "take away" control and the "movie" control are the "open" and "search theater on map" functions, respectively.
Similarly, for the order detail interface shown in fig. 5, the user may utter a user voice containing a target function such as "return" or "exit" to control switching of the current display interface. In this case, the target object corresponding to a target function such as "return" or "exit application" is not a single control object in the current terminal display interface, but the order display area in the terminal display interface or the whole terminal display interface. Here the user can utter a user voice containing the target function to be achieved for the target object, so that the target function can still be achieved even when the user does not know which touch action it corresponds to, which to some extent enriches the application scenarios of the disclosed scheme.
After the terminal display interface changes, the object identifier of the control object in the current terminal display interface may be updated, which is shown in fig. 4 and 5 for comparison and is not described again.
In one embodiment, the target operation matching the target touch operation or target function in the user input voice can be looked up in an operation information table pre-stored locally, which speeds up the determination of the target operation. Alternatively, the target touch operation or target function in the user input voice can be sent to a pre-associated device such as a server, and the matching target operation determined by the server by querying the operation information table can then be received from it. This makes full use of the computing capability of the pre-associated device and simplifies the related function programs of the terminal device.
Similarly, in another embodiment, an operation voice feature can be extracted from the user's operation voice through the artificial intelligence voice service, and the target operation matching that operation voice feature can then be looked up in a pre-stored operation voice information table; alternatively, the extracted operation voice feature can be sent to a pre-associated device such as a server, and the matching target operation determined by the server by querying the operation information table can be received from it.
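Such a locally pre-stored operation information table could, as a minimal sketch, simply map recognized operation phrases to internal operation codes; the enum and the entries below are illustrative, not an exhaustive or prescribed set.

```kotlin
// Sketch: a local operation information table mapping recognized phrases to operations.
enum class TargetOperation { CLICK, LONG_PRESS, OPEN, TRANSLATE, MAP_SEARCH }

val operationTable: Map<String, TargetOperation> = mapOf(
    "click" to TargetOperation.CLICK,
    "long press" to TargetOperation.LONG_PRESS,
    "open" to TargetOperation.OPEN,
    "translate" to TargetOperation.TRANSLATE,
    "map search" to TargetOperation.MAP_SEARCH,
)

// Returns the first operation whose key phrase appears in the recognized voice text,
// or null so the caller can fall back to a server-side lookup.
fun lookUpOperation(recognizedText: String): TargetOperation? =
    operationTable.entries.firstOrNull { recognizedText.contains(it.key, ignoreCase = true) }?.value
```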
In one embodiment, the target operations may include both system-level operations and application-level operations. A system-level operation is an operation for a control object that is predefined in the terminal operating system, for example a translation operation for the text information corresponding to a control object, an image search operation for a control object in image form, an e-commerce search operation for a commodity image or link, or a map search operation for a place-name keyword. Such system-level operations are controlled and executed by the terminal operating system; they do not depend on the application program to which the target object in the terminal display interface belongs, i.e. they are independent of the currently displayed application program and can be executed for the target object or its related information without involving that application program. By predefining such system-level operations, the terminal operating system expands the possible ways of operating on control objects, which enriches the types of operations available to the user for a target object and helps simplify the user's operation flow, thereby improving user experience.
The application level operation is predefined operation in an application program to which a target object in a terminal display interface belongs, such as control triggering operation on the target object in a control form, amplification viewing operation on the target object in a picture form, jump operation on the target object in a hyperlink text form, and the like. Such application-level operations are operations predefined by the application to which the target object belongs, such operations being dependent on the application and being operations that can only be controlled to be performed by the application itself.
As described above, the target operation for the target object may be a predefined default operation or an operation determined according to user input information. It will be appreciated that, regardless of the manner in which the target operation is determined, the target operation for the target object that is ultimately determined may be an application-level operation or a system-level operation. After determining the target operation aiming at the target object, judging the type of the target operation: in the case where the target operation is an application level operation, proceed to step 210; otherwise, in the case where the target operation is a system level operation, proceed to step 212.
Step 210, sending a target operation instruction to the application program to which the target object belongs.
At this time, the target operation is determined to be an application-level operation, and therefore, a target operation instruction may be sent to the application program to which the target object belongs, so that the application program performs the target operation on the target object.
In an embodiment, the artificial intelligence voice service may generate a target operation instruction according to the determined target object and target operation and send it to the application program to which the target object belongs. In another embodiment, the terminal operating system may generate the target operation instruction according to the determined target object and target operation and send it to the application program. The target operation instruction may include information such as the position information of the target object, the object identifier of the target object, the target touch operation corresponding to the target operation, the target touch operation corresponding to the target function, and/or the target function.
Still taking fig. 4 as an example, if the target object is the "order" control and the target operation is the "open" target function, an open instruction for the "order" control may be generated and sent to the application program; alternatively, a single-click instruction for the "order" control may be generated and sent to the application program according to the single-click operation corresponding to "open". Similarly, if the target object is the "order" control and the target operation is the "click" target touch operation, a click instruction for the "order" control may be generated and sent to the application program; alternatively, the corresponding "open" function may be determined from the click operation, and an open instruction for the "order" control generated and sent to the application program. In other words, when the target operation is an application-level operation, the target operation instruction generated by the artificial intelligence voice service or the terminal operating system and sent to the application program may be an instruction in target-touch-operation form or an instruction in target-function form, which is not limited by the present disclosure. In either case, the terminal display interface after the application program executes the target operation is as shown in fig. 5.
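As a rough illustration of these two instruction forms, the instruction could be modelled as a small structure carrying either a touch operation or a target function; the field names below are assumptions made for illustration and are not the disclosure's actual message format.

    // Minimal sketch (assumption): a possible shape for the target operation instruction;
    // the field names are illustrative, not the disclosure's actual message format.
    import android.graphics.Rect

    data class TargetOperationInstruction(
        val objectId: String,               // object identifier of the target object
        val bounds: Rect? = null,           // position information of the target object
        val touchOperation: String? = null, // instruction in target-touch-operation form, e.g. "single_click"
        val targetFunction: String? = null  // instruction in target-function form, e.g. "open"
    )

    // The two equivalent instructions of the fig. 4 example; either form may be sent.
    val clickForm = TargetOperationInstruction(objectId = "order", touchOperation = "single_click")
    val openForm = TargetOperationInstruction(objectId = "order", targetFunction = "open")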
Step 212, the target operation is performed for the target object.
At this time, it is determined that the target operation is a system level operation, and therefore, the preset function of the terminal operating system may be called or a preset application installed on the terminal device may be called, so that the preset function or the preset application performs the target operation on the target object.
In an embodiment, the target operation is executed for the target object by calling a preset function of the terminal operating system. The preset function may have various forms such as content recognition, text extraction, text translation, search, and the like.
Take text translation as an example of the preset function, and assume that the touch operation corresponding to the predefined text translation in the terminal operating system is a long press. Still taking fig. 4 as an example, if the user voice input by the user is "long press 22", the text translation function preset by the terminal operating system is invoked to translate the target object 22 in text form, and the terminal display interface is shown in fig. 6 (a), where fig. 6 is a schematic diagram of a terminal display interface corresponding to a system-level operation according to an exemplary embodiment of the present disclosure.
As can be seen from fig. 6 (a), the object identifiers of the control objects in the translation frame displayed in an overlapping manner are numbered cumulatively on the basis of the object identifiers of the control objects in the underlying display interface (the original terminal display interface). Of course, the visible control objects in the current terminal display interface after the translation frame is displayed in an overlapping manner may also be renumbered, with the effect shown in fig. 6 (b).
In another embodiment, the target operation is executed for the target object by calling up a preset application program installed on the terminal device. The preset application program may be a third-party application program or an application program integrated with the terminal operating system, and it may be preset by the terminal operating system or customized by the user, which is not limited by the present disclosure.
Take a map application as an example of the preset application program, and assume that the touch operation corresponding to the predefined map search in the terminal operating system is a long press. Still taking fig. 4 as an example, if the user voice input by the user is "long press 14", the preset map application is called up to search for "movie theaters" on the map, and the terminal display interface corresponding to the search result is shown in fig. 7, where fig. 7 is another schematic diagram of a terminal display interface corresponding to a system-level operation according to an exemplary embodiment of the present disclosure.
As can be seen from the foregoing embodiments, when the target operation involves a touch action for the target object, that is, when the user input voice includes a target touch operation for the target object, both step 210 and step 212 essentially perform a simulated touch action for the target object to complete the target operation. In effect, steps 210 and 212 have the application program and the terminal operating system (or its predefined preset function or preset application program), respectively, perform the target operation for the target object, finally achieving the terminal display effects shown in figs. 5-7.
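On Android, one plausible way to realize such a simulated touch action is to dispatch a synthetic stroke at the centre of the target object's display area through AccessibilityService.dispatchGesture (available from API level 24). The sketch below assumes this mechanism; the press durations are illustrative values rather than parameters given by the disclosure.

    // Minimal sketch (assumption): simulating a tap or long press on the target object via
    // AccessibilityService.dispatchGesture (Android API 24+); durations are illustrative.
    import android.accessibilityservice.AccessibilityService
    import android.accessibilityservice.GestureDescription
    import android.graphics.Path
    import android.graphics.Rect

    fun AccessibilityService.simulateTouch(targetBounds: Rect, longPress: Boolean = false) {
        // Press at the centre of the target object's display area.
        val path = Path().apply { moveTo(targetBounds.exactCenterX(), targetBounds.exactCenterY()) }
        val durationMs = if (longPress) 600L else 50L
        val gesture = GestureDescription.Builder()
            .addStroke(GestureDescription.StrokeDescription(path, 0L, durationMs))
            .build()
        dispatchGesture(gesture, null, null)
    }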
It should be noted that although steps 206 to 212 are independent steps, in order to describe the display effect of the terminal display interface corresponding to each step intuitively, the terminal display interfaces corresponding to some embodiments have been described above directly alongside the relevant step, rather than being placed entirely after steps 210 and 212.
Fig. 8 is a flowchart illustrating yet another terminal manipulation method according to an exemplary embodiment of the present disclosure; the method is applied to the terminal equipment and can comprise the following steps:
Step 802: calling an interface monitoring service provided by the terminal operating system, and identifying the control objects in the terminal display interface.
The detailed process of step 802 is not substantially different from that of step 202, and reference may be made to step 202, which is not described herein again.
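Assuming the interface monitoring service is the Android AccessibilityService mentioned elsewhere in this disclosure, step 802 can be pictured roughly as walking the accessibility node tree of the active window; the ControlObject type and the labelling rule in the sketch below are illustrative assumptions only.

    // Rough sketch (assumption): the interface monitoring service as an Android
    // AccessibilityService; ControlObject and the labelling rule are illustrative only.
    import android.accessibilityservice.AccessibilityService
    import android.graphics.Rect
    import android.view.accessibility.AccessibilityEvent
    import android.view.accessibility.AccessibilityNodeInfo

    data class ControlObject(val node: AccessibilityNodeInfo, val label: String, val bounds: Rect)

    class InterfaceMonitorService : AccessibilityService() {

        // Walk the current window tree and collect the visible control objects.
        fun collectControlObjects(): List<ControlObject> {
            val result = mutableListOf<ControlObject>()
            fun walk(node: AccessibilityNodeInfo?) {
                if (node == null || !node.isVisibleToUser) return
                val label = (node.text ?: node.contentDescription)?.toString()
                if (!label.isNullOrBlank()) {
                    val bounds = Rect().also { node.getBoundsInScreen(it) }
                    result.add(ControlObject(node, label, bounds))
                }
                for (i in 0 until node.childCount) walk(node.getChild(i))
            }
            walk(rootInActiveWindow)
            return result
        }

        override fun onAccessibilityEvent(event: AccessibilityEvent) { /* refresh on window changes */ }
        override fun onInterrupt() {}
    }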
Step 804: receiving a user operation gesture made by the user.
In this embodiment, the user input information received by the terminal device may be a user operation gesture made by the user. The user operation gesture may include position information and/or time information corresponding to the user operation gesture, where the position information may be a position coordinate and/or a touch trajectory corresponding to the user operation gesture. The specific process of the terminal device receiving the user operation gesture made by the user can refer to the content disclosed in the related art, and the disclosure does not limit this.
It can be understood that, after receiving any gesture, the terminal device may make a preliminary judgment on the gesture to filter out false "gestures" caused by user misoperation or non-user operation; only real user operations that pass the preliminary judgment are taken as the user operation gestures referred to in the present disclosure. This reduces invalid responses to false gestures and saves the computing resources of the terminal device.
Step 806: identifying, among the control objects, a target object corresponding to the user operation gesture.
After the user operation gesture is received, the position information corresponding to the user operation gesture is extracted, and then a control object matching that position information is determined from the control objects and taken as the target object.
In an embodiment, the position coordinate corresponding to the user operation gesture is determined from the extracted position information, a matching operation is then performed between this position coordinate and the display areas of the control objects identified in advance, and the control object whose display area matches the position coordinate corresponding to the user operation gesture is determined as the target object. As an exemplary embodiment, the position coordinate corresponding to the user operation gesture may be a point coordinate; in this case, the control object whose display area has a minimum straight-line distance to the position coordinate smaller than a first distance threshold may be determined as the target object; the control object whose display-area edge has a minimum distance to the starting point of the operation position corresponding to the user operation gesture smaller than a second distance threshold may also be determined as the target object; or the control object whose display-area centre point has a minimum distance to the starting point of the operation position corresponding to the user operation gesture smaller than a third distance threshold may be determined as the target object, and so on.
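A minimal sketch of this point-to-area matching is given below; the ScreenControl type and the threshold defaults are assumptions made for illustration, not concrete parameters of the disclosure.

    // Illustrative sketch (assumption): matching a tap point to a control object's display
    // area; ScreenControl and the threshold defaults are not the disclosure's parameters.
    import android.graphics.Rect
    import kotlin.math.hypot
    import kotlin.math.max

    data class ScreenControl(val id: String, val bounds: Rect)

    fun matchTarget(
        objects: List<ScreenControl>,
        x: Float,
        y: Float,
        firstThreshold: Float = 0f,   // max straight-line distance from the point to the display area
        thirdThreshold: Float = 48f   // max distance from the point to the display-area centre
    ): ScreenControl? {
        // Rule 1: the point lies inside, or within firstThreshold of, some display area.
        val byArea = objects.firstOrNull { o ->
            val dx = max(0f, max(o.bounds.left - x, x - o.bounds.right))
            val dy = max(0f, max(o.bounds.top - y, y - o.bounds.bottom))
            hypot(dx, dy) <= firstThreshold
        }
        if (byArea != null) return byArea
        // Rule 3: otherwise pick the object whose centre point is closest, if close enough.
        return objects
            .minByOrNull { o -> hypot(o.bounds.exactCenterX() - x, o.bounds.exactCenterY() - y) }
            ?.takeIf { o -> hypot(o.bounds.exactCenterX() - x, o.bounds.exactCenterY() - y) <= thirdThreshold }
    }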
Fig. 9 is a schematic diagram of a terminal display interface corresponding to a user operation gesture according to an exemplary embodiment of the present disclosure. Referring to the terminal display interface shown in fig. 9, suppose the user operation gesture made by the user is a single click, and the centre point of the corresponding click position falls exactly within the display area corresponding to the text "take family to travel to XX grassland to welcome you"; it can then be determined that the target object corresponding to the single-click user operation gesture is the text-form control object "take family to travel to XX grassland to welcome you". Similarly, the target objects corresponding to various other user operation gestures such as double-click and long press can be determined in this way.
As an exemplary embodiment, the position coordinate corresponding to the user operation gesture may be a trajectory coordinate; in this case, the control object corresponding to the start-point coordinate or the end-point coordinate of the trajectory may be determined as the target object, or the control object corresponding to the centre point of the area enclosed by the trajectory may be determined as the target object, and so on.
Fig. 10 is a schematic diagram of another terminal display interface corresponding to a user operation gesture according to an exemplary embodiment of the present disclosure. Referring to the terminal display interface shown in fig. 10, suppose the user operation gesture made by the user is a drag, and the starting point of the corresponding drag trajectory falls exactly within the display area corresponding to the text "take family to travel to XX grassland to welcome you"; it can then be determined that the target object corresponding to the drag user operation gesture is the text-form control object "take family to travel to XX grassland to welcome you". Similarly, the target objects corresponding to various other user operation gestures such as sliding up, sliding down and irregular screen capture can be determined correspondingly, and are not repeated here.
As shown in fig. 10, the control object corresponding to the end point of the drag trajectory may also be determined as the called object used to achieve the target operation for the target object, such as the "map" and "translate" controls in the figure. When the user drags the target object onto "map", a preset map application program or a preset map function plug-in may be called up to search for the position of "XX grassland"; the corresponding terminal display interface is similar to fig. 7 and is not repeated here. When the user drags the target object onto "translate", a preset translation function plug-in or a preset translation application program may be called up to translate the text "take family to travel to XX grassland to welcome you"; the corresponding terminal display interface is similar to fig. 6 and is not repeated here.
Step 808: identifying a target operation for the target object.
After the target object is identified, the position coordinate, touch trajectory and/or time information corresponding to the user operation gesture may be extracted, and the operation type matching the position coordinate, touch trajectory and/or time information is then determined as the target operation for the target object.
In an embodiment, the time information corresponding to the user operation gesture may be calculated from the press and lift actions corresponding to the user operation gesture. For example, for a double-click action, the time interval between two adjacent press actions may be calculated; for a long-press action, the time interval from press to lift may be calculated; corresponding time information can likewise be calculated for other specific user operation gestures, which is not repeated here.
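A minimal sketch of such timing-based classification follows; the 300 ms double-click interval and the 500 ms long-press threshold are assumed defaults, not values given by the disclosure.

    // Illustrative sketch (assumption): classifying a gesture from press/lift timestamps;
    // the 300 ms and 500 ms thresholds are assumed defaults, not the disclosure's values.
    sealed class GestureType {
        object SingleClick : GestureType()
        object DoubleClick : GestureType()
        object LongPress : GestureType()
    }

    fun classifyGesture(downTimesMs: List<Long>, upTimesMs: List<Long>): GestureType = when {
        // Double click: two press actions whose interval is below the double-click threshold.
        downTimesMs.size >= 2 && downTimesMs[1] - downTimesMs[0] <= 300 -> GestureType.DoubleClick
        // Long press: press-to-lift interval above the long-press threshold.
        downTimesMs.isNotEmpty() && upTimesMs.isNotEmpty() &&
            upTimesMs[0] - downTimesMs[0] >= 500 -> GestureType.LongPress
        else -> GestureType.SingleClick
    }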
In an embodiment, when the terminal operating system is the Android operating system, a corresponding trigger-function entry may be added to the control objects in advance by modifying the framework code of the control objects, and trigger logic for different user operation gestures may be predefined in the touch function of the control objects, so that after the position coordinate, touch trajectory and/or time information corresponding to the user operation gesture is extracted, the target operation corresponding to the user operation gesture is determined based on this information and the predefined trigger logic.
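Since the framework-level modification described above cannot be reproduced here, the sketch below only approximates the idea with a custom view's touch handler that records press position and duration for the predefined trigger logic; the class and callback names are hypothetical.

    // Rough approximation (assumption): predefined trigger logic in a control's touch handling;
    // the real mechanism is a framework-level hook, so this custom View merely sketches the idea.
    import android.content.Context
    import android.view.MotionEvent
    import android.view.View

    class MonitoredControl(context: Context) : View(context) {

        private var downTime = 0L

        override fun onTouchEvent(event: MotionEvent): Boolean {
            when (event.actionMasked) {
                MotionEvent.ACTION_DOWN -> downTime = event.eventTime           // press: record time
                MotionEvent.ACTION_UP -> onGesture(event.x, event.y, event.eventTime - downTime)
            }
            return super.onTouchEvent(event)
        }

        // Predefined trigger logic: branch on position and press duration.
        private fun onGesture(x: Float, y: Float, pressDurationMs: Long) {
            if (pressDurationMs >= 500) { /* treat as long press */ } else { /* treat as click */ }
        }
    }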
In an embodiment, after the user operation gesture is received, a pre-installed gesture analysis plug-in may be used to analyze the user operation gesture to obtain the corresponding position coordinate, touch trajectory and/or time information, thereby speeding up the determination of the target operation. The user operation gesture may also be sent to a pre-associated device such as a server, which analyzes the user operation gesture to obtain the corresponding position coordinate, touch trajectory and/or time information; the terminal then receives the information returned by the associated device, thereby making full use of the computing capability of the associated device.
As described above, the target operation for the target object may be a predefined default operation or an operation determined according to the user input information. It will be appreciated that, regardless of how the target operation is determined, the finally determined target operation for the target object may be either an application-level operation or a system-level operation. After the target operation for the target object is determined, its type is judged: if the target operation is an application-level operation, proceed to step 810; if the target operation is a system-level operation, proceed to step 812.
Step 810, sending a target operation instruction to the application program to which the target object belongs.
At this time, the target operation is determined to be an application-level operation, and therefore, a target operation instruction may be sent to the application program to which the target object belongs, so that the application program performs the target operation on the target object. The target operation instruction may include position information of a target object, a touch trajectory corresponding to a user operation gesture, target operation corresponding to the user operation gesture, and the like.
Still taking fig. 9 as an example, if the target object is the "order" control, the user operation gesture is "click" and the target operation is "open", a click instruction for the "order" control may be sent to the application program, or an open instruction for the "order" control may be sent to the application program. Similarly, taking fig. 10 as an example, if the target object is the text-form control object "take family to travel to XX grassland to welcome you", the user operation gesture is "drag" and the target operation is "translate", a translation-function calling instruction may be sent to that control object so that it calls a translation function to translate the text; or a translation instruction containing the text information "take family to travel to XX grassland to welcome you" may be sent to the "translate" control so that this control performs the translation of the text.
Step 812, a target operation is performed for the target object.
At this time, it is determined that the target operation is a system-level operation; therefore, the target operation may be performed on the target object by calling a preset function of the terminal operating system or calling up a preset application installed on the terminal device. The preset function may have various forms, such as content recognition, text extraction, text translation, search, and the like. The preset application program may be a third-party application program or an application program integrated with the terminal operating system, and it may be preset by the terminal operating system or customized by the user, which is not limited by the present disclosure.
Take text translation as an example of the preset function, and assume that the touch operation corresponding to the predefined text translation in the terminal operating system is a long press. Still taking fig. 9 as an example, if the user operation gesture is a "long press", the text translation function preset by the terminal operating system is invoked to translate the text information "take family to travel to XX grassland to welcome you". Taking fig. 10 as an example, if the user operation gesture "drags" the target object onto a preset map search application installed on the terminal device, the map search application is called up to search for "XX grassland"; if the user operation gesture drags the target object onto a preset translation function of the terminal device, the translation function is invoked to translate the text information "take family to travel to XX grassland to welcome you".
It should be noted that although steps 806 to 812 are independent steps, in order to describe the display effect of the terminal display interface corresponding to each step intuitively, the terminal display interfaces corresponding to some embodiments have been described above directly alongside the relevant step, rather than being placed entirely after steps 810 and 812.
While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or a combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders or concurrently.
Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.
Corresponding to the foregoing embodiments of the terminal control method, the present disclosure also provides embodiments of a terminal control apparatus and of a corresponding terminal.
Referring to figs. 11-19, figs. 11-19 are block diagrams of terminal control apparatuses according to exemplary embodiments of the present disclosure. The apparatus may include:
an object identification unit 1101, configured to identify an operation object included in a terminal display interface by invoking an interface monitoring service provided by a terminal operating system;
an object determining unit 1102, configured to determine a target object from the control object according to the received user input information;
an operation executing unit 1103, configured to trigger execution of a target operation for the target object.
Optionally, when the terminal operating system is the Android operating system, the interface monitoring service includes an AccessibilityService.
Optionally, referring to fig. 12, the user input information includes a user voice; the object determination unit includes:
an information identification subunit 1102A, configured to process the user voice through an artificial intelligence voice service to identify object information indicated by the user voice;
an object determining subunit 1102B, configured to determine a target object from the control object according to the object information.
Alternatively, referring to fig. 13, when the target operation involves a touch action with respect to the target object, the operation execution unit includes:
a touch simulation subunit 1103A, configured to execute a simulated touch action for the target object to complete the target operation.
Optionally, referring to fig. 14, the object determination subunit 1102B includes:
the information acquisition module 1102B-1 is configured to acquire display information corresponding to the control object in the terminal display interface;
an object determining module 1102B-2, configured to determine any manipulation object as the target object if the object information matches display information of the manipulation object.
Optionally, referring to fig. 15, the information obtaining module 1102B-1 includes:
the interface calling submodule 1102B-1A is used for calling an application program interface of the text object to acquire display information corresponding to the text object when the control object is the text object;
the region screen capture sub-module 1102B-1B is configured to, when the control object is a graphical object, capture a corresponding region of the graphical object, and determine display information corresponding to the graphical object based on an image obtained by the screen capture.
Optionally, as shown in fig. 16, the apparatus further includes: an identifier display unit 1104, configured to display object identifiers corresponding to the respective control objects in the terminal display interface;
the object determination subunit 1102B further includes:
and an identification determining module 1102B-1, configured to determine the manipulation object matching the object identification as the target object.
Optionally, referring to fig. 17, the user input information includes a user operation gesture, and the object determination unit 1102 further includes:
the information determining subunit 1102C is configured to determine position information corresponding to the user operation gesture;
a position determining subunit 1102D, configured to determine, as the target object, a manipulation object in the manipulation objects, which matches the position information.
Optionally, referring to fig. 18, the target operation is a predefined default operation; or,
the device further comprises: an operation determining unit 1105 configured to determine the target operation for the target object according to the user input information.
Optionally, referring to fig. 19, the operation executing unit 1103 further includes:
a call evoking subunit 1103B, configured to, in a case that the target operation belongs to a predefined system-level operation in the terminal operating system, invoke a preset function of the terminal operating system or evoke a preset application installed on a terminal device to perform the target operation on the target object;
an instruction sending subunit 1103C, configured to, if the target operation belongs to a predefined application-level operation in the application to which the target object belongs, send a target operation instruction for the target object to the application, so that the application executes the target operation according to the target operation instruction.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
Accordingly, in one aspect, an embodiment of the present disclosure provides a terminal control device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to:
identifying a control object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system;
determining a target object from the control object according to the received user input information;
and triggering to execute the target operation aiming at the target object.
Fig. 20 is a block diagram illustrating an apparatus for terminal manipulation according to an exemplary embodiment of the present disclosure. For example, the apparatus 2000 may be a user device, which may be embodied as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, a wearable device such as a smart watch, smart glasses, a smart bracelet, a smart running shoe, and the like.
Referring to fig. 20, the apparatus 2000 may include one or more of the following components: a processing component 2002, a memory 2004, a power component 2006, a multimedia component 2008, an audio component 2010, an input/output (I/O) interface 2012, a sensor component 2014, and a communication component 2016.
The processing component 2002 generally controls the overall operation of the device 2000, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 2002 may include one or more processors 2014 to execute instructions to perform all or portions of the steps of the methods described above. Further, the processing component 2002 can include one or more modules that facilitate interaction between the processing component 2002 and other components. For example, the processing component 2002 may include a multimedia module to facilitate interaction between the multimedia component 2008 and the processing component 2002.
The memory 2004 is configured to store various types of data to support operation at the device 2000. Examples of such data include instructions for any application or method operating on device 2000, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 2004 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 2006 provides power to the various components of the device 2000. The power supply components 2006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 2000.
The multimedia component 2008 includes a front camera and/or a rear camera. When the device 2000 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data.
Audio component 2010 is configured to output and/or input audio signals. For example, audio component 2010 includes a Microphone (MIC) configured to receive external audio signals when apparatus 2000 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 2004 or transmitted via the communication component 2016. In some embodiments, audio assembly 2010 also includes a speaker for outputting audio signals.
The I/O interface 2012 provides an interface between the processing component 2002 and peripheral interface modules, which can be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 2014 includes one or more sensors for providing various aspects of state assessment for the device 2000. For example, sensor assembly 2014 may detect an open/closed state of device 2000, a relative positioning of components, such as a display and keypad of apparatus 2000, a change in position of apparatus 2000 or a component of apparatus 2000, the presence or absence of user contact with apparatus 2000, an orientation or acceleration/deceleration of apparatus 2000, and a change in temperature of apparatus 2000. The sensor assembly 2014 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor assembly 2014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 2014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 2016 may also include a Near Field Communication (NFC) module to facilitate short-range communication. In one exemplary embodiment, the communication component 2016 may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 2000 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above-described methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium, such as the memory 2004, including instructions that, when executed by the processor 2014 of the apparatus 2000, enable the apparatus 2000 to perform a terminal control method, the method including:
identifying a control object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system;
determining a target object from the control object according to the received user input information;
and triggering to execute the target operation aiming at the target object.
The non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
As shown in fig. 21, fig. 21 is a block diagram of another apparatus for terminal manipulation according to an exemplary embodiment of the present disclosure. For example, the apparatus 2100 may be provided as an application server. Referring to fig. 21, the apparatus 2100 includes a processing component 2122 that further includes one or more processors and memory resources, represented by memory 2116, for storing instructions, e.g., applications, that are executable by the processing component 2122. The application programs stored in the memory 2116 may include one or more modules each corresponding to a set of instructions. Further, the processing component 2122 is configured to execute instructions to perform the above-described terminal steering method.
The apparatus 2100 may also include a power component 2126 configured to perform power management of the apparatus 2100, a wired or wireless network interface 2150 configured to connect the apparatus 2100 to a network, and an input/output (I/O) interface 2158. The apparatus 2100 may operate based on an operating system stored in the memory 2116, such as Android, iOS, Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 2116 comprising instructions, executable by the processing component 2122 of the apparatus 2100 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Wherein the instructions in the memory 2116, when executed by the processing component 2122, enable the apparatus 2100 to perform a terminal control method comprising:
identifying a control object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system;
determining a target object from the control object according to the received user input information;
and triggering to execute the target operation aiming at the target object.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (22)

1. A terminal control method is characterized by comprising the following steps:
identifying a control object contained in a terminal display interface by calling an interface monitoring service provided by a terminal operating system;
determining a target object from the control object according to the received user input information;
and triggering to execute the target operation aiming at the target object.
2. The method according to claim 1, wherein when the terminal operating system is Android operating system, the interface monitoring service comprises an accessibility service.
3. The method of claim 1, wherein the user input information comprises user speech; the determining a target object from the control object according to the received user input information includes:
processing the user voice through an artificial intelligence voice service so as to identify object information indicated by the user voice;
and determining a target object from the control object according to the object information.
4. The method of claim 3, wherein when the target operation involves a touch action for the target object, the triggering execution of the target operation for the target object comprises:
and executing the simulated touch action aiming at the target object to finish the target operation.
5. The method of claim 3, wherein the determining a target object from the manipulated objects according to the object information comprises:
acquiring corresponding display information of a control object in the terminal display interface;
and determining any one of the control objects as the target object when the object information matches the display information of that control object.
6. The method according to claim 5, wherein the obtaining of the corresponding display information of the control object in the terminal display interface comprises:
if the control object is a text object, acquiring display information corresponding to the text object by calling an application program interface of the text object;
and if the control object is a graphic object, the corresponding area of the graphic object is captured, and the display information corresponding to the graphic object is determined based on the image obtained by capturing the screen.
7. The method of claim 3,
further comprising: displaying object identifications corresponding to the control objects in the terminal display interface;
the determining a target object from the control object according to the object information includes:
and determining the control object matched with the object identification as the target object.
8. The method of claim 1, wherein the user input information comprises a user manipulation gesture, and wherein determining a target object from the manipulation objects based on the received user input information comprises:
determining position information corresponding to the user operation gesture;
and determining a control object matched with the position information in the control objects as the target object.
9. The method of claim 1,
the target operation is a predefined default operation; or,
the method further comprises the following steps: determining the target operation for the target object according to the user input information.
10. The method of claim 9, wherein the triggering execution of the target operation for the target object comprises:
under the condition that the target operation belongs to predefined system-level operation in the terminal operating system, executing the target operation aiming at the target object by calling a preset function of the terminal operating system or calling a preset application program installed on terminal equipment;
and if the target operation belongs to a predefined application-level operation in the application program to which the target object belongs, sending a target operation instruction for the target object to the application program so as to enable the application program to execute the target operation according to the target operation instruction.
11. A terminal operating device, comprising:
the object identification unit is used for identifying an operation object contained in a terminal display interface by calling interface monitoring service provided by a terminal operating system;
the object determining unit is used for determining a target object from the control object according to the received user input information;
and the operation execution unit is used for triggering and executing the target operation aiming at the target object.
12. The apparatus of claim 11, wherein the interface monitoring service comprises an accessibility service when the terminal operating system is an Android operating system.
13. The apparatus of claim 11, wherein the user input information comprises user speech; the object determination unit includes:
the information identification subunit is used for processing the user voice through an artificial intelligence voice service so as to identify the object information indicated by the user voice;
and the object determining subunit is used for determining a target object from the control object according to the object information.
14. The apparatus according to claim 13, wherein when the target operation involves a touch action with respect to the target object, the operation execution unit includes:
and the touch simulation subunit is used for executing a simulated touch action aiming at the target object so as to complete the target operation.
15. The apparatus of claim 13, wherein the object determination subunit comprises:
the information acquisition module is used for acquiring corresponding display information of the control object in the terminal display interface;
and the object determining module is used for determining any control object as the target object under the condition that the object information is matched with the display information of any control object.
16. The apparatus of claim 15, wherein the information obtaining module comprises:
the interface calling submodule is used for obtaining display information corresponding to the text object by calling an application program interface of the text object when the control object is the text object;
and the region screen capturing sub-module is used for capturing a corresponding region of the graphic object when the control object is the graphic object, and determining display information corresponding to the graphic object based on an image obtained by screen capturing.
17. The apparatus of claim 13,
further comprising: the identification display unit is used for displaying object identifications corresponding to the control objects in the terminal display interface;
the object determination subunit further includes:
and the identification determining module is used for determining the control object matched with the object identification as the target object.
18. The apparatus of claim 11, wherein the user input information comprises a user operation gesture, the object determination unit further comprising:
the information determining subunit is used for determining the position information corresponding to the user operation gesture;
and the position determining subunit is used for determining a control object matched with the position information in the control objects as the target object.
19. The apparatus of claim 11,
the target operation is a predefined default operation; or,
the device further comprises: an operation determination unit configured to determine the target operation for the target object according to the user input information.
20. The apparatus of claim 19, wherein the operation performing unit further comprises:
the calling and calling subunit is used for calling a preset function of the terminal operating system or calling a preset application program installed on the terminal equipment to execute the target operation aiming at the target object under the condition that the target operation belongs to predefined system-level operation in the terminal operating system;
and the instruction sending subunit is used for sending a target operation instruction for the target object to the application program to enable the application program to execute the target operation according to the target operation instruction if the target operation belongs to a predefined application-level operation in the application program to which the target object belongs.
21. A non-transitory computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method of any of claims 1-10.
22. A terminal operating device, comprising:
a processor;
a memory for storing processor-executable instructions;
the processor is used for executing the executable instructions to realize the terminal control method according to any one of claims 1-10.
CN202010345593.0A 2020-04-27 2020-04-27 Terminal control method and device Pending CN111506245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010345593.0A CN111506245A (en) 2020-04-27 2020-04-27 Terminal control method and device

Publications (1)

Publication Number Publication Date
CN111506245A true CN111506245A (en) 2020-08-07

Family

ID=71869498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010345593.0A Pending CN111506245A (en) 2020-04-27 2020-04-27 Terminal control method and device

Country Status (1)

Country Link
CN (1) CN111506245A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019426A (en) * 2011-09-28 2013-04-03 腾讯科技(深圳)有限公司 Interacting method and interacting device in touch terminal
CN104038631A (en) * 2014-06-11 2014-09-10 华为技术有限公司 Communication initiating method and terminal device
CN106843669A (en) * 2016-12-06 2017-06-13 北京小度信息科技有限公司 Application interface operating method and device
CN109739424A (en) * 2018-04-04 2019-05-10 北京字节跳动网络技术有限公司 Operating method and device and touch control terminal applied to touch control terminal
CN111061452A (en) * 2019-12-17 2020-04-24 北京小米智能科技有限公司 Voice control method and device of user interface

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968637A (en) * 2020-08-11 2020-11-20 北京小米移动软件有限公司 Operation mode control method and device of terminal equipment, terminal equipment and medium
CN111968639A (en) * 2020-08-14 2020-11-20 北京小米松果电子有限公司 Voice control method and device, electronic equipment and storage medium
CN111968640A (en) * 2020-08-17 2020-11-20 北京小米松果电子有限公司 Voice control method and device, electronic equipment and storage medium
CN114764363A (en) * 2020-12-31 2022-07-19 上海擎感智能科技有限公司 Prompting method, prompting device and computer storage medium
CN114764363B (en) * 2020-12-31 2023-11-24 上海擎感智能科技有限公司 Prompting method, prompting device and computer storage medium

Similar Documents

Publication Publication Date Title
US10956706B2 (en) Collecting fingreprints
CN111506245A (en) Terminal control method and device
CN109062479B (en) Split screen application switching method and device, storage medium and electronic equipment
EP3285185A1 (en) Information retrieval method and apparatus, electronic device and server, computer program and recording medium
CN108664303B (en) Webpage content display method and device
US20200007944A1 (en) Method and apparatus for displaying interactive attributes during multimedia playback
CN111901896A (en) Information sharing method, information sharing device, electronic equipment and storage medium
CN108984089B (en) Touch operation method and device, storage medium and electronic equipment
CN112905103B (en) False touch processing method and device and storage medium
CN112882779A (en) Lock screen display control method and device, mobile terminal and storage medium
CN114205447B (en) Shortcut setting method and device of electronic equipment, storage medium and electronic equipment
CN108803892B (en) Method and device for calling third party application program in input method
CN110851745B (en) Information processing method, information processing device, storage medium and electronic equipment
CN107544740B (en) Application processing method and device, storage medium and electronic equipment
CN112381091A (en) Video content identification method and device, electronic equipment and storage medium
CN112286611A (en) Icon display method and device and electronic equipment
CN112181351A (en) Voice input method and device and electronic equipment
CN110580486B (en) Data processing method, device, electronic equipment and readable medium
US20230034462A1 (en) Display control method and apparatus for virtual item, and display method and apparatus for virtual item
CN115016710B (en) Application program recommendation method
CN112667852B (en) Video-based searching method and device, electronic equipment and storage medium
CN107340881B (en) Input method and electronic equipment
CN111201512A (en) Method and equipment for scrolling and displaying notification message
CN113253884A (en) Touch method, touch device and electronic equipment
CN107643821B (en) Input control method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination