CN108605074B - Method and equipment for triggering voice function - Google Patents

Method and equipment for triggering voice function

Info

Publication number
CN108605074B
Authority
CN
China
Prior art keywords
interface content
contact
terminal equipment
voice
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780004960.7A
Other languages
Chinese (zh)
Other versions
CN108605074A (en)
Inventor
王培�
何小文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN108605074A publication Critical patent/CN108605074A/en
Application granted granted Critical
Publication of CN108605074B publication Critical patent/CN108605074B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/725 Cordless telephones

Abstract

Embodiments of the present invention relate to a method and a device for triggering a voice function. The method includes: a terminal device obtains interface content presented on an interface of the terminal device; and if it is detected that the working mode of the terminal device is switched to a first mode, the terminal device converts the interface content into voice information and outputs the voice, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled. Embodiments of the present invention can improve operation efficiency, reduce the need for users to learn and memorize a large number of gesture operations, and improve the user experience.

Description

Method and equipment for triggering voice function
The present application claims priority to the Chinese patent application with application number 201710061841.7, filed with the Chinese Patent Office on January 26, 2017 and entitled "Method and apparatus for triggering an automatic voice function in headset mode", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for triggering a voice function.
Background
At present, when a user wants to acquire content displayed on a User Interface (UI) of a terminal, the user can only browse the UI visually. This single acquisition mode is limited: when it is inconvenient for the user to operate the UI, for example to page up and down, the user cannot quickly browse the content presented by the UI and can only view the currently displayed screen.
In the prior art, the content displayed on the UI can be viewed and operated through multi-touch, keys, or gesture recognition. In the multi-touch mode, the user has to interact with the interface continuously to keep viewing the content. With keys and gestures, each gesture can complete only one specific operation, while browsing the UI content involves more than one such operation, so the user needs to learn and memorize a large number of gesture-to-operation mappings. In summary, these approaches have low operation efficiency, increase the user's learning and memory burden, and degrade the user experience.
Disclosure of Invention
Embodiments of the present invention relate to a method and a device for triggering a voice function, which solve the problems that the operation efficiency of a user browsing the UI is low and that the user has to learn and memorize the operations corresponding to a large number of gestures in order to browse the UI.
In a first aspect, an embodiment of the present invention provides a method for triggering a voice function. The method includes: a terminal device obtains interface content presented on an interface of the terminal device; and if it is detected that the working mode of the terminal device is switched to a first mode, the terminal device converts the interface content into voice information and outputs the voice, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled.
According to the embodiment of the present invention, by detecting the working mode of the terminal device, the voice function of the terminal device can be triggered when the working mode is switched to the first mode, and the interface content of the terminal device is converted into voice information and output. This improves operation efficiency, reduces the need for the user to learn and memorize the operations corresponding to a large number of gestures, and improves the user experience.
In a possible embodiment, after the step of obtaining the interface content presented on the interface of the terminal device, the method further includes: the terminal device determines the type of the interface element in the interface content.
In one possible embodiment, the types of interface elements in the interface content include: one or more of text information, picture information, contact names, phone numbers, and contact avatars.
In one possible embodiment, the type of interface element in the interface content includes picture information including one or more of text information, contact name, phone number, and contact avatar.
In one possible embodiment, the interface content includes text information, and converting the interface content into voice information for output includes: converting the text information into voice information and outputting the voice.
In one possible embodiment, the interface content includes one or more of a contact name, a phone number, and a contact image, and converting the interface content into voice information for output includes: the terminal device dials the contact associated with one or more of the contact name, the phone number, and the contact image.
In a possible embodiment, converting the interface content into voice information for voice output further includes: detecting whether the interface content is in the system language of the terminal device; and if the interface content is not in the system language of the terminal device, converting the interface content into target language information, or converting the interface content into language information corresponding to the user's requirement.
In a second aspect, an embodiment of the present invention provides a terminal device. The terminal device includes: an acquisition unit, configured to acquire interface content presented on an interface of the terminal device; a detection unit, configured to detect the working mode of the terminal device; an execution unit, configured to convert the interface content of the terminal device into voice information if the detection unit detects that the working mode of the terminal device is switched to a first mode; and an output unit, configured to output the voice. The first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled.
According to the terminal device provided by the embodiment of the invention, when the detection unit detects that the working mode of the terminal device is switched to the first mode, the execution unit converts the interface content of the terminal device into the voice information, and then the output unit outputs the voice information. The embodiment of the invention can improve the operation efficiency, reduce the operation of learning and memorizing a large number of gestures by the user and improve the experience of the user.
In one possible embodiment, the terminal device further includes a processing unit, configured to determine the type of the interface elements in the interface content.
In one possible embodiment, the types of interface elements in the interface content include: one or more of text information, picture information, contact names, phone numbers, and contact avatars.
In one possible embodiment, the type of interface element in the interface content includes picture information including one or more of text information, contact name, phone number, and contact avatar.
In one possible embodiment, if the interface content includes text information, converting the interface content of the terminal device into voice information for output includes: converting the text information into voice information and outputting the voice.
In one possible embodiment, if the interface content includes one or more of a contact name, a phone number, and a contact image, converting the interface content into voice information for output includes: dialing the contact associated with one or more of the contact name, the phone number, and the contact image.
In a possible embodiment, converting the interface content into voice information for voice output further includes: detecting whether the interface content is in the system language of the terminal device; and if the interface content is not in the system language of the terminal device, converting the interface content of the terminal device into target language information, or converting the interface content into language information corresponding to the user's requirement.
In a possible embodiment, the terminal device further includes an input unit, and the input unit is configured to receive a voice input of the user during a voice interaction between the terminal device and the user.
In a third aspect, an embodiment of the present invention provides another terminal device. The terminal device includes: a memory for storing program instructions; and a processor for performing the following operations according to the program instructions stored in the memory: acquiring interface content presented on an interface of the terminal device; and if it is detected that the working mode of the terminal device is switched to a first mode, converting the interface content into voice information for voice output, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled.
In one possible embodiment, the processor is further configured to perform the following operations according to program instructions stored in the memory: after the step of obtaining the interface content presented on the interface of the terminal device, determining the type of the interface element in the interface content.
In one possible embodiment, the types of interface elements in the interface content include: one or more of text information, picture information, contact names, phone numbers, and contact avatars.
In one possible embodiment, the types of interface elements in the interface content include: picture information including one or more of text information, contact names, phone numbers, and contact avatars.
In one possible embodiment, if the interface content includes text information, the processor is configured to perform the following operations according to the program instructions stored in the memory: converting the interface content into voice information for output, including converting the text information into voice information and outputting the voice.
In one possible embodiment, if the interface content includes one or more of a contact name, a phone number, and a contact image, the processor is configured to perform the following operations according to the program instructions stored in the memory: converting the interface content into voice information for output, including dialing the contact associated with one or more of the contact name, the phone number, and the contact image.
In one possible embodiment, the processor is further configured to perform the following operations according to the program instructions stored in the memory: before converting the interface content into voice information and outputting the voice, detecting whether the interface content is in the system language of the terminal device; and if the interface content is not in the system language of the terminal device, converting the interface content into target language information, or converting the interface content into language information corresponding to the user's requirement.
Based on the above technical solutions, in the method and device for triggering a voice function provided by the embodiments of the present invention, when the working mode of the terminal device is switched to the first mode, the interface content is converted into voice information and output, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled. Embodiments of the present invention can improve operation efficiency, reduce the need for the user to learn and memorize a large number of gesture operations, and improve the user experience.
Drawings
Fig. 1 is a schematic structural diagram of a terminal device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for triggering a voice function according to an embodiment of the present invention;
Fig. 3 shows a possible implementation of the method for triggering a voice function according to an embodiment of the present invention;
Figs. 4a-4d show another possible implementation of the method for triggering a voice function according to an embodiment of the present invention;
Figs. 5a-5c show yet another possible implementation of the method for triggering a voice function according to an embodiment of the present invention;
Figs. 6a-6e show yet another possible implementation of the method for triggering a voice function according to an embodiment of the present invention;
Fig. 7 shows a possible implementation of converting interface content into voice information according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another terminal device according to an embodiment of the present invention.
Detailed Description
Fig. 1 is a schematic structural diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 1, the terminal device may include: processor 180, memory 120, Radio Frequency (RF) circuitry 110, and peripheral systems 170. These components may communicate over one or more communication buses 210.
The peripheral system 170 is mainly used to implement the interactive functions between the terminal device and the user/external environment, and includes the input and output devices of the terminal device. In some embodiments, the peripheral system 170 may include: other device controllers 171, a sensor controller 172, and a display controller 173. Each controller may be coupled to a respective peripheral device (e.g., other input devices 130, sensors 150, display screen 140). It should be noted that the peripheral system 170 may also include other I/O peripherals.
The display screen 140 may be used to display information input by the user or to present information to the user, for example various menus of the terminal device and the interfaces of running applications, such as buttons (Button), text input boxes (Text), sliders (Scroll Bar), menus (Menu), and so on. The display screen 140 may include a display panel 141 and a touch panel 142. Optionally, the display panel 141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. Further, the touch panel 142 can cover the display panel 141; when the touch panel 142 detects a touch operation on or near it, the operation is transmitted to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event. In Fig. 1, the touch panel 142 and the display panel 141 are shown as two independent components implementing the input and output functions of the terminal device, but in some embodiments the touch panel 142 and the display panel 141 may be integrated to implement the input and output functions of the terminal device.
Radio Frequency (RF) circuitry 110 is used to receive and transmit radio frequency signals and mainly integrates the receiver and transmitter of the terminal device. The Radio Frequency (RF) circuitry 110 communicates with communication networks and other communication devices via radio frequency signals. In some embodiments, the Radio Frequency (RF) circuitry 110 may include, but is not limited to: an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chip, a SIM card, a storage medium, and the like. In some embodiments, the Radio Frequency (RF) circuitry 110 may be implemented on a separate chip. In general, wireless transmission, such as Bluetooth transmission, Wireless Fidelity (Wi-Fi) transmission, third-generation mobile communication technology (3G) transmission, fourth-generation mobile communication technology (4G) transmission, and the like, may be performed through the radio frequency circuitry 110.
The audio circuit 160 is used for one-way audio playback, such as MP3 audio streams, and for bi-directional voice transmission over a network. The audio circuitry 160 may include a speaker 161 and a microphone 162.
The memory 120 is coupled to the processor 180 and is used for storing various software programs and/or sets of instructions. In some embodiments, the memory 120 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 120 may store an operating system (hereinafter referred to simply as a system), such as an embedded operating system like ANDROID, IOS, WINDOWS, or LINUX. The memory 120 may also store a network communication program that may be used to communicate with one or more additional devices, one or more terminal devices, or one or more network devices. The memory 120 may further store a user interface program, which may vividly display the content of an application program through a graphical operation interface and receive control operations on the application program from the user through input controls such as menus, dialog boxes, and buttons.
It should be understood that the terminal device is only one example provided by the embodiments of the present invention, and the terminal device may have more or less components than those shown, may combine two or more components, or may have a different configuration implementation of the components.
The above is a schematic structural diagram of a typical terminal device. Of course, components may be added or removed for different device forms; for example, there may be no audio circuit, speaker, microphone, RF circuit, or other input devices, or a WIFI circuit, a Bluetooth circuit, an infrared circuit, and the like may be added.
Fig. 2 is a flowchart of a method for triggering a voice function according to an embodiment of the present invention. As shown in fig. 2, the method for triggering the voice function may include the steps of:
step 201: the terminal equipment acquires interface content presented on an interface of the terminal equipment.
Specifically, the terminal device may obtain the interface content presented on the interface of the terminal device through the system software of the terminal device, where the system software manages the various independent hardware components in the terminal device so that they can work in coordination. For ease of description, the interface content presented on the interface of the terminal device is hereinafter simply referred to as the interface content of the terminal device.
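The patent only states that this is done through the system software; as one hedged illustration of what such collection could look like on Android, an accessibility service can walk the node tree of the active window and gather its visible text. ContentCollector is an invented name for this sketch, and a real deployment would additionally need the service declared in the manifest and enabled by the user.

```java
import android.accessibilityservice.AccessibilityService;
import android.view.accessibility.AccessibilityEvent;
import android.view.accessibility.AccessibilityNodeInfo;

public class ContentCollector extends AccessibilityService {

    /** Concatenates the visible text of the interface currently presented on the screen. */
    public String currentInterfaceText() {
        StringBuilder out = new StringBuilder();
        collect(getRootInActiveWindow(), out);
        return out.toString();
    }

    private void collect(AccessibilityNodeInfo node, StringBuilder out) {
        if (node == null) {
            return;
        }
        if (node.getText() != null) {
            out.append(node.getText()).append(' ');   // keep the text of this interface element
        }
        for (int i = 0; i < node.getChildCount(); i++) {
            collect(node.getChild(i), out);           // recurse into child elements
        }
    }

    @Override
    public void onAccessibilityEvent(AccessibilityEvent event) { /* not needed for this sketch */ }

    @Override
    public void onInterrupt() { }
}
```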
In one possible embodiment, after step 201, the method for triggering the voice function further comprises: the type of the interface element in the interface content of the terminal equipment is determined.
Specifically, the terminal device determines a format of an interface element in the interface content, and further determines a type of the interface element. For example, the format of the interface element includes a format of a text (. TXT) and a format of a picture (. JPG), and after determining the format of the interface element in the interface content, the terminal device may determine that the type of the interface element in the interface content corresponds to text information and picture information.
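As a non-authoritative illustration of this format-based type determination, the sketch below maps raw element identifiers to types by their format suffix. ElementType, InterfaceElement, and ElementClassifier are names invented for this example only and do not come from the patent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element types corresponding to the interface element types named below.
enum ElementType { TEXT, PICTURE, CONTACT_NAME, PHONE_NUMBER, CONTACT_AVATAR }

final class InterfaceElement {
    final String rawName;    // e.g. "note.txt", "photo.jpg"
    final ElementType type;
    InterfaceElement(String rawName, ElementType type) {
        this.rawName = rawName;
        this.type = type;
    }
}

final class ElementClassifier {
    /** Determines an element's type from its format, as described above (.TXT -> text, .JPG -> picture). */
    static ElementType classify(String rawName) {
        String lower = rawName.toLowerCase();
        if (lower.endsWith(".txt")) {
            return ElementType.TEXT;
        } else if (lower.endsWith(".jpg") || lower.endsWith(".png")) {
            return ElementType.PICTURE;
        } else if (lower.matches("[+0-9\\- ]{5,}")) {
            return ElementType.PHONE_NUMBER;   // digit pattern treated as a phone number
        }
        return ElementType.CONTACT_NAME;       // fallback for this sketch only
    }

    static List<InterfaceElement> classifyAll(List<String> rawNames) {
        List<InterfaceElement> elements = new ArrayList<>();
        for (String name : rawNames) {
            elements.add(new InterfaceElement(name, classify(name)));
        }
        return elements;
    }
}
```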
In one possible embodiment, the types of interface elements in the interface content of the terminal device include one or more of text information, picture information, contact names, phone numbers, and contact avatars.
In one possible embodiment, the type of the interface element in the interface content of the terminal device includes picture information including one or more of a contact name, a phone number, and a contact avatar.
Step 202: if it is detected that the working mode of the terminal device is switched to a first mode, the terminal device converts the interface content of the terminal device into voice information and outputs the voice, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled.
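The patent does not specify an implementation of this step; as a rough, non-authoritative illustration, the Android-flavored sketch below listens for the headset-plug broadcast (the switch to the first mode) and reads the current interface content aloud with the platform text-to-speech engine. InterfaceReader.currentText() is a hypothetical placeholder for obtaining the interface content of step 201, and initialization and error handling are omitted.

```java
import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.content.IntentFilter;
import android.speech.tts.TextToSpeech;

public class HeadsetVoiceTrigger extends BroadcastReceiver {
    private final TextToSpeech tts;

    public HeadsetVoiceTrigger(Context context) {
        // Prepare the text-to-speech engine and listen for wired-headset plug events.
        tts = new TextToSpeech(context.getApplicationContext(), status -> { /* engine ready */ });
        context.registerReceiver(this, new IntentFilter(Intent.ACTION_HEADSET_PLUG));
    }

    @Override
    public void onReceive(Context context, Intent intent) {
        // "state" == 1 means a headset was plugged in, i.e. the device enters the first mode.
        if (intent.getIntExtra("state", 0) == 1) {
            String interfaceContent = InterfaceReader.currentText(context); // hypothetical helper
            tts.speak(interfaceContent, TextToSpeech.QUEUE_FLUSH, null, "interface-readout");
        }
    }
}
```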
In one possible embodiment, the interface content of the terminal device includes, but is not limited to, text information, picture information, contact names, phone numbers, and contact avatars.
In a possible embodiment, the interface content of the terminal device includes text information, and if it is detected that the operating mode of the terminal device is switched to the first mode, the terminal device converts the text information into voice information, and performs voice broadcast on the interface content of the terminal device through the terminal device. In fig. 3, text information is taken as an example. The text information may include, but is not limited to, a text interface, a picture interface displaying text.
In a possible embodiment, the interface content of the terminal device includes picture information, and if it is detected that the operating mode of the terminal device is switched to the first mode, the terminal device converts the picture information into voice information, and performs voice broadcast on the interface content of the terminal device through the terminal device.
In one possible embodiment, the interface content of the terminal device may include one or more of a contact name, a phone number, and a contact avatar.
In one possible embodiment, the interface content of the terminal device may include a contact name, or a phone number, or a contact avatar. If the terminal equipment is detected to be switched to the first mode, the terminal equipment carries out dialing operation according to a contact person name, a telephone number or a contact person associated with a contact person head portrait on the user interface. Contact 1 is only an example in fig. 4 a. And if the terminal equipment is detected to be switched to the first mode, the terminal equipment carries out dialing operation on the contact person associated with the contact person 1.
It should be noted that when the interface content of the terminal device includes a contact avatar, the terminal device can likewise dial the contact associated with the contact avatar in the above manner.
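For illustration only, the dialing operation itself could be issued on Android roughly as follows once a number has been resolved from the contact name, phone number, or contact avatar; the Dialer class is an assumed helper, and placing the call requires the CALL_PHONE permission.

```java
import android.content.Context;
import android.content.Intent;
import android.net.Uri;

final class Dialer {
    /** Places a call to the resolved number (requires the CALL_PHONE permission). */
    static void dial(Context context, String phoneNumber) {
        Intent call = new Intent(Intent.ACTION_CALL, Uri.parse("tel:" + phoneNumber));
        call.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
        context.startActivity(call);
    }
}
```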
In one possible embodiment, the user interface of the terminal device comprises a contact name, the contact corresponding to at least two telephone numbers; or a contact photo, the contact photo corresponding to at least two phone numbers. For convenience of description, two phone numbers corresponding to one contact are taken as an example for illustration. And if the terminal equipment is detected to be switched to the first mode, the terminal equipment selects one of the two telephone numbers corresponding to one contact person to carry out dialing operation.
The terminal device selecting one of two telephone numbers corresponding to one contact person to perform dialing operation may include the following several ways:
The first way: the selection may be set by the manufacturer before the terminal device leaves the factory, or set by the user afterwards. For example, the first-ranked object is used by default for the dialing operation. In fig. 4b, the phone numbers corresponding to contact 1 are phone number 1 (xxxx-xxxxxxxx) and phone number 2 (xxx-xxxx-xxxx). Phone number 1 is ranked first, so the terminal device dials the contact associated with phone number 1. A default setting fits the normal operation of the terminal device and is relatively simple to operate.
The second way: through keys on the headset connected to the terminal device. For example, one of the at least two phone numbers corresponding to a contact name can be selected through a volume key on the earphone for the dialing operation: the volume-down key steps through the phone numbers in order from first to last, and the volume-up key steps through them in order from last to first. In fig. 4b, phone number 1 is selected for dialing via the volume-up key of the headset.
The third way: the terminal device selects one of the at least two phone numbers corresponding to a contact by asking the user. In fig. 4b, the terminal device learns by asking the user that the user wants to call phone number 1 corresponding to contact 1, and the terminal device dials phone number 1. Voice interaction with the user emphasizes the user's involvement and improves the user experience. The three ways are summarized in the sketch below.
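Purely as an illustrative sketch with invented helper types (SelectionMode, VoiceAgent, NumberSelector), the three selection ways can be expressed as follows; none of these names come from the patent.

```java
import java.util.List;

enum SelectionMode { DEFAULT_FIRST, VOLUME_KEY, VOICE_QUERY }

interface VoiceAgent {
    /** Asks the user over the headset and returns the index of the chosen number. */
    int askWhichNumber(List<String> numbers);
}

final class NumberSelector {
    static String select(List<String> numbers, SelectionMode mode,
                         int volumeKeySteps, VoiceAgent agent) {
        switch (mode) {
            case DEFAULT_FIRST:
                return numbers.get(0);                                 // first-ranked number by default
            case VOLUME_KEY:
                return numbers.get(volumeKeySteps % numbers.size());   // stepped through with the volume keys
            case VOICE_QUERY:
                return numbers.get(agent.askWhichNumber(numbers));     // chosen through a voice dialog
            default:
                throw new IllegalArgumentException("unknown selection mode");
        }
    }
}
```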
It should be noted that when the interface content of the terminal device includes a contact photo corresponding to at least two phone numbers, one of the phone numbers corresponding to the contact photo can likewise be dialed in the above manner.
In one possible embodiment, the interface content of the terminal device may include at least two contact names, each contact name corresponding to a telephone number; or, at least two telephone numbers; or at least two contact head portraits, wherein each contact head portrait corresponds to one telephone number. For convenience of description, three contact names are illustrated as an example. And if the terminal equipment is detected to be switched to the first mode, the terminal equipment selects a telephone number corresponding to one of the three contact names to carry out dialing operation.
The terminal device selecting a telephone number corresponding to one of the three contact names to perform dialing operation may include the following several ways:
The first way: the selection may be set by the manufacturer before the terminal device leaves the factory, or set by the user afterwards. For example, the contact associated with the first-ranked contact name is dialed by default. In fig. 4c, the first-ranked contact name is contact 1, so the terminal device dials the contact associated with contact 1. A default setting fits the normal operation of the terminal device and is relatively simple to operate.
The second way: through keys on the headset connected to the terminal device. For example, one of the three contacts may be selected for the dialing operation through a volume key on the headset: the volume-down key steps through contact 1, contact 2, and contact 3 in order from first to last, and the volume-up key steps through contact 3, contact 2, and contact 1 in order from last to first. Suppose the user selects contact 1 through the volume key; the terminal device then dials the contact associated with contact 1.
The third way: voice interaction is carried out between the earphone and the user, and the terminal device selects one of the three contacts to dial by asking the user. In fig. 4c, the terminal device receives, through voice interaction with the user, an instruction that the user wants to dial contact 1, and the terminal device dials the contact associated with contact 1. Voice interaction with the user emphasizes the user's involvement and improves the user experience.
It should be noted that when the interface content of the terminal device includes at least two contact avatars, the terminal device can likewise select the contact associated with one of the contact avatars for the dialing operation in the above manner.
In one possible embodiment, the interface content of the terminal device includes at least two contacts, each contact corresponding to at least two phone numbers; or at least two contact avatars, each contact avatar corresponding to at least two phone numbers. For convenience of description, two contacts, each corresponding to two phone numbers, are taken as an example. If it is detected that the terminal device is switched to the first mode, the terminal device selects one phone number corresponding to one of the two contacts for the dialing operation.
The terminal device selecting a phone number corresponding to one of the two contacts to perform dialing operation may include the following several ways:
The first way: the selection may be set by the manufacturer before the terminal device leaves the factory, or set by the user afterwards. For example, the first-ranked object is selected by default for the dialing operation. In fig. 4d, the phone numbers corresponding to contact 1 are phone number 1 (xxxx-xxxxxxxx) and phone number 2 (xxx-xxxx-xxxx), and the phone numbers corresponding to contact 2 are phone number 3 (xxxx-xxxxxxxx) and phone number 4 (xxx-xxxx-xxxx). Phone number 1 corresponding to contact 1 is ranked first, so the terminal device dials phone number 1. A default setting fits the normal operation of the terminal device and is relatively simple to operate.
The second way: through keys on the headset connected to the terminal device. For example, the object may be selected for the dialing operation through the volume keys on the headset: the volume-down key steps through the objects in order from first to last, and the volume-up key steps through them in order from last to first. In fig. 4d, suppose the user selects phone number 1 corresponding to contact 1 through the volume key; the terminal device then dials phone number 1.
The third way: the terminal device carries out voice interaction with the user through the earphone and selects an object to dial by asking the user. In fig. 4d, through interaction with the user, it is learned that the user wants to call phone number 1 corresponding to contact 1, and the terminal device dials phone number 1 after receiving the user's instruction. Voice interaction with the user emphasizes the user's involvement and improves the user experience.
It should be noted that when the interface content of the terminal device includes at least two contact avatars, each corresponding to at least two phone numbers, one phone number corresponding to one of the contact avatars can likewise be dialed in the above manner.
In one possible embodiment, the interface content of the terminal device comprises a contact name and a telephone number; or, a phone number and a contact avatar; or, contact name and contact avatar; or contact name, phone number, and contact avatar. And each contact person head portrait corresponds to one telephone number. For convenience of description, a contact name and a contact avatar are illustrated as an example. And if the terminal equipment is detected to be switched to the first mode, the terminal equipment selects the contact person associated with the contact person name or the contact person head portrait to carry out dialing operation.
The terminal device selecting the contact associated with the contact name or the contact avatar for the dialing operation may include the following several ways. The first way: the selection may be set by the manufacturer before the terminal device leaves the factory, or set by the user afterwards. For example, the first-ranked object is dialed by default. In fig. 5a, the contact name is ranked first, so the terminal device dials the contact associated with the contact name.
The second way: through keys on the headset connected to the terminal device. For example, the volume-down key steps through the objects in order from first to last, and the volume-up key steps through them in order from last to first. In fig. 5a, assuming the contact name is selected through a volume key on the earphone, the terminal device dials the contact associated with the contact name.
The third way: the terminal device carries out voice interaction with the user through the earphone and selects the object to dial by asking the user. In fig. 5b, by asking the user by voice, the terminal device learns that the user wants to talk to contact 1, and the terminal device dials the contact associated with contact 1. Voice interaction with the user emphasizes the user's involvement and improves the user experience.
In one possible embodiment, the interface content of the terminal device includes a contact name and a phone number; or a phone number and a contact avatar; or a contact name and a contact avatar; or a contact name, a phone number, and a contact avatar, where each contact name and each contact avatar corresponds to at least two phone numbers. For convenience of description, an example is described in which the interface content includes a contact name and a contact avatar, each corresponding to two phone numbers. If it is detected that the terminal device is switched to the first mode, the terminal device selects one phone number corresponding to the contact name or the contact avatar for the dialing operation.
The terminal device selecting a phone number corresponding to the contact name or the contact icon to perform dialing operation may include the following several ways:
The first way: the selection may be set by the manufacturer before the terminal device leaves the factory, or set by the user afterwards. For example, the first-ranked object is used by default for the dialing operation. In fig. 5c, the phone numbers corresponding to contact 1 are phone number 1 (xxxx-xxxxxxxx) and phone number 2 (xxx-xxxx-xxxx), and the phone numbers corresponding to the contact avatar are phone number 3 (xxxx-xxxxxxxx) and phone number 4 (xxx-xxxx-xxxx). Phone number 1 corresponding to contact 1 is ranked first, so the terminal device dials phone number 1. A default setting fits the normal operation of the terminal device and is relatively simple to operate.
The second way: through keys on the headset connected to the terminal device. For example, the object may be selected for the dialing operation through the volume keys on the headset: the volume-down key steps through the objects in order from first to last, and the volume-up key steps through them in order from last to first. In fig. 5c, suppose the user selects phone number 1 corresponding to contact 1 through the volume key; the terminal device then dials phone number 1.
The third way: the terminal device carries out voice interaction with the user through the earphone and selects an object to dial by asking the user. In fig. 5c, through interaction with the user, it is learned that the user wants to call phone number 1 corresponding to contact 1, and the terminal device dials phone number 1 after receiving the user's instruction. Voice interaction with the user emphasizes the user's involvement and improves the user experience.
In one possible embodiment, the interface content of the terminal device comprises text information, picture information, contact names or phone numbers or contact avatars.
In one possible embodiment, the interface content of the terminal device includes picture information and one or more of a contact name, a phone number, and a contact avatar.
In one possible embodiment, the interface content of the terminal device includes text information and one or more of a contact name, a phone number, and a contact avatar, as shown in fig. 6a. The text information is converted into speech by default. During the voice readout, the user can be queried by voice when a contact name is encountered, or after the interface content has been read the user can be asked by voice whether to perform a dialing operation; if the user wants to talk to the contact associated with the contact name, the terminal device dials the contact associated with that contact name.
In one possible embodiment, the interface content of the terminal device includes text information and contact names, and each contact name corresponds to a telephone number; or, text messages and telephone numbers; or, the text information and the contact head portrait, wherein each contact head portrait corresponds to one telephone number. In fig. 6b, text information and contact names are illustrated as an example. If the terminal equipment is detected to be switched into the first mode, the terminal equipment can carry out voice interaction with a user, and when the user gives an instruction of converting text information into voice, the terminal equipment converts the text information into voice; or, when the user gives an instruction for dialing, the terminal device performs dialing operation according to the contact associated with the contact name.
In one possible embodiment, the interface content of the terminal device comprises text information and contact names, wherein each contact name corresponds to at least two telephone numbers; or, text messages and telephone numbers; or the text information and the contact head portrait, wherein each contact head portrait corresponds to at least two telephone numbers. In fig. 6c, the text information and contact 1, contact 1 corresponding to phone number 1(xxxx-xxxxxxxx) and phone number 2(xxx-xxxx-xxxx) are illustrated as an example. If the terminal equipment is detected to be switched into the first mode, the terminal equipment can carry out voice interaction with a user, and when the user gives an instruction of converting text information into voice, the terminal equipment converts the text information into voice; or, when the user gives an instruction of dialing the telephone number 1 corresponding to the contact 1, the terminal device performs dialing operation according to the telephone number 1.
In one possible embodiment, the interface content of the terminal device includes text information, contact names and telephone numbers, wherein each contact name corresponds to one telephone number; or, the text information, the telephone number and the contact person head portrait, wherein each contact person head portrait corresponds to one telephone number; or, the text information, the contact names and the contact head portraits, wherein each contact name and each contact head portraits respectively correspond to one telephone number; or the text information, the contact names, the telephone numbers and the contact head images, wherein each contact name and each contact head image respectively correspond to one telephone number. In fig. 6d, text information, contact names, phone numbers and contact avatars are illustrated as an example. If the terminal equipment is detected to be switched into the first mode, the terminal equipment can carry out voice interaction with a user, and when the user gives an instruction of converting text information into voice, the terminal equipment converts the text information into voice; or, when the user gives an instruction of dialing the telephone number, the terminal equipment performs dialing operation according to the contact person associated with the telephone number.
In one possible embodiment, the interface content of the terminal device includes text information, contact names and telephone numbers, wherein each contact name corresponds to at least two telephone numbers; or, the text information, the telephone numbers and the contact person head portraits, wherein each contact person head portraits corresponds to at least two telephone numbers; or, the text information, the contact names and the contact head portraits, wherein each contact name and each contact head portraits respectively correspond to at least two telephone numbers; or the text information, the contact names, the telephone numbers and the contact head portraits, wherein each contact name and each contact head portraits respectively correspond to at least two telephone numbers. In fig. 6e, an example is illustrated of text information, contact 1, a phone number and a contact avatar, contact 1 corresponding to phone number 1(xxxx-xxxxxxxx) and phone number 2 (xxx-xxxx-xxxx). If the terminal equipment is detected to be switched into the first mode, the terminal equipment can carry out voice interaction with a user, and when the user gives an instruction of converting text information into voice, the terminal equipment converts the text information into voice; or, when the user gives an instruction of dialing the telephone number 1 corresponding to the contact 1, the terminal device performs dialing operation according to the telephone number 1.
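As a hedged sketch of this voice-driven branching between reading the text aloud and dialing, using invented abstractions (UserIntent, VoiceDialog, TtsReader, PhoneDialer) rather than any concrete API, the decision described in the embodiments above could look like this:

```java
enum UserIntent { READ_TEXT, DIAL }

interface VoiceDialog {
    /** Asks the user over the headset whether to read the text or dial the contact. */
    UserIntent askReadOrDial(String contactName);
}

interface TtsReader {
    void speak(String text);
}

interface PhoneDialer {
    void dial(String phoneNumber);
}

final class MixedContentHandler {
    static void handle(String text, String contactName, String phoneNumber,
                       VoiceDialog dialog, TtsReader reader, PhoneDialer dialer) {
        switch (dialog.askReadOrDial(contactName)) {
            case READ_TEXT:
                reader.speak(text);          // convert the text information into voice
                break;
            case DIAL:
                dialer.dial(phoneNumber);    // dial the number associated with the contact
                break;
        }
    }
}
```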
In one possible embodiment, the interface content of the terminal device includes picture information including one or more of text information, contact names, phone numbers, and contact avatars. And if the terminal equipment is detected to be switched to the first mode, the terminal equipment converts the text information in the picture information into voice information, and the terminal equipment carries out dialing operation according to the contact person related to the contact person name, the telephone number and the contact person head portrait.
In one possible embodiment, the picture information includes text information, and a contact name or phone number or contact avatar.
In one possible embodiment, the picture information includes a contact name, a phone number, or a contact avatar.
It should be noted that when one phone number must be selected for dialing from at least two contact names (each corresponding to one or more phone numbers), from at least two contact avatars (each corresponding to one or more phone numbers), or from at least two phone numbers, the selection can be made through the default setting of the terminal device system, through a volume key of the earphone, or through voice interaction. For the specific implementation process, refer to the corresponding embodiments above.
In the embodiment of the present invention, the contact name may include a contact stored in the terminal device, and may also include, but is not limited to, an account contact of an instant messaging service or the like that is registered with, or bound to, a phone number. The contact photo may include the avatar set by the user for a contact stored in the terminal device, and may also include, but is not limited to, the avatar of an account contact of an instant messaging service or the like that is registered with, or bound to, a phone number. Instant Messaging (IM) refers to a service capable of sending and receiving internet messages and the like.
In the embodiment of the present invention, the first mode refers to a working mode in which the terminal device is connected to an earphone or the earphone function inside the terminal device is enabled.
In a possible embodiment, before the terminal device converts the interface content of the terminal device into voice information for voice output, the method further includes: the terminal device detects whether the interface content is in the system language of the terminal device; and if the interface content is not in the system language of the terminal device, the terminal device converts the interface content into target language information, or converts the interface content into language information corresponding to the user's requirement.
Specifically, as shown in fig. 7, if the interface content of the terminal device is not in the system language of the terminal device, the terminal device converts the interface content into the system language of the terminal device according to the default setting of the terminal device; or the terminal device communicates with the user: if the user issues a translation instruction and gives a target language, the terminal device converts the interface content into the target language; if the user issues a translation instruction but does not give a target language, the terminal device converts the interface content into the system language of the terminal device; and if the user does not issue a translation instruction, the interface content of the terminal device is directly converted into voice information.
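A minimal sketch of the language check in fig. 7, assuming hypothetical LanguageDetector and Translator helpers rather than any specific library, might look as follows; the prepared text would then be handed to the text-to-speech step.

```java
interface LanguageDetector {
    String detect(String text);                 // e.g. returns "zh-CN" or "en-US"
}

interface Translator {
    String translate(String text, String targetLanguage);
}

final class SpeechPreparer {
    /**
     * Returns the text to be spoken: unchanged if it already matches the system language,
     * otherwise translated to the user's requested language or, failing that, to the system language.
     */
    static String prepare(String interfaceContent, String systemLanguage,
                          String requestedLanguage,   // null if the user gave no target language
                          LanguageDetector detector, Translator translator) {
        String contentLanguage = detector.detect(interfaceContent);
        if (contentLanguage.equals(systemLanguage)) {
            return interfaceContent;                               // no conversion needed
        }
        String target = (requestedLanguage != null) ? requestedLanguage : systemLanguage;
        return translator.translate(interfaceContent, target);    // convert before text-to-speech
    }
}
```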
It should be noted that if the language of the interface content of the terminal device is consistent with the system language of the terminal device, for example if the system language of the terminal device and the language of the text information are both Simplified Chinese, the terminal device does not need to perform the conversion operation and directly converts the interface content of the terminal device into voice information.
In the embodiment of the present invention, the language information may refer to a voice, to a type of language such as English or Chinese, or to a combination of both.
Step 203: and the terminal equipment executes corresponding operation according to the interface content of the terminal equipment.
In a possible embodiment, when it is detected that the working mode of the terminal device is switched to the first mode, the terminal device detects the interface content of the terminal device and converts the interface content of the terminal device into voice information for voice output, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled. That is, the operation of detecting the interface of the terminal device is performed only when the switch to the first mode is detected, which, compared with the foregoing method, saves power consumption of the terminal device.
According to the embodiment of the present invention, by detecting the working mode of the terminal device, the voice function of the terminal device can be triggered when the working mode is switched to the first mode, and the interface content of the terminal device is converted into voice information and output. This improves operation efficiency, reduces the need for the user to learn and memorize the operations corresponding to a large number of gestures, and improves the user experience.
Fig. 8 is a schematic structural diagram of another terminal device according to an embodiment of the present invention. As shown in fig. 8, the terminal device includes: an acquisition unit 810, a detection unit 820, an execution unit 830, an output unit 840 and a processing unit 850.
It will be appreciated by those skilled in the art that fig. 8 merely shows a simplified design of the structure of the terminal device. The terminal structure shown in fig. 8 does not constitute a limitation of the terminal, and the terminal device may include more or less components than those shown in fig. 8, for example, the terminal device may further include a storage unit for storing instructions corresponding to a communication algorithm.
In fig. 8, the acquisition unit 810 is configured to acquire interface content presented on an interface of the terminal device; the detection unit 820 is configured to detect the working mode of the terminal device; if the detection unit 820 detects that the working mode of the terminal device is switched to the first mode, the execution unit 830 converts the interface content of the terminal device into voice information, and the output unit 840 then outputs the voice. The first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled.
In the terminal device provided by the embodiment of the present invention, when the detection unit 820 detects that the working mode of the terminal device is switched to the first mode, the execution unit 830 converts the interface content of the terminal device into the voice information, and the output unit 840 outputs the voice information. The embodiment of the invention can improve the operation efficiency, reduce the operation of learning and memorizing a large number of gestures by the user and improve the experience of the user.
In one possible embodiment, the processing unit 850 is configured to determine the type of the interface elements in the interface content.
In one possible embodiment, the types of interface elements in the interface content include: one or more of text information, picture information, contact names, phone numbers, and contact avatars.
In one possible embodiment, the type of interface element in the interface content includes picture information including one or more of text information, contact name, phone number, and contact avatar.
In one possible embodiment, if the interface content of the terminal device includes text information, converting the interface content of the terminal device into voice information for output includes: converting the text information into voice information and outputting the voice.
In one possible embodiment, if the interface content of the terminal device includes one or more of a contact name, a phone number, and a contact image, converting the interface content of the terminal device into voice information for output includes: dialing the contact associated with one or more of the contact name, the phone number, and the contact image.
In a possible embodiment, converting the interface content into voice information for voice output further includes: detecting whether the interface content is in the system language of the terminal device; and if the interface content is not in the system language of the terminal device, converting the interface content of the terminal device into target language information, or converting the interface content into language information corresponding to the user's requirement.
In a possible embodiment, the terminal device further comprises an input unit 860, said input unit 860 being configured to receive a voice input of the user during a voice interaction of the terminal device with the user.
According to the method and the device for triggering a voice function provided by the embodiment of the present invention, when the working mode of the terminal device is switched to the first mode, the interface content of the terminal device is converted into voice information for voice output, where the first mode is a working mode in which the terminal device is connected to an earphone or the earphone function of the terminal device is enabled. The embodiment of the present invention can improve operation efficiency, reduce the need for the user to learn and memorize a large number of gesture operations, and improve the user experience.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be understood by those of ordinary skill in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by a program, and the program may be stored in a computer-readable storage medium, where the storage medium is a non-transitory (non-transitory) medium, such as a random access memory, a read-only memory, a flash memory, a hard disk, a solid state drive, a magnetic tape (magnetic tape), a floppy disk (floppy disk), an optical disk (optical disk) and any combination thereof.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method of triggering a voice function, comprising:
acquiring interface content presented on an interface of terminal equipment;
if it is detected that the working mode of the terminal equipment is switched to a first mode, converting the interface content into voice information for voice output, and operating the interface content according to the voice information, wherein the first mode is a working mode in which the terminal equipment is connected to an earphone or the earphone function of the terminal equipment is enabled; the operating the interface content according to the voice information comprises: when a contact name or a contact avatar in the interface content corresponds to at least two telephone numbers, the terminal equipment interacts with a user through the earphone and selects one telephone number for a dialing operation; and the interacting with the user through the earphone comprises: interacting with the user through keys on the earphone, or interacting with the user through the earphone in a voice manner.
2. The method according to claim 1, wherein after the acquiring of the interface content presented on the interface of the terminal equipment, the method further comprises: determining a type of an interface element in the interface content.
3. The method of claim 2, wherein the type of the interface element in the interface content comprises: one or more of text information, picture information, a contact name, a phone number, and a contact avatar.
4. The method of claim 2, wherein the type of the interface element in the interface content comprises picture information, and the picture information comprises one or more of text information, a contact name, a phone number, and a contact avatar.
5. The method according to any one of claims 2 to 4, wherein if the interface content comprises text information,
converting the interface content into voice information for output comprises:
converting the text information into voice information and outputting the voice.
6. The method of claim 1, wherein the step of converting the interface content into voice information for voice output further comprises:
detecting whether the interface content comprises system language information of the terminal equipment;
if the interface content does not include the system language information of the terminal equipment, converting the interface content into target voice information; or,
converting the interface content into language information corresponding to a user requirement.
7. A terminal, comprising:
a memory for storing program instructions;
a processor for performing the following operations according to program instructions stored in the memory:
acquiring interface content presented on an interface of terminal equipment;
if it is detected that the working mode of the terminal equipment is switched to a first mode, converting the interface content into voice information for voice output, and operating the interface content according to the voice information, wherein the first mode is a working mode in which the terminal equipment is connected to an earphone or the earphone function of the terminal equipment is enabled; the operating the interface content according to the voice information comprises: when a contact name or a contact avatar in the interface content corresponds to at least two telephone numbers, the terminal equipment interacts with a user through the earphone and selects one telephone number for a dialing operation; and the interacting with the user through the earphone comprises: interacting with the user through keys on the earphone, or interacting with the user through the earphone in a voice manner.
8. The terminal of claim 7, wherein the processor is further configured to perform the following operations according to program instructions stored in the memory: after the acquiring of the interface content presented on the interface of the terminal equipment, determining a type of an interface element in the interface content.
9. The terminal of claim 8, wherein the types of interface elements in the interface content comprise: one or more of text information, picture information, contact names, phone numbers, and contact avatars.
10. The terminal of claim 8, wherein the type of interface element in the interface content comprises picture information, the picture information comprising one or more of text information, contact name, phone number, and contact avatar.
11. The terminal according to any one of claims 8 to 10, wherein if the interface content comprises text information, the processor is configured to perform the following operations according to program instructions stored in the memory: converting the interface content into voice information for output, comprising:
converting the text information into voice information and outputting the voice.
12. The terminal of claim 7, wherein the processor is further configured to perform the following operations according to program instructions stored in the memory: before converting the interface content into voice information for voice output, detecting whether the interface content comprises system language information of the terminal equipment; if the interface content does not comprise the system language information of the terminal equipment, converting the interface content into target voice information; or converting the interface content into language information corresponding to a user requirement.
CN201780004960.7A 2017-01-26 2017-06-12 Method and equipment for triggering voice function Active CN108605074B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN2017100618417 2017-01-26
CN201710061841 2017-01-26
PCT/CN2017/087984 WO2018137306A1 (en) 2017-01-26 2017-06-12 Method and device for triggering speech function

Publications (2)

Publication Number Publication Date
CN108605074A CN108605074A (en) 2018-09-28
CN108605074B true CN108605074B (en) 2021-01-05

Family

ID=62978868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780004960.7A Active CN108605074B (en) 2017-01-26 2017-06-12 Method and equipment for triggering voice function

Country Status (2)

Country Link
CN (1) CN108605074B (en)
WO (1) WO2018137306A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113835669B (en) * 2020-06-24 2024-03-29 青岛海信移动通信技术有限公司 Electronic equipment and voice broadcasting method thereof

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103959751A (en) * 2011-09-30 2014-07-30 苹果公司 Automatically adapting user interfaces for hands-free interaction
CN104142778A (en) * 2013-09-25 2014-11-12 腾讯科技(深圳)有限公司 Text processing method and device as well as mobile terminal
CN104469027A (en) * 2014-10-31 2015-03-25 百度在线网络技术(北京)有限公司 Call processing method and device
CN105208232A (en) * 2015-10-10 2015-12-30 网易(杭州)网络有限公司 Method and device for automatically making call
CN105791502A (en) * 2016-04-28 2016-07-20 北京小米移动软件有限公司 Contact person searching method and apparatus thereof
WO2016178984A1 (en) * 2015-05-01 2016-11-10 Ring-A-Ling, Inc. Methods and systems for management of video and ring tones among mobile devices
US9538226B2 (en) * 2013-12-06 2017-01-03 Samsung Electronics Co., Ltd. Method for operating moving pictures and electronic device thereof

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7650170B2 (en) * 2004-03-01 2010-01-19 Research In Motion Limited Communications system providing automatic text-to-speech conversion features and related methods
CN102857637B (en) * 2012-09-03 2016-03-23 小米科技有限责任公司 A kind of associated person information acquisition methods, system and device
CN104104767B (en) * 2013-04-07 2018-05-01 腾讯科技(深圳)有限公司 The treating method and apparatus of associated person information in portable intelligent terminal
CN104123114A (en) * 2013-04-27 2014-10-29 腾讯科技(深圳)有限公司 Method and device for playing voice
CN103747511B (en) * 2014-01-07 2018-03-09 加一联创电子科技有限公司 information broadcasting method and system
CN104184867A (en) * 2014-08-29 2014-12-03 广东欧珀移动通信有限公司 Intelligent mobile terminal incoming-call linkman message voice broadcasting method and system
CN104346038B (en) * 2014-09-24 2018-05-01 广东欧珀移动通信有限公司 End message read method and system
CN104461346B (en) * 2014-10-20 2017-10-31 天闻数媒科技(北京)有限公司 A kind of method of visually impaired people's Touch Screen, device and intelligent touch screen mobile terminal
CN104461545B (en) * 2014-12-12 2018-09-07 百度在线网络技术(北京)有限公司 Content in mobile terminal is provided to the method and device of user
CN105657174A (en) * 2016-01-26 2016-06-08 努比亚技术有限公司 Voice converting method and terminal
CN105955609A (en) * 2016-04-25 2016-09-21 乐视控股(北京)有限公司 Voice reading method and apparatus
CN106339160A (en) * 2016-08-26 2017-01-18 北京小米移动软件有限公司 Browsing interactive processing method and device

Also Published As

Publication number Publication date
CN108605074A (en) 2018-09-28
WO2018137306A1 (en) 2018-08-02

Similar Documents

Publication Publication Date Title
CN109375890B (en) Screen display method and multi-screen electronic equipment
US9298519B2 (en) Method for controlling display apparatus and mobile phone
US8552996B2 (en) Mobile terminal apparatus and method of starting application
US11237703B2 (en) Method for user-operation mode selection and terminals
CN108089891B (en) Application program starting method and mobile terminal
CN108958580B (en) Display control method and terminal equipment
CN107943374B (en) Method for starting application program in foldable terminal and foldable terminal
US10956025B2 (en) Gesture control method, gesture control device and gesture control system
CN111371949A (en) Application program switching method and device, storage medium and touch terminal
CN108446058B (en) Mobile terminal operation method and mobile terminal
CN108174103B (en) Shooting prompting method and mobile terminal
CN109491738B (en) Terminal device control method and terminal device
CN111104029B (en) Shortcut identifier generation method, electronic device and medium
JP2017527928A (en) Text input method, apparatus, program, and recording medium
CN111327458A (en) Configuration information sharing method, terminal device and computer readable storage medium
US11144422B2 (en) Apparatus and method for controlling external device
CN108476339B (en) Remote control method and terminal
EP3699743B1 (en) Image viewing method and mobile terminal
EP2843532A1 (en) Electronic device and method
CN109683768B (en) Application operation method and mobile terminal
CN110865745A (en) Screen capturing method and terminal equipment
CN109683802B (en) Icon moving method and terminal
CN108170329B (en) Display control method and terminal equipment
CN109491741B (en) Method and terminal for switching background skin
CN110769303A (en) Playing control method and device and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant