WO2018157499A1 - Voice input method and related device - Google Patents
Voice input method and related device
- Publication number
- WO2018157499A1 (PCT/CN2017/087662; CN2017087662W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- terminal
- voice input
- message
- voice
- input function
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/725—Cordless telephones
Definitions
- The present application relates to the field of voice processing technologies, and in particular, to a voice input method and a related device.
- Smart terminal devices, including smartphones, tablets, and wearable devices, use intelligent voice assistants to provide users with speech-to-text and recording organization services.
- The user records speech with the smart terminal device; the device uploads the recording to the cloud or stores it locally; a cloud or local voice processing engine then processes the recording and converts it into text for presentation to the user.
- The efficient, accurate speech-to-text function saves users a lot of time and improves work efficiency.
- The existing intelligent voice input scheme is as follows: the user opens the smart terminal device, starts the voice assistant application (APP), taps the voice input button in the APP, and then starts talking.
- The voice assistant APP starts recording, recognizes the input voice through voice recognition technology, and converts the voice into text.
- The user can also configure the voice assistant APP as a notification bar shortcut. After opening the smart terminal device, the user can then start voice input quickly through that shortcut.
- the embodiment of the present application provides a method for voice input, which is used to simplify the process of starting a voice input function.
- A first aspect of the present application provides a method for voice input, including: establishing a connection between a first terminal and a second terminal so that information can be transmitted between them; after the connection is established, the first terminal receives a first message sent by the second terminal; the first terminal extracts related information from the first message and starts a voice input function on the first terminal according to the first message; the first terminal acquires voice data through the voice input function, recognizes the acquired voice data, and converts it into text information.
- the voice input function of the first terminal is quickly started by simple interaction with the second terminal, and the process of starting the voice input function is simplified.
- The first terminal starting the voice input function according to the first message includes: the first terminal starts the voice input function on the first terminal according to the first message; when the first message does not include voice data, the first terminal collects a voice signal around the first terminal through the voice input function and converts the voice signal into voice data.
- the embodiment of the present application provides a process for collecting a voice signal by using a voice input function and generating voice data, which increases the achievability and operability of the embodiment of the present application.
- The first terminal starting the voice input function according to the first message includes: the first terminal starts the voice input function according to the first message; when the first message includes voice data, the first terminal receives, through the voice input function, the voice data carried in the first message.
- The embodiment of the present application provides a process of directly acquiring voice data from the first message, which increases the achievability and operability of the embodiment of the present application.
- The method further includes: the first terminal detects an operation instruction triggered on the first terminal; the first terminal edits the text information according to the operation instruction.
- the embodiment of the present application adds a process of editing the converted text information, which increases the achievability and integrity of the embodiment of the present application.
- the method further includes: the first terminal displaying the text information.
- the embodiment of the present application adds a process of displaying the converted text information, which makes the embodiment of the present application more logical.
- A second aspect of the present application provides a method for voice input, including: when a user performs an operation on a second terminal, the second terminal detects the operation instruction triggered by the user; the second terminal compares the detected operation instruction with a database and determines whether the operation instruction is an instruction to start a voice input function; if it is, the second terminal sends a first message to the first terminal, where the first message is used to start the voice input function.
- the simple interaction with the first terminal enables the first terminal to quickly activate the voice input function, thereby simplifying the process of starting the voice input function.
- the method further includes: the second terminal collecting voice data; the second terminal generating a first message, where the first message carries the collected voice data.
- the second terminal collects the voice data, and sends the voice data and the control command for starting the voice input function of the first terminal to the first terminal, which is an implementation manner of the embodiment of the present application.
- The method further includes: the second terminal executes an operation instruction of the user on the first terminal.
- the embodiment of the present application provides a process for the second terminal to perform a corresponding operation according to the user's selection, so that the embodiment of the present application is more complete in the steps.
- A third aspect of the present application provides a device for voice input, including: a receiving unit, configured to receive a first message sent by a second terminal; a starting unit, configured to start a voice input function according to the first message; and a recognition unit, configured to recognize the acquired voice data and convert it into text information, where the voice data is acquired through the voice input function.
- the voice input function of the first terminal is quickly started by simple interaction with the second terminal, and the process of starting the voice input function is simplified.
- The starting unit includes: a first starting subunit, configured to start the voice input function according to the first message; and a collecting subunit, configured to collect voice data through the voice input function.
- the embodiment of the present application provides a process for collecting a voice signal by using a voice input function and generating voice data, which increases the achievability and operability of the embodiment of the present application.
- The starting unit includes: a second starting subunit, configured to start the voice input function according to the first message; and a receiving subunit, configured to receive voice data through the voice input function, where the voice data is carried in the first message.
- the embodiment of the present application provides a process for directly acquiring voice data from a first message, which increases the achievability and operability of the embodiment of the present application.
- The device further includes: a detecting unit, configured to detect an operation instruction triggered on the first terminal; and an editing unit, configured to edit the text information according to the operation instruction.
- the embodiment of the present application adds a process of editing the converted text information, which increases the achievability and integrity of the embodiment of the present application.
- the device further includes: a display unit, configured to display the text information.
- the embodiment of the present application adds a process of displaying the converted text information, which makes the embodiment of the present application more logical.
- A fourth aspect of the present application provides a device for voice input, including: a detecting unit, configured to detect an operation instruction triggered by a user on the second terminal; a determining unit, configured to determine whether the operation instruction is an instruction to start the voice input function; and a sending unit, configured to send a first message to the first terminal if the operation instruction is an instruction to start the voice input function, where the first message is used to start the voice input function.
- the simple interaction with the first terminal enables the first terminal to quickly activate the voice input function, thereby simplifying the process of starting the voice input function.
- The device further includes: an acquiring unit, configured to collect voice data; and a generating unit, configured to generate the first message, where the first message carries the collected voice data.
- the second terminal collects the voice data, and sends the voice data and the control command for starting the voice input function of the first terminal to the first terminal, which is an implementation manner of the embodiment of the present application.
- the device further includes: an executing unit, configured to execute an operation instruction of the user on the first terminal.
- the embodiment of the present application provides a process for the second terminal to perform a corresponding operation according to the user's selection, so that the embodiment of the present application is more complete in the steps.
- A fifth aspect of the embodiments of the present application provides a voice input device, including: a processor, a power module, a sensing module, a control module, a communication module, and a storage module connected through a bus; the processor performs the functions of the device and processes data by running or executing the software programs or instruction sets stored in the storage module and calling the data stored in the storage module.
- the power module is used for power supply and power management of the device;
- the sensing module is used for receiving input numeric or character information and generating signal input related to user settings and function control of the device;
- the control module is configured to generate a control command;
- the communication module is configured to communicate with the network and other devices through wireless communication;
- the storage module is used to store software programs, instruction sets, and data.
- A sixth aspect of the embodiments of the present application provides a voice input device, including: a processor, a power module, an audio module, a display module, a communication module, an input module, and a storage module connected through a bus; the processor performs the functions of the device and processes data by running or executing a software program or an instruction set stored in the storage module and calling data stored in the storage module; the power module is used for power supply and power management of the device; the audio module is configured to convert a collected sound signal into audio data, and may also transmit the electrical signal converted from received audio data to a speaker, which converts it into a sound signal for output; the display module is configured to display the converted text information;
- the communication module is configured to communicate with the network and other devices through wireless communication;
- the input module is configured to receive input numeric or character information, and generate key signal input related to user settings and function control of the device;
- the storage module is configured to store software programs, instruction sets, and data.
- the voice input function of the first terminal is quickly started by simple interaction with the second terminal, and the process of starting the voice input function is simplified.
- a seventh aspect of embodiments of the present application provides a computer readable storage medium having instructions stored therein that, when executed on a computer, cause the computer to perform the methods described in the above aspects.
- An eighth aspect of the embodiments of the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the methods described in the above aspects.
- The first terminal receives the first message sent by the second terminal; the first terminal starts a voice input function according to the first message; and the first terminal recognizes the acquired voice data and converts it into text information, where the voice data is acquired through the voice input function.
- the voice input function of the first terminal is quickly started by simple interaction with the second terminal, and the process of starting the voice input function is simplified.
- FIG. 1 is a schematic diagram of a system architecture of an embodiment of the present application.
- FIG. 2 is a schematic diagram of an embodiment of a method for voice input in an embodiment of the present application.
- FIG. 3 is a schematic diagram of an embodiment of another method for voice input in an embodiment of the present application.
- FIG. 4 is a schematic diagram of a circle operation in a touch operation command according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of a click operation in a touch operation instruction according to an embodiment of the present application.
- FIG. 6 is a schematic diagram of a double-click operation in a touch operation instruction according to an embodiment of the present application.
- FIG. 7 is a schematic diagram of a line drawing operation in a touch operation instruction according to an embodiment of the present application.
- FIG. 8 is a schematic diagram of an embodiment of a device for voice input according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of another embodiment of an apparatus for voice input in an embodiment of the present application.
- FIG. 10 is a schematic diagram of another embodiment of an apparatus for voice input in an embodiment of the present application.
- FIG. 11 is a schematic diagram of an embodiment of another voice input device in an embodiment of the present application.
- FIG. 12 is a schematic diagram of interaction between voice input devices in the embodiment of the present application.
- the embodiment of the present application provides a voice input method and related device, which are used to simplify the process of starting a voice input function.
- The first terminal and the second terminal may be connected by Bluetooth or by other means, which is not limited herein, so that the two terminals can communicate with each other.
- The first terminal may be a device such as a notebook computer, a tablet, or a mobile phone, and the second terminal may be a device with a start switch, such as a stylus, a voice recorder, or a wearable device; the start switch is used to start voice input on the connected first terminal.
- the user activates the voice input function on the first terminal through the second terminal, simplifies the startup step, and performs voice input efficiently and quickly, so that the user obtains the text information converted by the voice data.
- a method for voice input in the embodiment of the present application includes:
- the second terminal detects an operation instruction triggered by the user on the second terminal.
- When the user operates the second terminal, the second terminal detects the operation instruction triggered by the user.
- The second terminal detects the user's operation instruction through its built-in sensing module.
- When the sensing module is a switch button and the second terminal is a stylus, the switch button is mounted on the cap or the pen body of the stylus; the sensing module detects the action of the user pressing the switch and transmits the operation instruction corresponding to the action to the processor of the stylus.
- When the sensing module is a touch panel and the second terminal is a stylus, the touch panel is mounted on the cap or the pen body of the stylus.
- each operation instruction may correspond to an action of the user, and the operation instruction may be processed by hardware of the device, and the following control instruction may be processed by hardware of the device.
- The touch panel can detect different touch actions, and each touch action can be mapped to a different control command.
- For example, the user's click, double-click, and long-press operations can be mapped to a click control command, a double-click control command, and a long-press control command, respectively.
- the specific type of the touch panel is, for example, a resistive touch panel or a capacitive touch panel, which is not limited herein.
- the second terminal determines whether the operation instruction is an instruction to start a voice input function.
- After the second terminal detects the user's operation instruction, the second terminal determines whether the detected operation instruction is an instruction to start the voice input function.
- The criterion depends on the form of the sensing module of the second terminal.
- When the sensing module is a switch button: if the user closes the switch button, the switch circuit in which the button is located is closed, and the second terminal determines that the close instruction is an instruction to start the voice input function; if the user opens the switch button, the switch circuit is disconnected, and the second terminal determines that the disconnect instruction is an instruction to turn off the voice input function.
- When the sensing module is a touch panel: if the user clicks the touch panel, the second terminal determines that the click instruction is an instruction to start the normal voice input function, which converts voice into text; if the user double-clicks the touch panel, the second terminal determines that the double-click instruction is an instruction to start the conference voice input function, which converts voice into text, lays the text out according to a meeting minutes template, and presents it to the user.
- the second terminal may also be provided with other criteria for judging, which is not limited herein.
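The criteria described above can be pictured as a small lookup from sensing-module type and user action to a command. The following is only a minimal sketch: the names `Command`, `SWITCH_RULES`, `TOUCH_RULES`, `classify`, and `starts_voice_input` are hypothetical, and treating a closed switch as starting the normal mode is an assumption, since the text only says it starts the voice input function.

```python
from enum import Enum
from typing import Optional


class Command(Enum):
    START_NORMAL = "start_normal_voice_input"          # plain speech-to-text
    START_CONFERENCE = "start_conference_voice_input"  # text laid out as meeting minutes
    STOP = "stop_voice_input"


# Hypothetical rule tables. Assumption: a closed switch starts the normal mode.
SWITCH_RULES = {"closed": Command.START_NORMAL, "open": Command.STOP}
TOUCH_RULES = {"click": Command.START_NORMAL, "double_click": Command.START_CONFERENCE}


def classify(sensor_type: str, action: str) -> Optional[Command]:
    """Return the command for a detected action, or None if no rule matches."""
    rules = {"switch": SWITCH_RULES, "touch_panel": TOUCH_RULES}.get(sensor_type, {})
    return rules.get(action)


def starts_voice_input(command: Optional[Command]) -> bool:
    """True only for commands that start a voice input function."""
    return command in (Command.START_NORMAL, Command.START_CONFERENCE)
```

Keeping the rules in tables rather than in branches makes it easy to add other criteria, which the description explicitly leaves open.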
- the second terminal sends a first message to the first terminal, where the first message is used to start the voice input function.
- The second terminal determines, according to a preset criterion, whether the detected operation instruction is an instruction to start the voice input function. If it is, the second terminal generates the first message according to a preset rule and sends it to the first terminal; the first message is used to start the voice input function, and the first terminal receives it. If the operation instruction is not an instruction to start the voice input function, the second terminal performs other corresponding operations, which are not limited herein.
- The second terminal generates a control command through the control module, and the processor transmits the control command to the communication module, which encapsulates it in the first message and transmits the message to the first terminal paired with the second terminal, such as a tablet or a smartphone. If the voice data collected by the second terminal is also to be sent to the first terminal, the second terminal encapsulates the collected voice data in the first message as well.
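The encapsulation step can be illustrated with a small message type that always carries a control command and optionally carries voice data. This is a sketch under stated assumptions: the application does not define a wire format, so the JSON layout, the base64 encoding of the audio, and the names `FirstMessage`, `control_command`, and `voice_data` are all hypothetical.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirstMessage:
    control_command: str                # e.g. "start_voice_input" (hypothetical value)
    voice_data: Optional[bytes] = None  # set only if the second terminal recorded audio

    def encode(self) -> bytes:
        """Serialize to JSON bytes; audio is base64-encoded so it survives JSON."""
        payload = {"control_command": self.control_command}
        if self.voice_data is not None:
            payload["voice_data"] = base64.b64encode(self.voice_data).decode("ascii")
        return json.dumps(payload).encode("utf-8")

    @classmethod
    def decode(cls, raw: bytes) -> "FirstMessage":
        """Parse bytes produced by encode() back into a message."""
        payload = json.loads(raw.decode("utf-8"))
        audio = payload.get("voice_data")
        return cls(
            control_command=payload["control_command"],
            voice_data=base64.b64decode(audio) if audio is not None else None,
        )
```

On the receiving side, the first terminal would call `FirstMessage.decode` on the received bytes and check whether `voice_data` is present.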
- The first terminal and the second terminal need to establish a connection. The connection can take multiple forms, such as a wired network, a wireless network, or Bluetooth, and can be chosen according to the actual situation.
- the specific connection method is not limited here.
- the first terminal starts a voice input function according to the first message.
- After the first terminal receives the first message, the first terminal parses the first message, obtains the control command carried by the first message, and starts the voice input function of the first terminal according to that control command, where the control command is generated by the second terminal.
- If the first message carries voice data, the first terminal also needs to extract the voice data from the first message.
- The first terminal recognizes the obtained voice data and converts it into text information.
- The first terminal recognizes the acquired voice data and converts it into text information, where the voice data is obtained through the voice input function.
- The first terminal can obtain the voice data in either of two manners. Manner 1: after the voice input function is enabled on the first terminal, the first terminal collects the voice around it through its built-in voice module. Manner 2: the first terminal extracts voice data from the received first message, that is, the first terminal receives voice data from the second terminal. The manner is selected according to the actual situation of the first terminal, which is not limited herein.
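The choice between the two manners can be sketched as a single function. This is an illustrative assumption rather than the claimed implementation: `acquire_voice_data` and `record_from_microphone` are hypothetical names, and the recording step is passed in as a callable because the text does not specify an audio API.

```python
from typing import Callable, Optional


def acquire_voice_data(
    message_voice_data: Optional[bytes],
    record_from_microphone: Callable[[], bytes],
) -> bytes:
    """Use the audio carried in the first message if present (manner 2);
    otherwise record through the terminal's own voice module (manner 1)."""
    if message_voice_data is not None:
        return message_voice_data
    return record_from_microphone()
```

Passing the recorder as a parameter keeps the selection logic testable without real audio hardware.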
- the first terminal can recognize and convert the voice data into text information through a voice input function.
- The first terminal extracts keywords from the voice data by using voice recognition technology, matches them against preset reference information in the database, and converts the successfully matched voice into text information. It can be understood that the accuracy of the voice recognition technology currently applied to the first terminal cannot reach 100%, so the text information may contain recognition errors. Therefore, after the voice input function converts the voice data into text information, the user may need to edit the text information.
- The language of the text information can be set according to actual conditions.
- The voice input function can convert the voice data into text information in languages including but not limited to Chinese and English.
- the first terminal displays text information.
- the first terminal displays the obtained text information.
- After completing the recognition of the voice data, the first terminal transmits the obtained text information to the display module, which displays it. For example, the first terminal displays the text information on its screen.
- the text information may also be arranged in a certain typesetting manner according to a preset environment, and then presented to the user on the screen.
- the second terminal performs an operation instruction of the user on the first terminal.
- If the user finds that the converted text information is not what the user requires, the user performs an operation on the text information through the second terminal, and the second terminal executes the user's operation instruction on the first terminal. For example, if the user is dissatisfied with the displayed text information, the user selects the target text information that needs to be modified, for example by drawing a line below it or circling it.
- the first terminal detects an operation instruction triggered on the first terminal.
- the first terminal detects an operation instruction triggered on the first terminal.
- The operation instruction may be a circle, click, double-click, line-drawing, or other touch operation that the user performs on the touch panel of the first terminal (a tablet, a smartphone, etc.) using the second terminal (which may be a stylus). It can be understood that the choice of touch operation is flexible as long as a similar effect can be achieved.
- The touch panel of the tablet detects the input of the stylus and collects pressure touch information during the touch input, including the pressure value and coordinate information, and the pressure touch information is processed according to preset rules.
- The preset rules may be as follows:
- when the touch panel of the tablet detects a line-drawing operation instruction, the text in the corresponding coordinate range is erased;
- when a circle operation instruction is detected, the text in the corresponding coordinate range is erased and the cursor is moved to the position of the erased text, where the user fills in the correct text using the stylus or the tablet's input method;
- when a click operation instruction is detected, the text in the corresponding coordinate range is highlighted;
- when a double-click operation instruction is detected, the text in the corresponding coordinate range is copied to the clipboard.
- Other preset rules may also be used, which are not limited here.
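The preset rules above can be sketched as a dispatch over gesture types acting on a text buffer. This is a minimal sketch with several assumptions: the class `TextBuffer` and its fields are hypothetical, and the screen coordinate range is assumed to have already been mapped to a half-open character range `[start, end)`, a step the text does not describe.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TextBuffer:
    text: str
    cursor: int = 0
    highlight: Optional[Tuple[int, int]] = None
    clipboard: str = ""

    def apply(self, gesture: str, start: int, end: int) -> None:
        """Apply one gesture to the character range [start, end)."""
        if gesture == "line":            # line drawing: erase the text
            self.text = self.text[:start] + self.text[end:]
        elif gesture == "circle":        # circle: erase and put the cursor at the gap
            self.text = self.text[:start] + self.text[end:]
            self.cursor = start
        elif gesture == "click":         # click: highlight the text
            self.highlight = (start, end)
        elif gesture == "double_click":  # double-click: copy to the clipboard
            self.clipboard = self.text[start:end]
        else:
            raise ValueError(f"no preset rule for gesture {gesture!r}")
```

For instance, circling a misrecognized word removes it and leaves the cursor where the correction should be typed.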
- the first terminal receives an operation instruction of the second terminal through the built-in input module, and sends the detected operation instruction to the processor.
- the input module of the first terminal may be a touch panel, and the touch panel may be of various types such as resistive, capacitive, infrared, and surface acoustic waves.
- the touch panel collects user operation instructions and sends them to the processor.
- The first terminal edits the text information according to the operation instruction.
- The first terminal edits the text information according to the operation instruction, so that the text information presented to the user is the text information the user requires.
- When the detected operation instruction is a delete operation instruction, the first terminal deletes the target text information selected by the user.
- When the operation instruction detected by the first terminal is a copy operation instruction (that is, a double-click operation instruction), the first terminal copies the target text information selected by the user.
- In one example, the first terminal is a tablet computer, the second terminal is a stylus, and the sensing module of the stylus includes a distance sensor. The processor of the stylus issues an instruction, the control module generates a control command, and the processor transmits the command to the communication module. The tablet enlarges the text in the area indicated by the stylus, so that the user can quickly select the text information to be edited.
- The voice input function of the first terminal is quickly started by a simple interaction such as pressing a switch on the second terminal, and the text converted from the voice data is processed according to preset rules, which makes the interaction simpler and more efficient.
- The first terminal in the embodiment of the present application is a terminal device having a voice input function, and the second terminal is an external device having a startup function; the startup function is used to start the voice input function on the terminal device connected to the external device.
- The terminal devices include, but are not limited to, a notebook computer, a tablet computer, a smartphone, and the like.
- the external devices include, but are not limited to, a stylus, a voice recorder, a wearable device, and the like.
- another method for voice input in the embodiment of the present application includes:
- the external device detects an operation instruction triggered by the user on the external device.
- When the user operates the external device, the external device detects the operation instruction triggered by the user.
- the terminal device can be a mobile phone, a notebook computer, a tablet computer, etc.
- the external device can be a voice recorder, a stylus pen, a wearable device, etc.
- The terminal device and the external device can be combined in various ways, for example: combination 1, the terminal device is a mobile phone and the external device is a voice recorder; combination 2, the terminal device is a mobile phone and the external device is a stylus; combination 3, the terminal device is a tablet computer and the external device is a stylus.
- Other combinations can also be used, which are not limited herein.
- the terminal device may be a mobile phone, and the external device may be a voice recorder.
- the recorder detects the user's operating commands through the built-in sensing module.
- the sensing module is a switch button
- the switch button is installed on the cap or the body of the voice recorder.
- the sensing module collects the action of the user pressing the switch and transmits it to the processor of the recorder.
- the sensing module is a touch panel
- the touch panel is mounted on the cap or the body of the voice recorder.
- the touch panel collects the user's touch operation and transmits it to the processor of the recorder.
- the sensing module of the recording pen exists in the form of a touch panel.
- the touch panel of the recorder can detect different control commands, and different touch actions of the touch panel can be set to different control commands.
- the user's click operation, double-click operation, and long-press operation can be set as a click control command, a double-click control command, and a long-press control command, respectively.
- the touch panel of the recorder is a capacitive touch panel.
- the external device determines whether the operation instruction is an instruction to start the voice input function.
- after the external device detects the user's operation instruction, the external device determines whether the detected operation instruction is an instruction to start the voice input function.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the sensing module of the voice recorder is a touch panel; when the user clicks the touch panel, the voice recorder determines that the click instruction is an instruction to start the normal voice input function, which converts voice into text; when the user double-clicks the touch panel, the voice recorder determines that the double-click instruction is an instruction to start the meeting voice input function, which converts voice into text and typesets it according to a meeting minutes template for presentation to the user.
- the external device sends a first message to the terminal device, where the first message is used to start the voice input function.
- the external device sends the first message to the terminal device, and the first message is used to start the voice input function.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the voice recorder determines, according to a preset standard, whether the detected operation instruction is an instruction to start the voice input function; if it is, the voice recorder generates the first message according to a preset rule.
- the voice recorder sends the first message to the mobile phone, and the first message is used to start the voice input function. If the operation instruction is not an instruction to start the voice input function, other corresponding operations are performed, which are not limited herein.
- the voice recorder generates a control command through the control module, the processor transmits the control command to the communication module, and the communication module encapsulates the control command in the first message and transmits the first message to the mobile phone.
- the voice recorder encapsulates the collected voice data in the first message, and sends the first message carrying the voice data to the mobile phone.
- the mobile phone and the recorder are connected by a Bluetooth function.
- the terminal device starts a voice input function according to the first message.
- the terminal device starts the voice input function according to the first message.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- after receiving the first message, the mobile phone parses it, extracts the voice data and the control command generated by the voice recorder, and starts the voice input function of the mobile phone according to the control command carried in the first message.
- the terminal device recognizes the acquired voice data and converts it into text information.
- the terminal device recognizes the acquired voice data and converts it into text information, where the voice data is acquired by the voice input function.
- the manner in which the terminal device acquires the voice data is: the terminal device extracts the voice data from the received first message, that is, the terminal device receives the voice data from the external device.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the mobile phone needs to recognize and convert the voice data into text information through the voice input function.
- the mobile phone extracts the keywords in the voice data through the voice recognition technology, matches the preset reference information in the database, and converts the successfully matched voice into text information.
- the terminal device displays text information.
- the terminal device displays the acquired text information on the screen.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the obtained text information is transmitted to the display module, and the text information is displayed by the display module, that is, the mobile phone displays the text information on the screen of the mobile phone.
- the external device executes an operation instruction of the user on the terminal device.
- the external device performs an operation instruction of the user on the terminal device.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the voice data is converted into text information and displayed on the screen to the user
- when the user finds that the converted text information is not what is needed, the user issues operation instructions on the text information through the voice recorder, and the voice recorder executes the user's operation instructions on the mobile phone.
- for example, if the user is dissatisfied with the displayed text information, the user selects the target text information to be modified, such as by drawing a line below the target text information or circling it.
- the terminal device detects an operation instruction triggered on the terminal device.
- the terminal device detects an operation instruction triggered on the terminal device.
- the operation instruction can be a circle, click, double-click, line-drawing, or other touch operation performed by the user with the external device on the touch panel of the terminal device.
- among the touch operation instructions, the circle operation is shown in FIG. 4, the click operation is shown in FIG. 5, and the double-click operation is shown in FIG. 6.
- the line-drawing operation among the touch operation instructions is shown in FIG. 7.
- the terminal device may be a mobile phone
- the external device may be a voice recorder.
- the touch panel of the mobile phone detects the input of the voice recorder, collects the pressure touch information of the touch input, including the pressure value and the coordinate information, and processes the pressure touch information according to a preset rule.
- the preset rule is that when the touch panel of the mobile phone detects a line-drawing operation instruction, the text in the corresponding coordinate range is erased; when a circle operation instruction is detected, the text in the corresponding coordinate range is erased and the cursor is moved to the position of the erased text.
- the user then fills in the correct text using the voice recorder or by starting the input method of the mobile phone; when a click operation instruction is detected, the text in the corresponding coordinate range is highlighted; when a double-click operation instruction is detected, the text in the corresponding coordinate range is copied to the clipboard.
- the terminal device edits the text information according to the operation instruction.
- the terminal device edits the text information according to the operation instruction, so that the text information presented to the user is the text information the user needs.
- the terminal device may be a mobile phone
- the external device may be a voice recorder. If the operation instruction detected by the mobile phone is a delete operation instruction (that is, a line-drawing operation instruction), the mobile phone deletes the target text information selected by the user. If the operation instruction detected by the mobile phone is a copy operation instruction (that is, a double-click operation instruction), the mobile phone copies the target text information selected by the user.
- the voice input function of the terminal device is quickly started through interaction with the touch panel of the external device, and the text converted from the voice data is processed according to a preset rule, making the interaction simpler and more efficient.
- the method for voice input in the embodiment of the present application is described above.
- the device for voice input (that is, the first terminal) in the embodiments of the present application is described below.
- referring to FIG. 8, an embodiment of the voice input device in the embodiments of the present application includes:
- the receiving unit 801 is configured to receive a first message sent by the second terminal.
- the initiating unit 802 is configured to start a voice input function according to the first message
- the processing unit 803 is configured to recognize the acquired voice data and convert it into text information, where the voice data is acquired by the voice input function.
- the initiating unit 802 may further include:
- a first activation subunit 8021 configured to start a voice input function according to the first message
- the collecting subunit 8022 is configured to collect voice data by using the voice input function.
- the initiating unit 802 may further include:
- a second activation sub-unit 8023 configured to start a voice input function according to the first message
- the receiving subunit 8024 is configured to receive voice data by using the voice input function, where the voice data is included in the first message.
- the device for voice input may further include:
- the display unit 804 is configured to display the text information.
- the device for voice input may further include:
- a detecting unit 805, configured to detect an operation instruction triggered on the first terminal
- the editing unit 806 is configured to edit the text information according to the operation instruction.
- the voice input function of the first terminal is quickly started by the interaction between the user and the second terminal, and the text converted by the voice data is processed according to a preset rule, and the interaction is simpler and more efficient.
- another embodiment of a device for voice input (that is, the second terminal) in the embodiments of the present application includes:
- the detecting unit 901 is configured to detect an operation instruction triggered by the user on the second terminal;
- the determining unit 902 is configured to determine whether the operation instruction is an instruction to start a voice input function
- the sending unit 903 is configured to send a first message to the first terminal if the operation instruction is an instruction to start the voice input function, where the first message is used to start the voice input function.
- the device for voice input may further include:
- the collecting unit 904 is configured to collect voice data.
- the generating unit 905 is configured to generate a first message, where the first message carries the collected voice data.
- the device for voice input may further include:
- the executing unit 906 is configured to execute an operation instruction of the user on the first terminal.
- the interaction between the second terminal and the user quickly starts the voice input function of the first terminal and causes the first terminal to process the text converted from the voice according to a preset rule, making the interaction simpler and more efficient.
- an embodiment of a device for voice input in an embodiment of the present application includes:
- FIG. 10 is a schematic structural diagram of an external device according to an embodiment of the present disclosure.
- the external device 1000 includes a processor 1010, a power module 1020, a sensing module 1030, a control module 1040, a communication module 1050, and a storage module 1060.
- the processor 1010 is the control center of the external device; it connects the various parts of the external device through various interfaces and lines and, by running or executing a software program or instruction set stored in the storage module 1060 and calling data stored in the storage module 1060, performs the various functions of the external device and processes data, thereby monitoring the external device as a whole.
- the power module 1020 is connected to the processor 1010 and is used for supplying power to the external device and for power management.
- the power module 1020 can include a power management system, one or more power sources (e.g., battery, AC), a recharging system, a power fault detection circuit, a power converter or inverter, a power status indicator (e.g., a light-emitting diode), and any other component associated with the generation, management, and distribution of power in an external device.
- the sensing module 1030 can be configured to receive input digital or character information and to generate signal inputs related to user settings and function control of the external device.
- the sensing module 1030 can be a switch button that collects the action of the user to activate or deactivate the switch.
- the sensing module 1030 can also be a resistive or capacitive touch panel, which can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel with any suitable part or object such as a finger or a stylus).
- the sensing module 1030 can also be a capacitive tip that collects characters written by a user using a stylus on a terminal device such as a tablet, smart phone, or wearable device.
- the sensing module 1030 can also be a combination of a switch button, a touch panel, a capacitive tip, or other input device.
- the control module 1040 can be configured to generate a control command, where the control command can be used to control the external device to execute a command, and can also be used to control terminal devices and other devices to execute commands.
- the communication module 1050 can communicate with the network and other devices via wireless communication, which can be the Internet, an intranet, and/or a wireless network (eg, a cellular telephone network, a wireless local area network, and/or a metropolitan area network).
- wireless communication can use any of a variety of communication standards, protocols, and technologies, including but not limited to global system for mobile communications (GSM), enhanced data GSM environment, high speed downlink packet access, high speed uplink packet access, Bluetooth, wireless fidelity (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), or any other suitable communication protocol.
- FIG. 11 is a schematic structural diagram of a terminal device according to an embodiment of the present disclosure.
- the terminal device 1100 includes a processor 1110, a power module 1120, an audio module 1130, a display module 1140, a communication module 1150, an input module 1160, and a storage module 1170. These components communicate via one or more communication buses or signal lines.
- the processor 1110 is the control center of the terminal device; it connects the various parts of the terminal device through various interfaces and lines and, by running or executing a software program or instruction set stored in the storage module 1170 and calling data stored in the storage module 1170, performs the various functions of the terminal device and processes data, thereby monitoring the terminal device as a whole.
- the power module 1120 is connected to the processor 1110 and is used for supplying power to the terminal device and for power management.
- the power module 1120 can include a power management system, one or more power sources (e.g., battery, AC), a recharging system, a power fault detection circuit, a power converter or inverter, a power status indicator (e.g., a light-emitting diode), and any other component associated with the generation, management, and distribution of power in an external device.
- the audio module 1130 can include a speaker and a microphone.
- the audio module 1130 can transmit the electrical signal converted from received audio data to the speaker, and the speaker converts it into a sound signal for output.
- the microphone converts the collected sound signal into an electrical signal, which is received by the audio module 1130 and then converted into audio data.
- the display module 1140 can be used to display information input by the user or information provided to the user as well as various menus of the terminal device.
- the display module 1140 can include a display panel.
- the display panel is configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
- the input module 1160 can cover the display module 1140.
- the touch panel of the input module is overlaid on the LCD display panel; when the touch panel detects an operation of the external device on or near it, the operation is sent to the processor 1110.
- the processor 1110 then provides a corresponding visual output on the display module according to the operation of the external device, the visual output including text, graphics, icons, video, and any combination thereof.
- the communication module 1150 can communicate with the network and other devices via wireless communication, which can be the Internet, an intranet, and/or a wireless network (eg, a cellular telephone network, a wireless local area network, and/or a metropolitan area network).
- wireless communication can use any of a variety of communication standards, protocols, and technologies, including but not limited to global system for mobile communications (GSM), enhanced data GSM environment, high speed downlink packet access, high speed uplink packet access, Bluetooth, wireless fidelity (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), or any other suitable communication protocol.
- the communication module 1150 of the terminal device communicates with the communication module 1050 of the external device via a Bluetooth protocol.
- the input module 1160 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the terminal device.
- the input module 1160 can include a touch panel and other input devices, and the touch panel can adopt various types such as resistive, capacitive, infrared, and surface acoustic waves.
- the touch panel collects the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel with any suitable part or object such as a finger or a stylus), and drives the corresponding connection device according to a preset program.
- the sensing module 1030 of the external device includes a capacitive pen tip, and the user operates on the touch panel of the terminal device with a stylus, and the touch panel collects operation information of the capacitive tip of the stylus.
- the interaction between the terminal device and the external device is shown in FIG. 12.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave).
- the computer readable storage medium can be any available media that can be stored by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
- the usable medium may be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium (eg, a solid state disk (SSD)) or the like.
- the disclosed system, apparatus, and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in the embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
- the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
- the computer readable storage medium includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
- the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
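A voice input method for simplifying the process of starting a voice input function. The method includes: a first terminal receives a first message sent by a second terminal; the first terminal starts a voice input function according to the first message (204); and the first terminal recognizes acquired voice data and converts it into text information (205), where the voice data is acquired by the voice input function. Related devices that can simplify the process of starting the voice input function are also provided.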
Description
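This application claims priority to Chinese Patent Application No. 201710114667.8, filed with the Chinese Patent Office on February 28, 2017 and entitled "Stylus-based intelligent voice input method and device", which is incorporated herein by reference in its entirety.
This application relates to the field of voice processing technologies, and in particular, to a voice input method and a related device.
With the development of speech recognition technology, intelligent voice assistants are installed on more and more smart terminal devices (including smart phones, tablet computers, wearable devices, and the like) to provide users with voice transcription and recording organization services. A user records voice with a smart terminal device; the device uploads the voice to the cloud or stores it locally, and a voice processing engine in the cloud or on the device processes the recorded voice and converts it into text for the user. Efficient and accurate audio transcription saves users a great deal of time and effectively improves work efficiency.
An existing intelligent voice input solution is as follows: the user turns on the smart terminal device, starts a voice assistant application (APP), taps the voice input button in the APP, and then begins to speak. The voice assistant then starts recording, and the voice assistant APP recognizes the input voice by using speech recognition technology and converts it into text. The user can also configure the voice assistant as a notification-bar shortcut in the APP settings, so that after turning on the smart terminal device the user can start voice input quickly from the notification bar.
It can be seen that, in the current voice input solution, the user must go through multiple operation steps to start the voice input function, and the operation is complex.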
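Summary
The embodiments of this application provide a voice input method for simplifying the process of starting a voice input function.
A first aspect of the embodiments of this application provides a voice input method, including: a connection is established between a first terminal and a second terminal so that information can be transmitted between them. After the connection is established, the first terminal receives a first message sent by the second terminal; the first terminal extracts related information from the first message and starts the voice input function on the first terminal according to the first message; the first terminal acquires voice data by using the voice input function, and then recognizes the acquired voice data and converts it into text information. In the embodiments of this application, simple interaction with the second terminal quickly starts the voice input function of the first terminal, which simplifies the process of starting the voice input function.
In a possible design, in a first implementation of the first aspect of the embodiments of this application, that the first terminal starts the voice input function according to the first message includes: the first terminal starts the voice input function on the first terminal according to the first message; when the first message does not contain voice data, the first terminal collects voice signals around the first terminal by using its voice input function and converts them into voice data. This implementation provides a process of collecting voice signals and generating voice data by using the voice input function, which improves the implementability and operability of the embodiments of this application.
In a possible design, in a second implementation of the first aspect of the embodiments of this application, that the first terminal starts the voice input function according to the first message includes: the first terminal starts the voice input function according to the first message; when the first message contains voice data, the first terminal receives the voice data carried in the first message by using its voice input function. This implementation provides a process of obtaining voice data directly from the first message, which improves the implementability and operability of the embodiments of this application.
In a possible design, in a third implementation of the first aspect of the embodiments of this application, after the first terminal recognizes the acquired voice data and converts it into text information, the method further includes: the first terminal detects an operation instruction triggered on the first terminal; and the first terminal edits the text information according to the operation instruction. This implementation adds a process of editing the converted text information, which improves the implementability and completeness of the embodiments of this application.
In a possible design, in a fourth implementation of the first aspect of the embodiments of this application, after the first terminal recognizes the acquired voice data and converts it into text information, and before the first terminal receives an operation instruction of the second terminal, the method further includes: the first terminal displays the text information. This implementation adds a process of displaying the converted text information, which makes the embodiments of this application more logically complete.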
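A second aspect of the embodiments of this application provides a voice input method, including: when a user performs an operation on a second terminal, the second terminal detects the operation instruction triggered by the user on the second terminal; the second terminal compares the detected operation instruction against a database and determines whether the operation instruction is an instruction to start a voice input function; and if the operation instruction is an instruction to start the voice input function, the second terminal sends a first message to a first terminal, where the first message is used to start the voice input function. In the embodiments of this application, simple interaction with the first terminal enables the first terminal to quickly start the voice input function, which simplifies the process of starting the voice input function.
In a possible design, in a first implementation of the second aspect of the embodiments of this application, after the second terminal determines whether the operation instruction is an instruction to start the voice input function, and before the second terminal sends the first message to the first terminal, the method further includes: the second terminal collects voice data; and the second terminal generates the first message, where the first message carries the collected voice data. In this implementation, the second terminal collects voice data and sends the voice data, together with the control command for starting the voice input function of the first terminal, to the first terminal, which adds a further way of implementing the embodiments of this application.
In a possible design, in a second implementation of the second aspect of the embodiments of this application, after the second terminal sends the first message to the first terminal, the method further includes: the second terminal executes an operation instruction of the user on the first terminal. This implementation provides a process in which the second terminal performs a corresponding operation according to the user's selection, which makes the steps of the embodiments of this application more complete.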
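A third aspect of the embodiments of this application provides a voice input device, including: a receiving unit, configured to receive a first message sent by a second terminal; a starting unit, configured to start a voice input function according to the first message; and a processing unit, configured to recognize acquired voice data and convert it into text information, where the voice data is acquired by the voice input function. In the embodiments of this application, simple interaction with the second terminal quickly starts the voice input function of the first terminal, which simplifies the process of starting the voice input function.
In a possible design, in a first implementation of the third aspect of the embodiments of this application, the starting unit includes: a first starting subunit, configured to start the voice input function according to the first message; and a collecting subunit, configured to collect voice data by using the voice input function. This implementation provides a process of collecting voice signals and generating voice data by using the voice input function, which improves the implementability and operability of the embodiments of this application.
In a possible design, in a second implementation of the third aspect of the embodiments of this application, the starting unit includes: a second starting subunit, configured to start the voice input function according to the first message; and a receiving subunit, configured to receive voice data by using the voice input function, where the voice data is contained in the first message. This implementation provides a process of obtaining voice data directly from the first message, which improves the implementability and operability of the embodiments of this application.
In a possible design, in a third implementation of the third aspect of the embodiments of this application, the device further includes: a detecting unit, configured to detect an operation instruction triggered on the first terminal; and an editing unit, configured to edit the text information according to the operation instruction. This implementation adds a process of editing the converted text information, which improves the implementability and completeness of the embodiments of this application.
In a possible design, in a fourth implementation of the third aspect of the embodiments of this application, the device further includes: a display unit, configured to display the text information. This implementation adds a process of displaying the converted text information, which makes the embodiments of this application more logically complete.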
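A fourth aspect of the embodiments of this application provides a voice input device, including: a detecting unit, configured to detect an operation instruction triggered by a user on the second terminal; a determining unit, configured to determine whether the operation instruction is an instruction to start a voice input function; and a sending unit, configured to send a first message to a first terminal if the operation instruction is an instruction to start the voice input function, where the first message is used to start the voice input function. In the embodiments of this application, simple interaction with the first terminal enables the first terminal to quickly start the voice input function, which simplifies the process of starting the voice input function.
In a possible design, in a first implementation of the fourth aspect of the embodiments of this application, the device further includes: a collecting unit, configured to collect voice data; and a generating unit, configured to generate the first message, where the first message carries the collected voice data. In this implementation, the second terminal collects voice data and sends the voice data, together with the control command for starting the voice input function of the first terminal, to the first terminal, which adds a further way of implementing the embodiments of this application.
In a possible design, in a second implementation of the fourth aspect of the embodiments of this application, the device further includes: an executing unit, configured to execute an operation instruction of the user on the first terminal. This implementation provides a process in which the second terminal performs a corresponding operation according to the user's selection, which makes the steps of the embodiments of this application more complete.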
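A fifth aspect of the embodiments of this application provides a voice input device, including a processor, a power module, a sensing module, a control module, a communication module, and a storage module that are connected through a bus. The processor performs the various functions of the device and processes data by running or executing a software program or instruction set stored in the storage module and calling data stored in the storage module; the power module is used for supplying power to the device and for power management; the sensing module is used to receive input numeric or character information and to generate signal inputs related to user settings and function control of the device; the control module is used to generate control commands; the communication module is used to communicate with a network and other devices through wireless communication; and the storage module is used to store software programs, instruction sets, and data. In the embodiments of this application, simple interaction with the first terminal enables the first terminal to quickly start the voice input function, which simplifies the process of starting the voice input function.
A sixth aspect of the embodiments of this application provides a voice input device, including a processor, a power module, an audio module, a display module, a communication module, an input module, and a storage module that are connected through a bus. The processor performs the various functions of the device and processes data by running or executing a software program or instruction set stored in the storage module and calling data stored in the storage module; the power module is used for supplying power to the device and for power management; the audio module is used to convert collected sound signals into audio data, and can also transmit the electrical signal converted from received audio data to a speaker, which converts it into a sound signal for output; the display module is used to display the converted text information, information input by the user or provided to the user, and the various menus of the device; the communication module is used to communicate with a network and other devices through wireless communication; the input module is used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device; and the storage module is used to store software programs, instruction sets, and data. In the embodiments of this application, simple interaction with the second terminal quickly starts the voice input function of the first terminal, which simplifies the process of starting the voice input function.
A seventh aspect of the embodiments of this application provides a computer readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
An eighth aspect of the embodiments of this application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
In the technical solutions provided in the embodiments of this application, a first terminal receives a first message sent by a second terminal; the first terminal starts a voice input function according to the first message; and the first terminal recognizes acquired voice data and converts it into text information, where the voice data is acquired by the voice input function. In the embodiments of this application, simple interaction with the second terminal quickly starts the voice input function of the first terminal, which simplifies the process of starting the voice input function.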
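FIG. 1 is a schematic diagram of a system architecture according to an embodiment of this application;
FIG. 2 is a schematic diagram of an embodiment of a voice input method in the embodiments of this application;
FIG. 3 is a schematic diagram of another embodiment of a voice input method in the embodiments of this application;
FIG. 4 is a schematic diagram of a circle operation among the touch operation instructions in the embodiments of this application;
FIG. 5 is a schematic diagram of a click operation among the touch operation instructions in the embodiments of this application;
FIG. 6 is a schematic diagram of a double-click operation among the touch operation instructions in the embodiments of this application;
FIG. 7 is a schematic diagram of a line-drawing operation among the touch operation instructions in the embodiments of this application;
FIG. 8 is a schematic diagram of an embodiment of a voice input device in the embodiments of this application;
FIG. 9 is a schematic diagram of another embodiment of a voice input device in the embodiments of this application;
FIG. 10 is a schematic diagram of another embodiment of a voice input device in the embodiments of this application;
FIG. 11 is a schematic diagram of another embodiment of a voice input device in the embodiments of this application;
FIG. 12 is a schematic diagram of interaction between voice input devices in the embodiments of this application.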
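The embodiments of this application provide a voice input method and a related device for simplifying the process of starting a voice input function.
To help a person skilled in the art better understand the solutions of this application, the following describes the embodiments of this application with reference to the accompanying drawings.
The terms "first", "second", "third", "fourth", and the like (if any) in the specification, claims, and accompanying drawings of this application are used to distinguish between similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
In the system architecture shown in FIG. 1, the first terminal and the second terminal may be connected via Bluetooth or in another manner, which is not limited herein, so that the first terminal and the second terminal can communicate with each other. The first terminal may be a device such as a notebook computer, a tablet computer, or a mobile phone, and the second terminal may be a device equipped with a start switch, such as a stylus, a voice recorder, or a wearable device; the start switch is used to start the voice input function of the connected first terminal. The user starts the voice input function on the first terminal through the second terminal, which simplifies the start-up steps and enables efficient, fast voice input, so that the user obtains the text information converted from the voice data.
For ease of understanding, the specific procedure of the embodiments of this application is described below. Referring to FIG. 2, an embodiment of a voice input method in the embodiments of this application includes: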
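201. The second terminal detects an operation instruction triggered by the user on the second terminal.
When the second terminal senses that the user is operating on it, the second terminal detects the operation instruction triggered by the user on the second terminal.
Optionally, the second terminal detects the user's operation instruction through a built-in sensing module. The sensing module can take various forms, and the detection method differs with the form. For example, when the sensing module is a switch button and the second terminal is a stylus, the switch button is installed on the cap or the body of the stylus. When the user presses the switch button on the cap or the body, the sensing module detects the press and transmits the corresponding operation instruction to the processor of the stylus. When the sensing module is a touch panel and the second terminal is a stylus, the touch panel is mounted on the cap or the body of the stylus. When the user touches the cap or the body, the touch panel detects the touch operation and transmits the corresponding operation instruction to the processor of the stylus. It should be noted that each operation instruction may correspond to one user action; the operation instruction may be processed by the hardware of the device, and the control instruction described below may also be processed by the hardware of the device.
It can be understood that, when the sensing module is a touch panel, the touch panel can detect different control commands, and different touch actions of the user on the touch panel can be set as different control commands. For example, the user's click operation, double-click operation, and long-press operation can be set as a click control command, a double-click control command, and a long-press control command, respectively. The touch panel may be, for example, a resistive touch panel or a capacitive touch panel, which is not limited herein.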
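The mapping from touch actions to control commands described above could be sketched roughly as follows. This is only an illustration: the timing thresholds, the `ControlCommand` enum, and the `classify_touches` function are hypothetical names and values introduced here, since the source does not specify how the touch panel distinguishes a click, a double-click, and a long press.

```python
from enum import Enum
from typing import List, Tuple

# Hypothetical thresholds; the source does not give any timing values.
LONG_PRESS_SECONDS = 0.8
DOUBLE_CLICK_GAP_SECONDS = 0.3


class ControlCommand(Enum):
    CLICK = "click"
    DOUBLE_CLICK = "double_click"
    LONG_PRESS = "long_press"


def classify_touches(touches: List[Tuple[float, float]]) -> ControlCommand:
    """Map a burst of (press_time, release_time) pairs to one control command."""
    if not touches:
        raise ValueError("at least one touch is required")
    first_press, first_release = touches[0]
    # A single contact held long enough counts as a long press.
    if first_release - first_press >= LONG_PRESS_SECONDS:
        return ControlCommand.LONG_PRESS
    # Two short contacts separated by a small gap count as a double-click.
    if len(touches) >= 2:
        second_press, _ = touches[1]
        if second_press - first_release <= DOUBLE_CLICK_GAP_SECONDS:
            return ControlCommand.DOUBLE_CLICK
    return ControlCommand.CLICK
```

For instance, `classify_touches([(0.0, 0.1), (0.25, 0.35)])` would be treated as a double-click under these assumed thresholds.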
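202. The second terminal determines whether the operation instruction is an instruction to start the voice input function.
After detecting the user's operation instruction, the second terminal determines whether the detected operation instruction is an instruction to start the voice input function.
It should be noted that, after the second terminal detects the user's operation instruction through the built-in sensing module, the criterion applied depends on the form of the sensing module. For example, when the sensing module is a switch button: when the user closes the switch button, the switch circuit in which the button is located closes, and the second terminal determines that the closing instruction is an instruction to start the voice input function; when the user opens the switch button, the switch circuit opens, and the second terminal determines that the opening instruction is an instruction to stop the voice input function. As another example, when the sensing module is a touch panel: when the user clicks the touch panel, the second terminal determines that the click instruction is an instruction to start the normal voice input function, which converts voice into text; when the user double-clicks the touch panel, the second terminal determines that the double-click instruction is an instruction to start the meeting voice input function, which converts voice into text and typesets it according to a meeting minutes template for presentation to the user. The second terminal may also be configured with other criteria, which are not limited herein.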
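The two example criteria in step 202 can be written as a small decision table. The sketch below is illustrative only; the `Action` enum, the gesture strings, and the function names are assumptions made here rather than anything defined in the source.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    START_VOICE_INPUT = "start_voice_input"
    START_MEETING_VOICE_INPUT = "start_meeting_voice_input"
    STOP_VOICE_INPUT = "stop_voice_input"


# Assumed gesture names; other gestures are left to "other corresponding operations".
TOUCH_RULES = {
    "click": Action.START_VOICE_INPUT,
    "double_click": Action.START_MEETING_VOICE_INPUT,
}


def judge_switch(circuit_closed: bool) -> Action:
    """Switch-button criterion: closed starts voice input, open stops it."""
    return Action.START_VOICE_INPUT if circuit_closed else Action.STOP_VOICE_INPUT


def judge_touch(gesture: str) -> Optional[Action]:
    """Touch-panel criterion: return None when the gesture is not a start instruction."""
    return TOUCH_RULES.get(gesture)
```

Keeping the rules in a dictionary reflects the statement that other criteria may be configured: adding a criterion would only mean adding an entry.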
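203. If the operation instruction is an instruction to start the voice input function, the second terminal sends a first message to the first terminal, where the first message is used to start the voice input function.
After the first terminal and the second terminal establish a connection, the second terminal determines, according to a preset standard, whether the detected operation instruction is an instruction to start the voice input function. If it is, the second terminal generates the first message according to a preset rule and sends it to the first terminal; the first message is used to start the voice input function, and the first terminal receives the first message sent by the second terminal. If the operation instruction is not an instruction to start the voice input function, other corresponding operations are performed, which are not limited herein.
Optionally, the second terminal generates a control command through the control module, the processor transmits the control instruction to the communication module, and the communication module encapsulates the control instruction in the first message and transmits it to the first terminal paired with the second terminal, such as a tablet computer or a smart phone. If the voice data collected by the second terminal also needs to be sent to the first terminal, the second terminal encapsulates the collected voice data in the first message.
It can be understood that, before the first terminal receives the first message sent by the second terminal, the first terminal and the second terminal need to establish a connection. The two terminals can be connected in various ways, for example, through a wired network, a wireless network, or Bluetooth; the connection manner can be selected according to the actual situation and is not limited herein.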
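204. The first terminal starts the voice input function according to the first message.
After receiving the first message, the first terminal parses it to obtain the control instruction carried in the first message, and starts the voice input function of the first terminal according to that control instruction; the control instruction is generated by the second terminal.
Optionally, if the first message that the first terminal receives through its communication module carries voice data, the first terminal needs to extract the voice data from the first message.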
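The encapsulation in step 203 and the parsing in step 204 could look something like the sketch below. The wire format (a 4-byte length prefix, a JSON header, and an optional raw audio payload) and the names `build_first_message` and `parse_first_message` are assumptions made for illustration; the source does not define the layout of the first message.

```python
import json
import struct
from typing import Optional, Tuple


def build_first_message(control_command: str,
                        voice_data: Optional[bytes] = None) -> bytes:
    """Second terminal: wrap the control command and optional voice data."""
    header = json.dumps({
        "command": control_command,
        "has_voice": voice_data is not None,
    }).encode("utf-8")
    payload = voice_data if voice_data is not None else b""
    # Big-endian 4-byte header length, then the header, then the audio bytes.
    return struct.pack(">I", len(header)) + header + payload


def parse_first_message(message: bytes) -> Tuple[str, Optional[bytes]]:
    """First terminal: recover the control command and any carried voice data."""
    (header_len,) = struct.unpack(">I", message[:4])
    header = json.loads(message[4:4 + header_len].decode("utf-8"))
    payload = message[4 + header_len:]
    voice_data = payload if header["has_voice"] else None
    return header["command"], voice_data
```

The explicit `has_voice` flag lets the first terminal tell "no voice data carried" apart from "an empty recording", which matters for choosing how voice data is acquired afterwards.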
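205. The first terminal recognizes the acquired voice data and converts it into text information.
The first terminal recognizes the acquired voice data and converts it into text information, where the voice data is acquired by the voice input function.
It should be noted that the first terminal can acquire voice data in two ways. Way 1: after enabling the voice input function, the first terminal collects the voice around it through a built-in voice module. Way 2: the first terminal extracts the voice data from the received first message, that is, the first terminal receives the voice data from the second terminal. The choice depends on the actual situation of the first terminal and is not limited herein.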
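The choice between the two acquisition ways can be expressed as a simple fallback. In this sketch, `acquire_voice_data` and the `record_from_microphone` callback are hypothetical names; the callback stands in for whatever built-in voice module the first terminal actually uses.

```python
from typing import Callable, Optional


def acquire_voice_data(voice_in_message: Optional[bytes],
                       record_from_microphone: Callable[[], bytes]) -> bytes:
    """Return voice data carried in the first message, else record locally."""
    if voice_in_message is not None:
        # Way 2: the second terminal already collected the audio.
        return voice_in_message
    # Way 1: collect the voice around the first terminal.
    return record_from_microphone()
```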
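For example, the first terminal can recognize the voice data and convert it into text information by using the voice input function. Optionally, the first terminal extracts keywords from the voice data by using speech recognition technology, matches them against preset reference information in a database, and converts the successfully matched voice into text information. It can be understood that the accuracy of the speech recognition technology currently used on the first terminal cannot yet reach 100 percent, so errors may occur when voice data is converted into text information. Therefore, after the voice input function converts the voice data into text information, the user may need to edit the text information. The language of the text information can be set according to the actual situation; for example, the text information produced by the voice input function can use languages including but not limited to Chinese and English.
206. The first terminal displays the text information.
The first terminal displays the obtained text information.
After completing recognition of the voice data, the first terminal transmits the obtained text information to the display module, and the display module displays the text information. For example, the first terminal displays the text information on its screen. Optionally, the text information can also be laid out in a particular typesetting style according to a preset environment before being presented to the user on the screen.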
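207. The second terminal executes the user's operation instruction on the first terminal.
When the voice data has been converted into text information and displayed to the user on the screen, the user may find that the converted text is not what was intended. The user then performs operations on the text information with the second terminal, and the second terminal executes the user's operation instruction on the first terminal. For example, a user who is dissatisfied with the displayed text selects the target text information to be modified, for instance by drawing a line under it or circling it.
208. The first terminal detects an operation instruction triggered on the first terminal.
The first terminal detects an operation instruction triggered on the first terminal. The operation instruction may be the user using the second terminal (which may be a stylus) to draw a circle, single-click, double-click, draw a line, or perform another touch operation on the touch panel of the first terminal (a tablet computer, smartphone, or similar device). It can be understood that the choice of touch operation is flexible, as long as a similar effect is achieved.
For example, when the first terminal is a tablet computer and the second terminal is a stylus, the touch panel of the tablet computer detects the stylus input, collects the pressure-touch information of the touch input, including the pressure value and the coordinate information, and processes the pressure-touch information according to preset rules. Optionally, the preset rules may be as follows: when the touch panel of the tablet computer detects a line-drawing operation instruction, the text within the corresponding coordinate range is deleted; when it detects a circle-drawing operation instruction, the text within the corresponding coordinate range is deleted and the cursor is moved to the position of the deleted text, so that the user can fill in the correct text with the stylus or by invoking the tablet computer's input method; when it detects a single-click operation instruction, the text within the corresponding coordinate range is highlighted; when it detects a double-click operation instruction, the text within the corresponding coordinate range is copied to the clipboard. Other preset rules are also possible and are not limited here.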
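The preset rules above can be sketched as a small text buffer that applies one edit per gesture. The class, the gesture names, and the use of character indices are assumptions; mapping touch coordinates to a character range is taken as already done and is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class TextBuffer:
    text: str
    cursor: int = 0
    highlights: list = field(default_factory=list)
    clipboard: str = ""

    def apply(self, gesture: str, start: int, end: int) -> None:
        """Apply the preset rule for `gesture` to the characters in [start, end)."""
        if gesture in ("line", "circle"):
            # Both gestures delete the selected text.
            self.text = self.text[:start] + self.text[end:]
            if gesture == "circle":
                # Leave the cursor where the correction should be typed.
                self.cursor = start
        elif gesture == "single_click":
            self.highlights.append((start, end))
        elif gesture == "double_click":
            self.clipboard = self.text[start:end]
        else:
            raise ValueError(f"no rule for gesture {gesture!r}")
```

For instance, circling the misrecognized word in "hello wrold today" (indices 6 to 11) removes it and places the cursor at index 6, ready for the user to write the correct word.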
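It should be noted that the first terminal detects the operation instruction of the second terminal through a built-in input module, which sends the detected operation instruction to the processor. The input module of the first terminal may be a touch panel, which may be of various types such as resistive, capacitive, infrared, or surface acoustic wave; the touch panel collects the user's operation instruction and sends it to the processor.
209. The first terminal edits the text information according to the operation instruction.
The first terminal edits the text information according to the operation instruction, so that the text information presented to the user is the text information the user needs.
For example, if the operation instruction detected by the first terminal is a delete operation instruction (that is, a line-drawing operation instruction), the first terminal deletes the target text information selected by the user. If the operation instruction detected by the first terminal is a copy operation instruction (that is, a double-click operation instruction), the first terminal copies the target text information selected by the user. Other editing operations can also be configured according to the preset rules in the preceding steps and the actual situation, which is not limited here.
Optionally, when the first terminal is a tablet computer and the second terminal is a stylus, the sensing module of the stylus includes a distance sensor. When the user brings the stylus close to the tablet computer, the processor of the stylus issues an instruction, the control module generates a control command, and the processor passes the instruction to the communication module, which transmits the control command over Bluetooth to the tablet computer paired with the stylus. The tablet computer magnifies the text in the area the stylus points to, making it easier for the user to quickly select the text information to be edited.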
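A rough sketch of the proximity trigger on the stylus side. The distance threshold, the magnification factor, and the shape of the command are all assumptions; the source says only that approaching the tablet causes a control command to be sent and the pointed-at text to be magnified.

```python
PROXIMITY_THRESHOLD_MM = 10.0  # hypothetical; the source gives no distance
ZOOM_FACTOR = 1.5              # hypothetical magnification


def proximity_command(distance_mm: float):
    """Return the control command to send over Bluetooth, or None if too far."""
    if distance_mm <= PROXIMITY_THRESHOLD_MM:
        return {"command": "MAGNIFY", "factor": ZOOM_FACTOR}
    return None
```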
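In this embodiment of the application, a simple interaction such as pressing the switch of the second terminal quickly starts the voice input function of the first terminal, and the text converted from the voice data is processed according to preset rules, making the interaction simpler and more efficient.
The first terminal in the embodiments of this application is a terminal device with a voice input function, and the second terminal is an external device with a start function, where the start function is used to start the voice input function on the terminal device connected to the external device. Terminal devices include but are not limited to notebook computers, tablet computers, and smartphones; external devices include but are not limited to styluses, recording pens, and wearable devices.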
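As shown in FIG. 3, another embodiment of the voice input method in the embodiments of this application includes:
301. The external device detects an operation instruction triggered by the user on the external device.
The external device senses that the user is operating it and detects the operation instruction triggered by the user on the external device.
The terminal device may be a mobile phone, a notebook computer, a tablet computer, or the like, and the external device may be a recording pen, a stylus, a wearable device, or the like. The terminal device and the external device can be combined in various ways, for example: combination 1, the terminal device is a mobile phone and the external device is a recording pen; combination 2, the terminal device is a mobile phone and the external device is a stylus; combination 3, the terminal device is a tablet computer and the external device is a stylus. Other combinations are also possible and are not limited here.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. The recording pen detects the user's operation instruction through a built-in sensing module. The sensing module can take several forms, and the detection method differs with the form. In this embodiment, when the sensing module is a switch button, the switch button is mounted on the cap or body of the recording pen. When the user presses the switch button on the cap or body, the sensing module captures the press and transmits it to the processor of the recording pen. When the sensing module is a touch panel, the touch panel is mounted on the cap or body of the recording pen. When the user touches the cap or body, the touch panel captures the touch operation and transmits it to the processor of the recording pen. For ease of understanding, the sensing module of the recording pen in this embodiment takes the form of a touch panel.
It can be understood that the touch panel of the recording pen can detect different control commands, and different touch actions by the user on the touch panel can be set as different control commands. For example, the user's single-click, double-click, and long-press operations can be set as a single-click control command, a double-click control command, and a long-press control command, respectively. The touch panel of the recording pen is a capacitive touch panel.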
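302. The external device determines whether the operation instruction is an instruction to start the voice input function.
After detecting the user's operation instruction, the external device determines whether the detected operation instruction is an instruction to start the voice input function.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. The sensing module of the recording pen is a touch panel. When the user single-clicks the touch panel, the recording pen determines that the single-click instruction is an instruction to start the ordinary voice input function, which converts speech into text; when the user double-clicks the touch panel, the recording pen determines that the double-click instruction is an instruction to start the meeting voice input function, which converts speech into text, lays it out according to a meeting-minutes template, and presents it to the user.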
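303. If the operation instruction is an instruction to start the voice input function, the external device sends a first message to the terminal device, where the first message is used to start the voice input function.
If the operation instruction is an instruction to start the voice input function, the external device sends a first message to the terminal device, where the first message is used to start the voice input function.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. After the mobile phone and the recording pen establish a connection, the recording pen determines, according to a preset criterion, whether the detected operation instruction is an instruction to start the voice input function. If it is, the recording pen generates the first message according to a preset rule and sends it to the mobile phone; the first message is used to start the voice input function. If the operation instruction is not an instruction to start the voice input function, other corresponding operations are performed, which are not limited here.
It should be noted that the recording pen generates a control command through its control module, the processor transmits the control instruction to the communication module, and the communication module encapsulates the control instruction in the first message and transmits it to the mobile phone. In this embodiment, the recording pen encapsulates the collected voice data in the first message and sends the first message carrying the voice data to the mobile phone. In this embodiment, the mobile phone and the recording pen are connected over Bluetooth.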
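304. The terminal device starts the voice input function according to the first message.
The terminal device starts the voice input function according to the first message.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. After receiving the first message, the mobile phone parses it, extracts from it the voice data and the control instruction generated by the recording pen, and starts the voice input function of the mobile phone according to the control instruction carried in the first message.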
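305. The terminal device recognizes the obtained voice data and converts it into text information.
The terminal device recognizes the obtained voice data and converts it into text information, where the voice data is obtained by the voice input function.
It should be noted that, in this embodiment, the terminal device obtains the voice data by extracting it from the received first message, that is, the terminal device receives the voice data from the external device.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. The mobile phone needs to recognize the voice data and convert it into text information through the voice input function. The mobile phone uses speech recognition technology to extract keywords from the voice data, matches them against preset reference information in a database, and converts the successfully matched speech into text information.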
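306. The terminal device displays the text information.
The terminal device displays the obtained text information on its screen.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. After completing recognition of the voice data, the mobile phone transmits the obtained text information to the display module, which displays it; that is, the mobile phone displays the text information on its screen.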
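307. The external device executes the user's operation instruction on the terminal device.
The external device executes the user's operation instruction on the terminal device.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. When the voice data has been converted into text information and displayed to the user on the screen, the user may find that the converted text is not what was intended. The user then performs operations on the text information with the recording pen, and the recording pen executes the user's operation instruction on the mobile phone. For example, a user who is dissatisfied with the displayed text selects the target text information to be modified, for instance by drawing a line under it or circling it.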
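308. The terminal device detects an operation instruction triggered on the terminal device.
The terminal device detects an operation instruction triggered on the terminal device. The operation instruction may be the user using the external device to draw a circle, single-click, double-click, draw a line, or perform another touch operation on the touch panel of the terminal device. Among the touch operation instructions, the circle-drawing operation is shown in FIG. 4, the single-click operation in FIG. 5, the double-click operation in FIG. 6, and the line-drawing operation in FIG. 7.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. The touch panel of the mobile phone detects the input of the recording pen, collects the pressure-touch information of the touch input, including the pressure value and the coordinate information, and processes the pressure-touch information according to preset rules. In this embodiment, the preset rules are as follows: when the touch panel of the mobile phone detects a line-drawing operation instruction, the text within the corresponding coordinate range is deleted; when it detects a circle-drawing operation instruction, the text within the corresponding coordinate range is deleted and the cursor is moved to the position of the deleted text, so that the user can fill in the correct text with the recording pen or by invoking the mobile phone's input method; when it detects a single-click operation instruction, the text within the corresponding coordinate range is highlighted; when it detects a double-click operation instruction, the text within the corresponding coordinate range is copied to the clipboard.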
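309. The terminal device edits the text information according to the operation instruction.
The terminal device edits the text information according to the operation instruction, so that the text information presented to the user is the text information the user needs.
For example, the terminal device may be a mobile phone and the external device may be a recording pen. If the operation instruction detected by the mobile phone is a delete operation instruction (that is, a line-drawing operation instruction), the mobile phone deletes the target text information selected by the user. If the operation instruction detected by the mobile phone is a copy operation instruction (that is, a double-click operation instruction), the mobile phone copies the target text information selected by the user.
In this embodiment of the application, a touch interaction on the touch panel of the external device quickly starts the voice input function of the terminal device, and the text converted from the voice data is processed according to preset rules, making the interaction simpler and more efficient.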
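The voice input method in the embodiments of this application has been described above. The voice input device (that is, the first terminal) in the embodiments of this application is described below. Referring to FIG. 8, one embodiment of the voice input device in the embodiments of this application includes:
a receiving unit 801, configured to receive a first message sent by a second terminal;
a starting unit 802, configured to start a voice input function according to the first message;
a processing unit 803, configured to recognize obtained voice data and convert it into text information, where the voice data is obtained by the voice input function.
Optionally, the starting unit 802 may further include:
a first starting subunit 8021, configured to start the voice input function according to the first message;
a collecting subunit 8022, configured to collect voice data through the voice input function.
Optionally, the starting unit 802 may further include:
a second starting subunit 8023, configured to start the voice input function according to the first message;
a receiving subunit 8024, configured to receive voice data through the voice input function, where the voice data is contained in the first message.
Optionally, the voice input device may further include:
a display unit 804, configured to display the text information.
Optionally, the voice input device may further include:
a detection unit 805, configured to detect an operation instruction triggered on the first terminal;
an editing unit 806, configured to edit the text information according to the operation instruction.
In this embodiment of the application, an interaction between the user and the second terminal quickly starts the voice input function of the first terminal, and the text converted from the voice data is processed according to preset rules, making the interaction simpler and more efficient.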
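One way to picture how units 801 to 804 cooperate is a single class with one method per unit. This is only a sketch under stated assumptions: the first message is modeled as a dict, the command string is hypothetical, and `recognize` is a stand-in callable for whatever speech recognition engine the first terminal uses.

```python
class FirstTerminal:
    def __init__(self, recognize):
        self.recognize = recognize  # hypothetical speech-to-text callable
        self.voice_input_on = False
        self.displayed_text = ""

    def receive(self, first_message: dict) -> None:  # receiving unit 801
        self.start(first_message)
        voice = first_message.get("voice")
        if voice is not None:
            self.process(voice)

    def start(self, first_message: dict) -> None:    # starting unit 802
        if first_message.get("command") == "START_VOICE_INPUT":
            self.voice_input_on = True

    def process(self, voice: bytes) -> None:         # processing unit 803
        # Recognition only runs once the voice input function is on.
        if self.voice_input_on:
            self.display(self.recognize(voice))

    def display(self, text: str) -> None:            # display unit 804
        self.displayed_text = text
```

Passing the recognizer in as a callable keeps the sketch independent of any particular speech recognition library.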
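Referring to FIG. 9, another embodiment of the voice input device (that is, the second terminal) in the embodiments of this application includes:
a detection unit 901, configured to detect an operation instruction triggered by the user on the second terminal;
a judging unit 902, configured to determine whether the operation instruction is an instruction to start the voice input function;
a sending unit 903, configured to send a first message to the first terminal if the operation instruction is an instruction to start the voice input function, where the first message is used to start the voice input function.
Optionally, the voice input device may further include:
a collecting unit 904, configured to collect voice data;
a generating unit 905, configured to generate the first message, where the first message carries the collected voice data.
Optionally, the voice input device may further include:
an executing unit 906, configured to execute the user's operation instruction on the first terminal.
In this embodiment of the application, an interaction between the user and the second terminal quickly starts the voice input function of the first terminal and causes the first terminal to process the text converted from the voice data according to preset rules, making the interaction simpler and more efficient.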
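The second terminal's units can be sketched the same way. The instruction strings, the dict-shaped first message, and the injected `send` and `capture_voice` callables are assumptions for illustration; a real device would hand the message to its communication module.

```python
class SecondTerminal:
    def __init__(self, send, capture_voice=None):
        self.send = send                    # stand-in for the communication module
        self.capture_voice = capture_voice  # optional voice capture callable

    def on_operation(self, instruction: str) -> bool:    # detection unit 901
        if not self.is_start_instruction(instruction):
            return False
        self.send(self.generate_first_message())          # sending unit 903
        return True

    @staticmethod
    def is_start_instruction(instruction: str) -> bool:  # judging unit 902
        return instruction in ("switch_closed", "single_click", "double_click")

    def generate_first_message(self) -> dict:            # generating unit 905
        message = {"command": "START_VOICE_INPUT"}
        if self.capture_voice is not None:                # collecting unit 904
            message["voice"] = self.capture_voice()
        return message
```

For example, constructing `SecondTerminal(sent.append, capture_voice=lambda: b"pcm")` with `sent = []` and calling `on_operation("single_click")` appends one message carrying both the start command and the captured voice data.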
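FIG. 8 and FIG. 9 above describe the voice input devices in the embodiments of this application in detail from the perspective of modular functional entities. The following describes the voice input devices in detail from the perspective of hardware processing. Referring to FIG. 10, one embodiment of the voice input device in the embodiments of this application includes:
FIG. 10 is a schematic structural diagram of an external device provided in an embodiment of this application. The external device 1000 includes a processor 1010, a power module 1020, a sensing module 1030, a control module 1040, a communication module 1050, and a storage module 1060.
The processor 1010 is the control center of the external device. It connects the parts of the external device through various interfaces and lines, and performs the functions of the external device and processes data by running or executing the software programs or instruction sets stored in the storage module 1060 and by calling the data stored in the storage module 1060, thereby monitoring the external device as a whole.
The power module 1020 is connected to the processor 1010 and is used to supply power to the external device and manage that power. Specifically, the power module 1020 may include a power management system, one or more power sources (for example, a battery or alternating current), a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator (for example, a light-emitting diode), and any other components associated with the generation, management, and distribution of power in the external device.
The sensing module 1030 can be used to receive input numeric or character information and to generate signal inputs related to the user settings and function control of the external device. Specifically, the sensing module 1030 may be a switch button, which captures the user's action of turning the switch on or off. The sensing module 1030 may also be a resistive or capacitive touch panel, which captures the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable body part or object). The sensing module 1030 may also be a capacitive pen tip, which captures the characters the user writes with the stylus on a terminal device such as a tablet computer, a smartphone, or a wearable device. The sensing module 1030 may also be a combination of a switch button, a touch panel, a capacitive pen tip, or other input devices.
The control module 1040 can be used to generate control commands, which can be used to control the external device to execute commands, and can also be used to control the terminal device and other devices to execute commands.
The communication module 1050 can communicate with networks and other devices through wireless communication. The network may be the Internet, an intranet, and/or a wireless network (for example, a cellular telephone network, a wireless local area network, and/or a metropolitan area network). The wireless communication may use any of a variety of communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), High-Speed Downlink Packet Access (HSDPA), High-Speed Uplink Packet Access (HSUPA), Bluetooth, Wireless Fidelity (Wi-Fi) (for example, IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), or any other suitable communication protocol.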
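FIG. 11 is a schematic structural diagram of a terminal device provided in an embodiment of this application. The terminal device 1100 includes a processor 1110, a power module 1120, an audio module 1130, a display module 1140, a communication module 1150, an input module 1160, and a storage module 1170. These components communicate through one or more communication buses or signal lines.
The processor 1110 is the control center of the terminal device. It connects the parts of the terminal device through various interfaces and lines, and performs the functions of the terminal device and processes data by running or executing the software programs or instruction sets stored in the storage module 1170 and by calling the data stored in the storage module 1170, thereby monitoring the terminal device as a whole.
The power module 1120 is connected to the processor 1110 and is used to supply power to the terminal device and manage that power. Specifically, the power module 1120 may include a power management system, one or more power sources (for example, a battery or alternating current), a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator (for example, a light-emitting diode), and any other components associated with the generation, management, and distribution of power in the terminal device.
The audio module 1130 may include a speaker and a microphone. The audio module 1130 can transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; conversely, the microphone converts collected sound signals into electrical signals, which the audio module 1130 receives and converts into audio data.
The display module 1140 can be used to display information entered by the user or provided to the user, as well as the various menus of the terminal device. The display module 1140 may include a display panel, which may optionally be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Optionally, the input module 1160 may cover the display module 1140; for example, the touch panel of the input module covers the LCD display panel. When the touch panel detects an operation of the external device on or near it, it passes the operation to the processor 1110, and the processor 1110 then provides the corresponding visual output on the display module according to the operation of the external device. The visual output includes text, graphics, icons, video, and any combination thereof.
The communication module 1150 can communicate with networks and other devices through wireless communication. The network may be the Internet, an intranet, and/or a wireless network (for example, a cellular telephone network, a wireless local area network, and/or a metropolitan area network). The wireless communication may use any of a variety of communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), High-Speed Downlink Packet Access (HSDPA), High-Speed Uplink Packet Access (HSUPA), Bluetooth, Wireless Fidelity (Wi-Fi) (for example, IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), or any other suitable communication protocol. For example, the communication module 1150 of the terminal device and the communication module 1050 of the external device communicate over the Bluetooth protocol.
The input module 1160 can be used to receive input numeric or character information and to generate key signal inputs related to the user settings and function control of the terminal device. Specifically, the input module 1160 may include a touch panel and other input devices, and the touch panel may be of various types such as resistive, capacitive, infrared, or surface acoustic wave. The touch panel can capture the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable body part or object) and drive the corresponding connected apparatus according to a preset program. For example, the sensing module 1030 of the external device includes a capacitive pen tip; when the user operates on the touch panel of the terminal device with the stylus, the touch panel collects the operation information of the stylus's capacitive pen tip. The interaction between the terminal device and the external device is shown in FIG. 12.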
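The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium on which a computer can store data, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
As stated above, the foregoing embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.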
Claims (20)
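- A voice input method, comprising: receiving, by a first terminal, a first message sent by a second terminal; starting, by the first terminal, a voice input function according to the first message; and recognizing, by the first terminal, obtained voice data and converting it into text information, wherein the voice data is obtained by the voice input function.
- The method according to claim 1, wherein the starting, by the first terminal, of the voice input function according to the first message comprises: starting, by the first terminal, the voice input function according to the first message; and collecting, by the first terminal, voice data through the voice input function.
- The method according to claim 1, wherein the starting, by the first terminal, of the voice input function according to the first message comprises: starting, by the first terminal, the voice input function according to the first message; and receiving, by the first terminal, voice data through the voice input function, wherein the voice data is contained in the first message.
- The method according to any one of claims 1 to 3, wherein after the first terminal recognizes the obtained voice data and converts it into text information, the method further comprises: detecting, by the first terminal, an operation instruction triggered on the first terminal; and editing, by the first terminal, the text information according to the operation instruction.
- The method according to claim 4, wherein after the first terminal recognizes the obtained voice data and converts it into text information, and before the first terminal receives the operation instruction of the second terminal, the method further comprises: displaying, by the first terminal, the text information.
- A voice input method, comprising: detecting, by a second terminal, an operation instruction triggered by a user on the second terminal; determining, by the second terminal, whether the operation instruction is an instruction to start a voice input function; and if the operation instruction is an instruction to start the voice input function, sending, by the second terminal, a first message to a first terminal, wherein the first message is used to start the voice input function.
- The method according to claim 6, wherein after the second terminal determines whether the operation instruction is an instruction to start the voice input function, and before the second terminal sends the first message to the first terminal, the method further comprises: collecting, by the second terminal, voice data; and generating, by the second terminal, the first message, wherein the first message carries the collected voice data.
- The method according to claim 6, wherein after the second terminal sends the first message to the first terminal, the method further comprises: executing, by the second terminal, the user's operation instruction on the first terminal.
- A voice input device, comprising: a receiving unit, configured to receive a first message sent by a second terminal; a starting unit, configured to start a voice input function according to the first message; and a processing unit, configured to recognize obtained voice data and convert it into text information, wherein the voice data is obtained by the voice input function.
- The device according to claim 9, wherein the starting unit comprises: a first starting subunit, configured to start the voice input function according to the first message; and a collecting subunit, configured to collect voice data through the voice input function.
- The device according to claim 9, wherein the starting unit comprises: a second starting subunit, configured to start the voice input function according to the first message; and a receiving subunit, configured to receive voice data through the voice input function, wherein the voice data is contained in the first message.
- The device according to any one of claims 9 to 11, wherein the device further comprises: a detection unit, configured to detect an operation instruction triggered on the first terminal; and an editing unit, configured to edit the text information according to the operation instruction.
- The device according to claim 12, wherein the device further comprises: a display unit, configured to display the text information.
- A voice input device, comprising: a detection unit, configured to detect an operation instruction triggered by a user on the second terminal; a judging unit, configured to determine whether the operation instruction is an instruction to start a voice input function; and a sending unit, configured to send a first message to a first terminal if the operation instruction is an instruction to start the voice input function, wherein the first message is used to start the voice input function.
- The device according to claim 14, wherein the device further comprises: a collecting unit, configured to collect voice data; and a generating unit, configured to generate the first message, wherein the first message carries the collected voice data.
- The device according to claim 14, wherein the device further comprises: an executing unit, configured to execute the user's operation instruction on the first terminal.
- A voice input device, comprising: a processor, a power module, a sensing module, a control module, a communication module, and a storage module connected by a bus; wherein the processor performs the functions of the device and processes data by running or executing software programs or instruction sets stored in the storage module and by calling data stored in the storage module; the power module is configured to supply power to the device and manage that power; the sensing module is configured to receive input numeric or character information and to generate signal inputs related to the user settings and function control of the device; the control module is configured to generate control commands; the communication module is configured to communicate with networks and other devices through wireless communication; and the storage module is configured to store software programs, instruction sets, and data.
- A voice input device, comprising: a processor, a power module, an audio module, a display module, a communication module, an input module, and a storage module connected by a bus; wherein the processor performs the functions of the device and processes data by running or executing software programs or instruction sets stored in the storage module and by calling data stored in the storage module; the power module is configured to supply power to the device and manage that power; the audio module is configured to convert collected sound signals into audio data, and can also transmit the electrical signal converted from received audio data to a speaker, which converts it into a sound signal for output; the display module is configured to display the converted text information, information entered by the user or provided to the user, and the various menus of the device; the communication module is configured to communicate with networks and other devices through wireless communication; the input module is configured to receive input numeric or character information and to generate key signal inputs related to the user settings and function control of the device; and the storage module is configured to store software programs, instruction sets, and data.
- A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 8.
- A computer program product comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 8.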
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201780003470.5A CN108235813A (zh) | 2017-02-28 | 2017-06-09 | 一种语音输入的方法和相关设备 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710114667 | 2017-02-28 | ||
CN201710114667.8 | 2017-02-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018157499A1 true WO2018157499A1 (zh) | 2018-09-07 |
Family
ID=63370595
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/087662 WO2018157499A1 (zh) | 2017-02-28 | 2017-06-09 | 一种语音输入的方法和相关设备 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018157499A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020021188A (ja) * | 2018-07-31 | 2020-02-06 | 田中 成典 | 自動車の姿勢推定装置 |
CN111968637A (zh) * | 2020-08-11 | 2020-11-20 | 北京小米移动软件有限公司 | 终端设备的操作模式控制方法、装置、终端设备及介质 |
CN112068793A (zh) * | 2019-06-11 | 2020-12-11 | 北京搜狗科技发展有限公司 | 一种语音输入方法及装置 |
CN113205806A (zh) * | 2021-04-02 | 2021-08-03 | 广东小天才科技有限公司 | 一种输出切换方法、鼠标、终端设备及可读存储介质 |
CN114047900A (zh) * | 2021-10-12 | 2022-02-15 | 中电金信软件有限公司 | 业务处理方法、装置、电子设备及计算机可读存储介质 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2757556A1 (en) * | 2013-01-22 | 2014-07-23 | BlackBerry Limited | Method and system for automatically identifying voice tags through user operation |
CN103943108A (zh) * | 2014-04-04 | 2014-07-23 | 广东翼卡车联网服务有限公司 | 通过方向盘控制器实现手机终端语音导航的方法及系统 |
CN104468992A (zh) * | 2014-11-26 | 2015-03-25 | 深圳市航盛电子股份有限公司 | 一种语音控制车载主机和手机的方法及系统 |
CN105991825A (zh) * | 2015-02-04 | 2016-10-05 | 中兴通讯股份有限公司 | 一种语音控制方法、装置及系统 |
CN106297786A (zh) * | 2016-08-12 | 2017-01-04 | 深圳市亚冠电子有限公司 | 一种语音功能遥控开启方法及装置 |
CN106357932A (zh) * | 2016-11-22 | 2017-01-25 | 奇酷互联网络科技(深圳)有限公司 | 一种通话信息记录方法和移动终端 |
-
2017
- 2017-06-09 WO PCT/CN2017/087662 patent/WO2018157499A1/zh active Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018157499A1 (zh) | 一种语音输入的方法和相关设备 | |
US12014223B2 (en) | Content sharing method and electronic device | |
WO2016037318A1 (zh) | 一种指纹识别方法、装置及移动终端 | |
WO2019080873A1 (zh) | 一种批注生成的方法及相关装置 | |
WO2019062910A1 (zh) | 一种复制和粘贴的方法、数据处理装置和用户设备 | |
US20150149925A1 (en) | Emoticon generation using user images and gestures | |
WO2020259024A1 (zh) | 图标分类方法、移动终端及计算机可读存储介质 | |
US20150025882A1 (en) | Method for operating conversation service based on messenger, user interface and electronic device using the same | |
KR101771071B1 (ko) | 통신 방법, 클라이언트, 및 단말 | |
WO2020238938A1 (zh) | 信息输入方法及移动终端 | |
CN111490927B (zh) | 一种显示消息的方法、装置及设备 | |
TWI551095B (zh) | 通信方法及通信系統 | |
WO2015024494A1 (zh) | 应用的分享方法及装置 | |
WO2021104160A1 (zh) | 编辑方法及电子设备 | |
WO2019233316A1 (zh) | 数据处理方法、装置、移动终端以及存储介质 | |
WO2019201109A1 (zh) | 文字处理方法、装置、移动终端及存储介质 | |
US20210165953A1 (en) | Email Translation Method and Electronic Device | |
WO2018049903A1 (zh) | 数据迁移方法及相关设备 | |
CN106959746A (zh) | 语音数据的处理方法及装置 | |
US20150153921A1 (en) | Apparatuses and methods for inputting a uniform resource locator | |
US20230054717A1 (en) | Ui control generation and trigger methods, and terminal | |
CN105871696B (zh) | 一种信息发送、接收方法及移动终端 | |
KR102117295B1 (ko) | 전자 장치의 페어링 방법 및 장치 | |
WO2017143575A1 (zh) | 对图片的内容进行检索的方法、便携式电子设备和图形用户界面 | |
WO2018049902A1 (zh) | 数据迁移方法及相关设备 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17898356 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17898356 Country of ref document: EP Kind code of ref document: A1 |