WO2018133656A1 - Method, apparatus, and voice input device for converting voice input into text input - Google Patents

Method, apparatus, and voice input device for converting voice input into text input

Info

Publication number
WO2018133656A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
text
text character
host
code
Prior art date
Application number
PCT/CN2017/120424
Other languages
English (en)
French (fr)
Inventor
黄玉玲
Original Assignee
黄玉玲
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 黄玉玲
Publication of WO2018133656A1
Priority to US16/507,051 (US11087758B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • Embodiments of the present invention relate to the field of information processing technologies, and in particular, to a method, an apparatus, and a voice input device for converting voice input into text input.
  • the keyboard position scanning module is configured to scan the position of the pressed key
  • the decoding module is configured to generate the key code according to the key position and convert the key code into an identification code recognizable by the host
  • the communication module is configured to transmit the identification code to the host (for example, a computer or a tablet) and to receive control commands from the host; the host receives the identification code and performs conversion processing to implement text input.
  • the technical problem to be solved by the embodiment of the present invention is to provide a method, an apparatus, and a voice input device for converting voice input into text input, so that speech is converted into text input without the user having to tap the keyboard manually, thereby providing convenience for the user.
  • a technical solution adopted by the embodiment of the present invention is to provide a method for converting a voice input into a text input, the method comprising:
  • the key code includes: a key code corresponding to the text character or a key code corresponding to the paste command.
  • the method further includes:
  • processing the text characters to generate a key code includes:
  • processing the text characters to generate a key code includes:
  • the converting the voice signal into a corresponding text character comprises:
  • the speech signal is identified and converted into corresponding text characters according to a preset language category.
  • an embodiment of the present invention provides a method for converting a voice input into a text input, the method comprising:
  • processing the text characters to generate a key code includes:
  • processing the text characters to generate a key code includes:
  • an embodiment of the present invention further provides an apparatus for converting a voice input into a text input, the apparatus comprising:
  • a voice acquisition module configured to acquire a voice signal
  • a voice signal sending module configured to send the voice signal to a host
  • a button code receiving module configured to receive a button code sent by the host
  • a decoding module configured to convert the key code into an identification code that can be recognized by the host
  • An identification code sending module is configured to send the identification code to the host.
  • the key code includes: a key code corresponding to the text character or a key code corresponding to the paste command.
  • the device further includes:
  • a voice receiving module configured to receive a voice signal sent by the voice input device
  • a voice recognition module configured to convert the voice signal into a corresponding text character
  • a processing module configured to process the text characters to generate a key code
  • a button code sending module configured to send the button code to the voice input device
  • the identification code receiving module is configured to receive the identification code sent by the voice input device, and implement text character input.
  • the processing module includes:
  • the first processing submodule is configured to convert the text character into a key code corresponding to the text character.
  • the processing module includes:
  • the second processing submodule is configured to paste the text character to the clipboard, and generate a key code corresponding to the paste command.
  • the voice recognition module includes:
  • a first voice recognition submodule configured to: convert the voice signal into a corresponding text character according to a language category of the host system;
  • the second voice recognition submodule is configured to convert the voice signal into a corresponding text character according to a preset language category.
  • an embodiment of the present invention further provides an apparatus for converting a voice input into a text input, the apparatus comprising:
  • a voice receiving module configured to receive a voice signal sent by the voice input device
  • a voice recognition module configured to convert the voice signal into a corresponding text character
  • a processing module configured to process the text characters to generate a key code
  • a button code sending module configured to send the button code to the voice input device
  • the identification code receiving module is configured to receive the identification code sent by the voice input device, and implement text character input.
  • the processing module includes:
  • the first processing submodule is configured to convert the text character into a key code corresponding to the text character.
  • the processing module includes:
  • the second processing submodule is configured to paste the text character to the clipboard, and generate a key code corresponding to the paste command.
  • the embodiment of the present invention further provides a voice input device, where the voice input device includes:
  • a sound signal receiving unit configured to receive a sound signal
  • a voice signal acquiring unit in communication with the sound signal receiving unit and configured to process the sound signal to obtain a voice signal
  • a sending unit in communication with the voice signal acquiring unit, configured to send the voice signal to a host;
  • a receiving unit configured to receive key code information from the host.
  • an embodiment of the present invention further provides an electronic device, including:
  • at least one processor; and a memory,
  • the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method as described above.
  • the embodiment of the present invention further provides a non-transitory computer readable storage medium, where the computer-readable storage medium stores computer-executable instructions; when the computer-executable instructions are executed by an electronic device, the electronic device is caused to perform the method as described above.
  • the embodiment of the present invention further provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions; when the program instructions are executed by the electronic device, the electronic device is caused to perform the method as described above.
  • the embodiment of the present invention obtains a voice signal and sends it to the host; the host processes the voice signal to generate a corresponding key code; the key code sent by the host is then received, and text input is implemented according to the key code, so the keyboard does not need to be tapped manually, which provides convenience for the user.
  • FIG. 1 is a flow chart of one embodiment of a method of converting speech input to text input in accordance with the present invention
  • FIG. 2 is a flow chart of one embodiment of a method for converting speech input to text input according to the present invention
  • FIG. 3 is a flow chart of one embodiment of a method for converting speech input into text input according to the present invention
  • FIG. 4 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input according to the present invention
  • FIG. 5 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input according to the present invention
  • FIG. 6 is a schematic structural diagram of a processing module in an embodiment of the apparatus for converting voice input into text input according to the present invention
  • FIG. 7 is a schematic structural diagram of a processing module in an embodiment of the apparatus for converting voice input into text input according to the present invention.
  • FIG. 8 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input according to the present invention
  • FIG. 9 is a schematic structural diagram of an embodiment of a voice input device of the present invention.
  • FIG. 10 is a schematic diagram showing the hardware structure of an electronic device for the method of converting voice input into text input according to an embodiment of the present invention.
  • the method and device for converting voice input into text input can be used when a voice signal is input through a voice input device and text input is then implemented in the electronic device; the voice input device and the electronic device can be connected by wire, wirelessly, or by Bluetooth.
  • the electronic device (i.e., the host described below)
  • the voice input device can be a keyboard with a voice function, a mouse, a Bluetooth headset, a microphone, an IP camera (network camera), or a camera microphone.
  • an embodiment of the present invention provides a method for converting a voice input into a text input.
  • the method may be used on a voice input device side, and the method includes:
  • Step 101 Acquire a voice signal
  • a microphone can be set on a voice input device such as a keyboard to receive the sound signal, and either a dedicated button is added or an existing key on the keyboard is used; recording starts when the button is pressed and stops when it is released, which yields a voice signal.
  • when the voice input device is a mouse, a microphone can be set on the mouse to receive the sound signal, and either a dedicated button is added or an existing button on the mouse is used; recording starts when the button is pressed and stops when it is released, which yields a voice signal.
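The press-to-record behaviour described for Step 101 can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `PushToTalkRecorder`, its method names, and the representation of audio as byte frames are all hypothetical.

```python
from typing import List


class PushToTalkRecorder:
    """Hypothetical push-to-talk recorder; names and framing are assumptions."""

    def __init__(self) -> None:
        self._recording = False
        self._frames: List[bytes] = []

    def on_button_press(self) -> None:
        # Pressing the button starts a new recording.
        self._recording = True
        self._frames = []

    def on_audio_frame(self, frame: bytes) -> None:
        # Microphone frames are kept only while the button is held down.
        if self._recording:
            self._frames.append(frame)

    def on_button_release(self) -> bytes:
        # Releasing the button stops recording and yields the voice signal.
        self._recording = False
        return b"".join(self._frames)
```

Frames that arrive while the button is not held are simply dropped, so the returned voice signal covers only the interval between press and release.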
  • Step 102 Send the voice signal to a host.
  • the host refers to an electronic device such as a computer or a tablet computer.
  • a sending unit may be disposed in the voice input device, such as a keyboard or a mouse, for transmitting the voice signal to the host by wire, wirelessly, or by Bluetooth.
  • Step 103 Receive a key code sent by the host
  • after receiving the voice signal sent by the voice input device, the host converts the voice signal into a corresponding text character.
  • the host may convert the text character into a key code corresponding to the text character and then send the key code to the voice input device; alternatively, the host pastes the text character directly to the clipboard, generates a key code corresponding to the paste command, and sends that key code to the voice input device, which then performs the paste operation.
  • a receiving unit may be disposed in a voice input device such as a keyboard or a mouse for receiving a key code sent by the host by wired, wireless or Bluetooth.
  • Step 104 Convert the key code into an identification code that can be recognized by the host;
  • the key code sent by the host is converted into an identification code (such as an ASCII code) that the host can recognize.
  • when the voice input device is a keyboard, the existing decoding module in the keyboard can be used to convert the key code into an identification code that can be recognized by the host.
  • for other voice input devices, a decoding module can be added, and the newly added decoding module performs the conversion of the key code to the identification code.
  • Step 105 Send the identification code to the host.
  • the identification code is sent to the host, and the host receives the identification code and performs conversion processing to implement text character input.
  • when the voice input device is a keyboard, the existing communication module in the keyboard can be used to send the identification code to the host.
  • for other voice input devices, a communication module can be added, and the newly added communication module sends the identification code to the host.
  • the voice signal is obtained and sent to the host, the host processes the voice signal to generate a corresponding key code, and text input is implemented according to the key code without manually tapping the keyboard, which provides convenience for the user.
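Steps 102 to 105 on the voice input device side can be sketched as follows. This is only an illustration: the key-code numbering in `KEY_CODE_TO_ASCII`, the function names, and the use of plain callables to stand in for the wired, wireless, or Bluetooth link are assumptions, since the source does not define a concrete code table or transport.

```python
from typing import Callable, Dict, List

# Hypothetical table mapping key codes to identification codes (ASCII here).
# The numbering is illustrative only; the source does not define one.
KEY_CODE_TO_ASCII: Dict[int, int] = {0x04 + i: ord("a") + i for i in range(26)}


def decode(key_code: int) -> int:
    """Step 104: convert a key code into an identification code."""
    return KEY_CODE_TO_ASCII[key_code]


def device_side(
    voice_signal: bytes,
    send_voice_to_host: Callable[[bytes], List[int]],
    send_id_to_host: Callable[[int], None],
) -> List[int]:
    # Steps 102 and 103: send the voice signal and receive key codes back.
    key_codes = send_voice_to_host(voice_signal)
    # Step 104: convert each key code into an identification code.
    id_codes = [decode(code) for code in key_codes]
    # Step 105: send each identification code to the host.
    for code in id_codes:
        send_id_to_host(code)
    return id_codes
```

With a stub host that answers every voice signal with the key codes `[0x0B, 0x0C]`, the device would send the ASCII codes for "h" and "i" back to the host.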
  • the method includes:
  • Step 201 Acquire a voice signal.
  • Step 202 Send the voice signal to a host.
  • Step 203 Receive a voice signal sent by the voice input device.
  • the step of receiving the voice signal sent by the voice input device can be implemented by software on the basis of existing hardware.
  • Step 204 Convert the voice signal into a corresponding text character.
  • the speech signal is converted into text characters using speech recognition technology.
  • Step 205 Process the text characters to generate a key code.
  • the host may convert the text character into a key code corresponding to the text character, or the host directly pastes the text character into the clipboard, and then generates a key code corresponding to the paste command.
  • Step 206 Send the button code to the voice input device.
  • the step of transmitting the button code to the voice input device may be implemented by software on the basis of existing hardware.
  • Step 207 Receive a key code sent by the host.
  • Step 208 Convert the key code into an identification code that the host can recognize
  • Step 209 Send the identification code to the host.
  • Step 210 Receive an identification code sent by the voice input device to implement text character input.
  • text character input is implemented: the text character is entered into the current text of the host, where the current text is the text in which the cursor is positioned.
  • the text may be a Word file, a text file, a PPT file, or any other file that supports text input.
  • the steps 201, 202, 207, 208, and 209 can be performed on the voice input device side, and the steps 203, 204, 205, 206, and 210 can be performed on the host side.
  • the embodiment of the present invention converts the voice signal into a corresponding text character, converts the text character into a key code, and then implements text input according to the key code, so the user does not have to tap the keyboard manually.
  • the converting the voice signal into a corresponding text character comprises:
  • the voice signal may be Chinese, English, Japanese, etc.
  • the host may perform recognition according to the language category of the system: if the system language is Chinese, the voice signal is recognized according to Chinese rules; if the system language is English, it is recognized according to English rules.
  • the speech signal is recognized and converted into corresponding text characters according to a preset language category.
  • recognition based on the system language category does not suit the case where the system language is Chinese but the user wants to enter English. A language category setting can therefore be provided, and speech recognition is performed according to the language category that has been set. In this way, whether the user wants to enter Chinese, English, or Japanese, setting the corresponding language category is enough to obtain speech recognition in that language.
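The choice between the system language category and a preset language category can be sketched as a simple fallback. The function names, the language tags, and the dictionary of per-language recognizers are hypothetical; real speech recognition is replaced by stub callables.

```python
from typing import Callable, Dict, Optional

Recognizer = Callable[[bytes], str]


def select_language(system_language: str, preset_language: Optional[str] = None) -> str:
    """Prefer the preset language category; otherwise use the system language."""
    return preset_language if preset_language else system_language


def recognize(
    voice_signal: bytes,
    recognizers: Dict[str, Recognizer],
    system_language: str,
    preset_language: Optional[str] = None,
) -> str:
    # Pick the recognizer for the selected language and apply it to the signal.
    language = select_language(system_language, preset_language)
    return recognizers[language](voice_signal)
```

For example, on a host whose system language is Chinese, passing `preset_language="en"` would route the voice signal to the English recognizer instead of the Chinese one.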
  • an embodiment of the present invention further provides a method for converting a voice input into a text input, and the method may be used on a host side, and the method includes:
  • Step 301 Receive a voice signal sent by the voice input device.
  • the step of receiving the voice signal sent by the voice input device can be implemented by software on the basis of existing hardware.
  • Step 302 Convert the voice signal into a corresponding text character.
  • the speech signal is converted into text characters using speech recognition technology.
  • the voice signal may be recognized and converted into a corresponding text character according to a language category of the host system; or the voice signal may be recognized and converted into a corresponding text character according to a preset language category.
  • Step 303 Process the text characters to generate a key code.
  • the host may convert the text character into a key code corresponding to the text character, or the host directly pastes the text character into the clipboard, and then generates a key code corresponding to the paste command.
  • Step 304 Send the button code to the voice input device.
  • the step of transmitting the button code to the voice input device may be implemented by software on the basis of existing hardware.
  • Step 305 Receive an identification code sent by the voice input device to implement text character input.
  • text character input is implemented: the text character is entered into the current text of the host, where the current text is the text in which the cursor is positioned.
  • the text may be a Word file, a text file, a PPT file, or any other file that supports text input.
  • the embodiment of the present invention receives the voice signal sent by the voice input device, converts the voice signal into a corresponding text character, converts the text character into a key code, and then implements text input according to the key code without the user having to tap the keyboard manually, which provides convenience for the user.
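The host-side steps 301 to 305 can be sketched in the same spirit. The `Host` class, the injected `recognizer` callable, and the key-code numbering are assumptions made for illustration; the sketch handles only lowercase letters a to z and models the "current text" as a string.

```python
from typing import Callable, List

KEY_CODE_BASE = 0x04  # hypothetical numbering: "a" maps to 0x04, "b" to 0x05, ...


def char_to_key_code(ch: str) -> int:
    """Map a lowercase letter to its (hypothetical) key code."""
    if not ("a" <= ch <= "z"):
        raise ValueError(f"no key code defined for {ch!r} in this sketch")
    return KEY_CODE_BASE + (ord(ch) - ord("a"))


class Host:
    def __init__(self, recognizer: Callable[[bytes], str]) -> None:
        self.recognizer = recognizer
        self.current_text = ""  # stands in for the text at the cursor

    def handle_voice(self, voice_signal: bytes) -> List[int]:
        # Step 301: the voice signal has been received from the device.
        # Step 302: convert the voice signal into text characters.
        text = self.recognizer(voice_signal)
        # Step 303: process the text characters to generate key codes.
        # Step 304: the returned list is what would be sent to the device.
        return [char_to_key_code(ch) for ch in text]

    def handle_identification_code(self, id_code: int) -> None:
        # Step 305: an identification code (ASCII here) arrives from the
        # device and the matching character is entered into the current text.
        self.current_text += chr(id_code)
```

The recognizer is injected so that the flow can be exercised without a real speech engine.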
  • the embodiment of the present invention further provides a device for converting a voice input into a text input, and the device may be disposed in a voice input device, and the device includes:
  • a voice acquiring module 401 configured to acquire a voice signal
  • a voice signal sending module 402 configured to send the voice signal to a host
  • a key code receiving module 403, configured to receive a key code sent by the host
  • a decoding module 404 configured to convert the key code into an identification code that can be recognized by the host
  • the identification code sending module 405 is configured to send the identification code to the host.
  • the voice acquiring module 401 is configured to acquire a voice signal, which is then sent to the host by the voice signal sending module 402; the host processes the voice signal to generate a corresponding key code.
  • the host may convert the text character into a key code corresponding to the text character and then send the key code to the voice input device; alternatively, the host pastes the text character directly to the clipboard, generates a key code corresponding to the paste command, and sends that key code to the voice input device.
  • the key code sent by the host is received by the key code receiving module 403, and the key code is converted by the decoding module 404 into an identification code that can be recognized by the host.
  • the voice signal is obtained and sent to the host, the host processes the voice signal to generate a corresponding key code, and text input is implemented according to the key code without manually tapping the keyboard, which provides convenience for the user.
  • the device includes:
  • the voice acquiring module 501 is configured to acquire a voice signal.
  • a voice signal sending module 502 configured to send the voice signal to a host
  • the voice receiving module 503 is configured to receive a voice signal sent by the voice input device.
  • a voice recognition module 504 configured to convert the voice signal into a corresponding text character
  • the processing module 505 is configured to process the text characters to generate a key code.
  • a key code sending module 506, configured to send the key code to the voice input device
  • the key code receiving module 507 is configured to receive a key code sent by the host;
  • a decoding module 508 configured to convert the key code into an identification code that can be recognized by the host;
  • the identification code sending module 509 is configured to send the identification code to the host.
  • the identification code receiving module 510 is configured to receive the identification code sent by the voice input device, and implement text character input.
  • the voice acquiring module 501 is configured to acquire a voice signal, which is then sent to the host by the voice signal sending module 502; the host receives the voice signal sent by the voice input device through the voice receiving module 503 and converts it into a corresponding text character through the voice recognition module 504.
  • the text character is processed by the processing module 505 to generate a key code.
  • the key code is sent to the voice input device by the key code sending module 506.
  • the voice input device receives the key code sent by the host through the key code receiving module 507 and converts the key code into an identification code that the host can recognize through the decoding module 508.
  • the identification code is sent to the host by the identification code sending module 509, and the host receives the identification code sent by the voice input device through the identification code receiving module 510, thereby implementing text character input.
  • the voice acquiring module 501, the voice signal sending module 502, the key code receiving module 507, the decoding module 508, and the identification code sending module 509 may be disposed in the voice input device; the voice receiving module 503, the voice recognition module 504, the processing module 505, the key code sending module 506, and the identification code receiving module 510 can be disposed in the host.
  • the embodiment of the present invention converts the voice signal into a corresponding text character, converts the text character into a key code, and then implements text input according to the key code, so the user does not have to tap the keyboard manually.
  • the processing module 605 includes:
  • the first processing sub-module 6051 is configured to convert the text character into a key code corresponding to the text character.
  • the host converts the voice signal into a corresponding text character through the voice recognition module; the text character is then converted into a key code corresponding to the text character, and the key code is sent to the voice input device through the key code sending module.
  • the processing module 705 includes:
  • the second processing sub-module 7051 is configured to paste the text character to the clipboard, and generate a key code corresponding to the paste command.
  • the host converts the voice signal into a corresponding text character through the voice recognition module; the host then pastes the text character directly to the clipboard, generates a key code corresponding to the paste command, and sends that key code through the key code sending module to the voice input device, which performs the paste operation.
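The two processing sub-modules differ only in what the generated key codes stand for. The sketch below contrasts them; the `Clipboard` stand-in, the `PASTE_KEY_CODE` value, and the per-character numbering are hypothetical and chosen only to make the example self-contained.

```python
from typing import List

PASTE_KEY_CODE = 0xF0  # hypothetical single code standing for the paste command


class Clipboard:
    """Stand-in for the host clipboard (an assumption of this sketch)."""

    def __init__(self) -> None:
        self.content = ""


def process_per_character(text: str) -> List[int]:
    """First processing sub-module: one key code per text character."""
    codes = []
    for ch in text:
        if not ("a" <= ch <= "z"):
            raise ValueError(f"no key code defined for {ch!r} in this sketch")
        codes.append(0x04 + (ord(ch) - ord("a")))
    return codes


def process_via_paste(text: str, clipboard: Clipboard) -> List[int]:
    """Second processing sub-module: store the text, emit only a paste code."""
    clipboard.content = text
    return [PASTE_KEY_CODE]
```

In this sketch the per-character route is limited to the characters present in the code table, whereas the paste route carries any recognized text with a single key code.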
  • the voice recognition module includes:
  • a first voice recognition submodule configured to: convert the voice signal into a corresponding text character according to a language category of the host system;
  • the second voice recognition submodule is configured to convert the voice signal into a corresponding text character according to a preset language category.
  • an embodiment of the present invention further provides an apparatus for converting a voice input into a text input, the apparatus comprising:
  • the voice receiving module 801 is configured to receive a voice signal sent by the voice input device.
  • a voice recognition module 802 configured to convert the voice signal into a corresponding text character
  • the processing module 803 is configured to process the text characters to generate a key code.
  • a key code sending module 804 configured to send the key code to the voice input device
  • the identification code receiving module 805 is configured to receive the identification code sent by the voice input device, and implement text character input.
  • the embodiment of the present invention receives the voice signal sent by the voice input device, converts the voice signal into a corresponding text character, converts the text character into a key code, and then implements text input according to the key code without the user having to tap the keyboard manually, which provides convenience for the user.
  • the processing module includes:
  • the first processing submodule is configured to convert the text character into a key code corresponding to the text character.
  • the processing module includes:
  • the second processing submodule is configured to paste the text character to the clipboard, and generate a key code corresponding to the paste command.
  • the embodiment of the present invention further provides a voice input device, where the voice input device includes:
  • the sound signal receiving unit 901 is configured to receive a sound signal
  • the voice signal acquiring unit 902 is in communication with the sound signal receiving unit 901, and is configured to process the sound signal to obtain a voice signal;
  • the sending unit 903 is in communication connection with the voice signal acquiring unit 902, and configured to send the voice signal to the host;
  • the receiving unit 904 is configured to receive key code information from the host
  • the decoding unit 905 is connected to the receiving unit 904, and configured to convert the key code into an identification code that can be recognized by the host;
  • the communication unit 906 is connected to the decoding unit 905 and configured to send the identification code to the host.
  • the voice input device may be a keyboard with a voice function, a mouse, a microphone, an ipcamera (network camera), a camera microphone, etc.
  • the sound signal receiving unit may use a microphone to receive a sound signal
  • the function of the voice signal acquiring unit may be realized by adding a button on the voice input device or by using an existing key on the keyboard: recording starts when the button is pressed and stops when it is released, so that a voice signal is obtained.
  • the receiving unit and the communication unit may be communication modules using wired, wireless or Bluetooth technology.
  • when the voice input device is a keyboard, the decoding unit and the communication unit may use the existing decoding module and communication module of the keyboard; in the case of other voice input devices, a decoding unit and a communication unit having the same functions as the decoding module and communication module in the keyboard may be added, so as to convert the key code into the identification code and transmit the identification code to the host.
  • the voice input device provided by the embodiment of the present invention obtains a voice signal, sends the voice signal to the host, where it is processed to generate a corresponding key code, receives the key code sent by the host, and implements text input according to the key code without manually tapping the keyboard, which provides convenience for the user.
  • FIG. 10 is a schematic diagram showing the hardware structure of an electronic device 10 for the method of converting voice input into text input according to an embodiment of the present disclosure. As shown in FIG. 10, the electronic device 10 includes:
  • one or more processors 11 and a memory 12; one processor 11 is taken as an example in FIG. 10.
  • the processor 11 and the memory 12 can be connected by a bus or by other means; connection by a bus is taken as an example in FIG. 10.
  • the memory 12 is a non-volatile computer readable storage medium, and can be used for storing non-volatile software programs, non-volatile computer executable programs, and modules, such as the program instructions/modules corresponding to the method of converting voice input into text input in the embodiments of the present application (for example, the voice acquisition module 401, the voice signal transmission module 402, the key code reception module 403, the decoding module 404, and the identification code transmission module 405 shown in FIG. 4).
  • the processor 11 executes various functional applications and data processing of the server by running the non-volatile software programs, instructions, and modules stored in the memory 12, that is, it implements the method of converting voice input into text input of the above method embodiments.
  • the memory 12 can include a storage program area and a storage data area, wherein the storage program area can store an operating system and an application required for at least one function; the storage data area can store data created according to the use of the apparatus for converting voice input into text input, and the like. Further, the memory 12 may include a high speed random access memory, and may also include a nonvolatile memory such as at least one magnetic disk storage device, flash memory device, or other nonvolatile solid state storage device. In some embodiments, the memory 12 can optionally include memory remotely located relative to the processor 11, and such remote memory can be connected via a network to the apparatus for converting voice input into text input. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the one or more modules are stored in the memory 12 and, when executed by the one or more processors 11, perform the method of converting voice input into text input in any of the above method embodiments, for example, performing the method steps 301 through 305 of FIG. 3 described above to implement the functions of the modules 801-805 of FIG. 8.
  • The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
  • (1) Mobile communication devices: these devices are characterized by mobile communication functions, and their main goal is to provide voice and data communication. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones, and low-end phones.
  • (2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access. Such terminals include PDAs, MIDs, and UMPC devices, such as the iPad.
  • (3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (such as the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
  • (4) Servers: devices that provide computing services. A server consists of a processor, a hard disk, memory, a system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, the requirements on processing capability, stability, reliability, security, scalability, and manageability are higher.
  • (5) Other electronic apparatuses with data interaction functions, such as televisions.
  • An embodiment of the present application provides a non-volatile computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors, for example perform method steps 301 to 305 in FIG. 3 described above to implement the functions of modules 801-805 in FIG. 8.
  • The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • From the description of the above embodiments, a person of ordinary skill in the art can clearly understand that the embodiments can be implemented by software plus a general-purpose hardware platform, and of course also by hardware. A person of ordinary skill in the art can also understand that all or part of the processes of the above method embodiments can be completed by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Artificial Intelligence (AREA)
  • Input From Keyboards Or The Like (AREA)
  • Document Processing Apparatus (AREA)

Abstract

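A method and an apparatus for converting voice input into text input, and a voice input device. The method includes: acquiring a voice signal (101); sending the voice signal to a host (102); receiving a key code sent by the host (103); converting the key code into an identification code that the host can recognize (104); and sending the identification code to the host (105). In this method, a voice signal is acquired and sent to the host, the host processes the voice signal to generate a corresponding key code, and after the key code sent by the host is received, text input is performed according to the key code. No manual typing on a keyboard is needed, which is convenient for the user.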

Description

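Method and apparatus for converting voice input into text input, and voice input device
Technical Field
Embodiments of the present invention relate to the field of information processing technology, and in particular to a method and an apparatus for converting voice input into text input, and a voice input device.
Background
At present, on electronic devices such as computers and tablets, text is usually entered by typing on a keyboard. By operating principle, keyboards can be divided into encoding keyboards and non-encoding keyboards. In either kind, the functions are implemented by roughly the following modules: a key position scanning module, which scans the positions of the keys; a decoding module, which generates a key code from the key position and converts the key code into an identification code that the host can recognize; and a communication module, which transmits the identification code to the host (for example, a computer or a tablet) and receives control commands from the host. The host receives the identification code and converts it to perform text input.
In the course of implementing the present invention, the inventor found at least the following problems in the related art: text can only be entered by typing on the keyboard with both hands. People who do not know a language input method or who cannot easily use both hands are unable to enter text, and healthy people who do know an input method tire easily when typing with both hands for long periods.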
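Summary
The main technical problem addressed by the embodiments of the present invention is to provide a method and an apparatus for converting voice input into text input, and a voice input device, which convert voice into text input without manual typing on a keyboard and are therefore convenient for the user.
To solve the above technical problem, one technical solution adopted by the embodiments of the present invention is to provide a method for converting voice input into text input, the method including:
acquiring a voice signal;
sending the voice signal to a host;
receiving a key code sent by the host;
converting the key code into an identification code that the host can recognize; and
sending the identification code to the host.
Optionally, the key code includes a key code corresponding to a text character or a key code corresponding to a paste command.
Optionally, the method further includes:
receiving the voice signal sent by a voice input device;
converting the voice signal into corresponding text characters;
processing the text characters to generate a key code;
sending the key code to the voice input device; and
receiving the identification code sent by the voice input device to perform text character input.
Optionally, processing the text characters to generate a key code includes:
converting the text characters into key codes corresponding to the text characters.
Optionally, processing the text characters to generate a key code includes:
placing the text characters on the clipboard and generating a key code corresponding to a paste command.
Optionally, converting the voice signal into corresponding text characters includes:
recognizing the voice signal according to the language category of the host system and converting it into corresponding text characters; or
recognizing the voice signal according to a preset language category and converting it into corresponding text characters.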
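In a second aspect, an embodiment of the present invention provides a method for converting voice input into text input, the method including:
receiving a voice signal sent by a voice input device;
converting the voice signal into corresponding text characters;
processing the text characters to generate a key code;
sending the key code to the voice input device; and
receiving an identification code sent by the voice input device to perform text character input.
Optionally, processing the text characters to generate a key code includes:
converting the text characters into key codes corresponding to the text characters.
Optionally, processing the text characters to generate a key code includes:
placing the text characters on the clipboard and generating a key code corresponding to a paste command.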
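In a third aspect, an embodiment of the present invention further provides an apparatus for converting voice input into text input, the apparatus including:
a voice acquisition module, configured to acquire a voice signal;
a voice signal sending module, configured to send the voice signal to a host;
a key code receiving module, configured to receive a key code sent by the host;
a decoding module, configured to convert the key code into an identification code that the host can recognize; and
an identification code sending module, configured to send the identification code to the host.
Optionally, the key code includes a key code corresponding to a text character or a key code corresponding to a paste command.
Optionally, the apparatus further includes:
a voice receiving module, configured to receive the voice signal sent by a voice input device;
a voice recognition module, configured to convert the voice signal into corresponding text characters;
a processing module, configured to process the text characters to generate a key code;
a key code sending module, configured to send the key code to the voice input device; and
an identification code receiving module, configured to receive the identification code sent by the voice input device to perform text character input.
Optionally, the processing module includes:
a first processing submodule, configured to convert the text characters into key codes corresponding to the text characters.
Optionally, the processing module includes:
a second processing submodule, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.
Optionally, the voice recognition module includes:
a first voice recognition submodule, configured to recognize the voice signal according to the language category of the host system and convert it into corresponding text characters; or
a second voice recognition submodule, configured to recognize the voice signal according to a preset language category and convert it into corresponding text characters.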
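In a fourth aspect, an embodiment of the present invention further provides an apparatus for converting voice input into text input, the apparatus including:
a voice receiving module, configured to receive a voice signal sent by a voice input device;
a voice recognition module, configured to convert the voice signal into corresponding text characters;
a processing module, configured to process the text characters to generate a key code;
a key code sending module, configured to send the key code to the voice input device; and
an identification code receiving module, configured to receive an identification code sent by the voice input device to perform text character input.
Optionally, the processing module includes:
a first processing submodule, configured to convert the text characters into key codes corresponding to the text characters.
Optionally, the processing module includes:
a second processing submodule, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.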
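In a fifth aspect, an embodiment of the present invention further provides a voice input device, the voice input device including:
a sound signal receiving unit, configured to receive a sound signal;
a voice signal acquiring unit, communicatively connected to the sound signal receiving unit and configured to process the sound signal to acquire a voice signal;
a sending unit, communicatively connected to the voice signal acquiring unit and configured to send the voice signal to a host; and
a receiving unit, configured to receive key code information from the host.
In a sixth aspect, an embodiment of the present invention further provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method described above.
In a seventh aspect, an embodiment of the present invention further provides a non-volatile computer-readable storage medium storing computer-executable instructions which, when executed by an electronic device, cause the electronic device to perform the method described above.
In an eighth aspect, an embodiment of the present invention further provides a computer program product including a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions which, when executed by an electronic device, cause the electronic device to perform the method described above.
The beneficial effect of the embodiments of the present invention is as follows. Unlike the prior art, the embodiments acquire a voice signal and send it to the host, the host processes the voice signal to generate a corresponding key code, and after the key code sent by the host is received, text input is performed according to the key code. No manual typing on a keyboard is needed, which is convenient for the user.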
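Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the method for converting voice input into text input of the present invention;
FIG. 2 is a flowchart of an embodiment of the method for converting voice input into text input of the present invention;
FIG. 3 is a flowchart of an embodiment of the method for converting voice input into text input of the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input of the present invention;
FIG. 6 is a schematic structural diagram of the processing module in an embodiment of the apparatus for converting voice input into text input of the present invention;
FIG. 7 is a schematic structural diagram of the processing module in an embodiment of the apparatus for converting voice input into text input of the present invention;
FIG. 8 is a schematic structural diagram of an embodiment of the apparatus for converting voice input into text input of the present invention;
FIG. 9 is a schematic structural diagram of an embodiment of the voice input device of the present invention;
FIG. 10 is a schematic diagram of the hardware structure of an electronic device for the method for converting voice input into text input provided by an embodiment of the present invention.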
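Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The method and apparatus for converting voice input into text input provided by the embodiments of the present invention can be used where a voice signal is input through a voice input device and text input is then performed on an electronic device. The voice input device and the electronic device may be connected by wire, wirelessly, or via Bluetooth. The electronic device (the host described below) may be a device running a system such as Android, Windows, or Mac, for example a computer, a television, a media player, an OTT box, a mobile phone, a tablet, or an all-in-one machine. The voice input device may be a keyboard with a voice function, a mouse, a Bluetooth headset, a microphone, an ipcamera (network camera), a camera microphone, or the like.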
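As shown in FIG. 1, an embodiment of the present invention provides a method for converting voice input into text input. The method can be used on the voice input device side and includes:
Step 101: acquire a voice signal.
In practice, a microphone can be provided on the voice input device, for example a keyboard, to receive the sound signal, together with an added button or an existing key of the keyboard: recording starts when the key is pressed and stops when it is released, which yields a segment of voice signal. When the voice input device is a mouse, a microphone can be provided on the mouse to receive the sound signal, together with an added button or an existing button of the mouse, and recording is started and stopped in the same way.
Step 102: send the voice signal to the host.
The host is an electronic device such as a computer or a tablet. In practice, a sending unit can be provided in the voice input device, for example a keyboard or a mouse, to send the voice signal to the host by wire, wirelessly, or via Bluetooth.
Step 103: receive the key code sent by the host.
After receiving the voice signal sent from the voice input device side, the host converts the voice signal into corresponding text characters. Optionally, the host can convert the text characters into the key codes corresponding to them and send those key codes to the voice input device; or the host can place the text characters directly on the clipboard, generate the key code corresponding to a paste command, and send that key code to the voice input device, so that the voice input device carries out the paste operation. In practice, a receiving unit can be provided in the voice input device, such as a keyboard or a mouse, to receive the key code sent by the host by wire, wirelessly, or via Bluetooth.
Step 104: convert the key code into an identification code that the host can recognize.
The key code sent by the host is converted into an identification code that the host can recognize (for example, an ASCII code). In practice, when the voice input device is a keyboard, the decoding module already present in the keyboard can be used to convert the key code into the identification code; for other voice input devices, a decoding module can be added to perform the conversion from key code to identification code.
Step 105: send the identification code to the host.
The identification code is sent to the host, and the host receives the identification code and converts it to perform text character input. In practice, when the voice input device is a keyboard, the communication module already present in the keyboard can be used to send the identification code to the host; for other voice input devices, a communication module can be added to send the identification code to the host.
In this embodiment of the present invention, a voice signal is acquired and sent to the host, the host processes the voice signal to generate a corresponding key code, and after the key code sent by the host is received, text input is performed according to the key code. No manual typing on a keyboard is needed, which is convenient for the user.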
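The device-side steps 101 to 105 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the source does not define the key-code format or the transport, so the integer key codes, the `KEY_CODE_TO_ASCII` table, and the `FakeHost` stand-in are all assumptions introduced for the example.

```python
from typing import Callable, List

# Hypothetical key-code table: small integers stand in for key codes,
# and ASCII values stand in for identification codes (assumption).
KEY_CODE_TO_ASCII = {1: ord("h"), 2: ord("i")}


def decode(key_codes: List[int]) -> List[int]:
    """Step 104: convert key codes into identification codes (ASCII here)."""
    return [KEY_CODE_TO_ASCII[code] for code in key_codes]


def device_round_trip(
    record: Callable[[], bytes],
    send_voice: Callable[[bytes], None],
    receive_key_codes: Callable[[], List[int]],
    send_identification: Callable[[List[int]], None],
) -> None:
    voice = record()                     # step 101: acquire the voice signal
    send_voice(voice)                    # step 102: send it to the host
    key_codes = receive_key_codes()      # step 103: receive key codes from the host
    identification = decode(key_codes)   # step 104: key codes -> identification codes
    send_identification(identification)  # step 105: send them back to the host


class FakeHost:
    """Stand-in for the host so the sketch runs without hardware."""

    def __init__(self) -> None:
        self.voice = b""
        self.typed = ""

    def on_voice(self, voice: bytes) -> None:
        self.voice = voice

    def key_codes(self) -> List[int]:
        # Pretend speech recognition produced the text "hi".
        return [1, 2]

    def on_identification(self, codes: List[int]) -> None:
        self.typed += "".join(chr(c) for c in codes)


def device_demo() -> str:
    host = FakeHost()
    device_round_trip(lambda: b"\x00\x01", host.on_voice,
                      host.key_codes, host.on_identification)
    return host.typed
```

Passing the four transport operations in as callables keeps the sketch independent of whether the link is wired, wireless, or Bluetooth.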
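FIG. 2 is a flowchart of an embodiment of the method. In this embodiment, the method includes:
Step 201: acquire a voice signal.
Step 202: send the voice signal to the host.
Step 203: receive the voice signal sent by the voice input device.
The step of receiving the voice signal sent by the voice input device can be implemented in software on existing hardware.
Step 204: convert the voice signal into corresponding text characters.
The voice signal is converted into text characters using speech recognition technology.
Step 205: process the text characters to generate a key code.
Optionally, the host can convert the text characters into the key codes corresponding to them, or the host can place the text characters directly on the clipboard and then generate the key code corresponding to a paste command.
Step 206: send the key code to the voice input device.
The step of sending the key code to the voice input device can be implemented in software on existing hardware.
Step 207: receive the key code sent by the host.
Step 208: convert the key code into an identification code that the host can recognize.
Step 209: send the identification code to the host.
Step 210: receive the identification code sent by the voice input device to perform text character input.
Performing text character input may mean entering the text characters into the current text on the host, where the current text is the text in which the cursor is positioned. The text may be any file that accepts text input, such as a Word file, a plain text file, or a PPT file.
Steps 201, 202, 207, 208, and 209 can be executed on the voice input device side, and steps 203, 204, 205, 206, and 210 can be executed on the host side.
Compared with the prior art, which scans key positions and generates key codes from the key positions, this embodiment of the present invention converts the voice signal into corresponding text characters, converts the text characters into key codes, and then performs text input according to the key codes. No manual typing on a keyboard is needed, which is convenient for the user.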
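The split of steps 201 to 210 between the two sides implies three transmissions over the link. The short sketch below derives them from the step assignment given in the text; the function names are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

# Step assignment as stated in the description of FIG. 2.
DEVICE_STEPS = {201, 202, 207, 208, 209}
HOST_STEPS = {203, 204, 205, 206, 210}


def side_of(step: int) -> str:
    """Return which side executes a given step."""
    if step in DEVICE_STEPS:
        return "device"
    if step in HOST_STEPS:
        return "host"
    raise ValueError(f"unknown step: {step}")


def handoffs(steps: List[int]) -> List[Tuple[int, int]]:
    """Return the consecutive step pairs where control crosses the link."""
    return [(a, b) for a, b in zip(steps, steps[1:]) if side_of(a) != side_of(b)]


ALL_STEPS = list(range(201, 211))
CROSSINGS = handoffs(ALL_STEPS)
```

The three crossings correspond to the voice signal going to the host (202 to 203), the key code going back to the device (206 to 207), and the identification code returning to the host (209 to 210).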
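In some embodiments of the method, converting the voice signal into corresponding text characters includes:
recognizing the voice signal according to the language category of the host system and converting it into corresponding text characters.
That is, the voice signal may be in Chinese, English, Japanese, or another language, and the host can perform speech recognition according to the system language: if the system language is Chinese, the voice signal is recognized according to Chinese rules; if the system language is English, it is recognized according to English rules.
Or,
recognizing the voice signal according to a preset language category and converting it into corresponding text characters.
Recognition based on the system language is unsuitable when the system language is Chinese but the user wants to enter English. A language setting entry can therefore be provided, and speech recognition is performed according to the language category that has been set. In this way, whether the user wants to enter Chinese, English, or Japanese, setting the corresponding language category enables speech recognition in that language.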
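One way to combine the two options is to let a preset language override the system language when it is set. The sketch below shows that rule only; the language codes, the `SUPPORTED` set, and the function name are assumptions, and the patent does not prescribe how the two options interact.

```python
from typing import Optional

# Illustrative set of languages; the source mentions Chinese, English and Japanese.
SUPPORTED = {"zh", "en", "ja"}


def recognition_language(system_language: str,
                         preset_language: Optional[str] = None) -> str:
    """Pick the language used for speech recognition.

    A preset language, when given, takes precedence over the system language.
    """
    chosen = preset_language if preset_language is not None else system_language
    if chosen not in SUPPORTED:
        raise ValueError(f"unsupported language: {chosen}")
    return chosen
```

With a Chinese system and no preset, `recognition_language("zh")` selects Chinese; setting the preset to English covers the case the text describes, where the system language is Chinese but English is to be entered.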
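Correspondingly, as shown in FIG. 3, an embodiment of the present invention further provides a method for converting voice input into text input. The method can be used on the host side and includes:
Step 301: receive the voice signal sent by the voice input device.
The step of receiving the voice signal sent by the voice input device can be implemented in software on existing hardware.
Step 302: convert the voice signal into corresponding text characters.
The voice signal is converted into text characters using speech recognition technology. Optionally, in practice, the voice signal can be recognized and converted into corresponding text characters according to the language category of the host system, or according to a preset language category.
Step 303: process the text characters to generate a key code.
Optionally, the host can convert the text characters into the key codes corresponding to them, or the host can place the text characters directly on the clipboard and then generate the key code corresponding to a paste command.
Step 304: send the key code to the voice input device.
The step of sending the key code to the voice input device can be implemented in software on existing hardware.
Step 305: receive the identification code sent by the voice input device to perform text character input.
Performing text character input may mean entering the text characters into the current text on the host, where the current text is the text in which the cursor is positioned. The text may be any file that accepts text input, such as a Word file, a plain text file, or a PPT file.
Compared with the prior art, which scans key positions and generates key codes from the key positions, this embodiment of the present invention receives the voice signal sent by the voice input device, converts the voice signal into corresponding text characters, converts the text characters into key codes, and then performs text input according to the key codes. No manual typing on a keyboard is needed, which is convenient for the user.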
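The host-side steps 301 to 305 can be sketched as a small class. Everything here is hypothetical scaffolding: the recognizer is a stub, the key codes are ASCII values shifted by an arbitrary offset, and the "current text" is a plain string standing in for the document at the cursor.

```python
from typing import Callable, List


class Host:
    """Host-side sketch of steps 301 to 305; all names are assumptions."""

    def __init__(self,
                 recognize: Callable[[bytes], str],
                 to_key_codes: Callable[[str], List[int]]) -> None:
        self.recognize = recognize
        self.to_key_codes = to_key_codes
        self.document = ""  # stands in for the text at the cursor position

    def on_voice(self, voice: bytes) -> List[int]:
        """Steps 301 to 304: receive voice, recognize it, build key codes."""
        text = self.recognize(voice)    # step 302
        return self.to_key_codes(text)  # step 303; the result is sent in step 304

    def on_identification(self, codes: List[int]) -> None:
        """Step 305: turn identification codes into typed text."""
        self.document += "".join(chr(c) for c in codes)


OFFSET = 1000  # arbitrary offset so key codes differ from identification codes


def host_demo() -> str:
    host = Host(recognize=lambda voice: "ok",  # stub recognizer
                to_key_codes=lambda text: [ord(ch) + OFFSET for ch in text])
    key_codes = host.on_voice(b"\x01\x02")
    # The voice input device would decode the key codes; simulated here.
    host.on_identification([code - OFFSET for code in key_codes])
    return host.document
```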
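Correspondingly, as shown in FIG. 4, an embodiment of the present invention further provides an apparatus for converting voice input into text input. The apparatus can be arranged in a voice input device and includes:
a voice acquisition module 401, configured to acquire a voice signal;
a voice signal sending module 402, configured to send the voice signal to the host;
a key code receiving module 403, configured to receive the key code sent by the host;
a decoding module 404, configured to convert the key code into an identification code that the host can recognize; and
an identification code sending module 405, configured to send the identification code to the host.
The voice acquisition module 401 acquires a voice signal, and the voice signal sending module 402 sends it to the host. The host processes the voice signal to generate a corresponding key code. Optionally, the host can convert the recognized text characters into the key codes corresponding to them and send those key codes to the voice input device, or the host can place the text characters directly on the clipboard, generate the key code corresponding to a paste command, and send that key code to the voice input device. The key code receiving module 403 receives the key code sent by the host, the decoding module 404 converts the key code into an identification code that the host can recognize, and the identification code sending module 405 sends the identification code to the host.
In this embodiment of the present invention, a voice signal is acquired and sent to the host, the host processes the voice signal to generate a corresponding key code, and after the key code sent by the host is received, text input is performed according to the key code. No manual typing on a keyboard is needed, which is convenient for the user.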
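FIG. 5 is a schematic structural diagram of an embodiment of the apparatus. In this embodiment, the apparatus includes:
a voice acquisition module 501, configured to acquire a voice signal;
a voice signal sending module 502, configured to send the voice signal to the host;
a voice receiving module 503, configured to receive the voice signal sent by the voice input device;
a voice recognition module 504, configured to convert the voice signal into corresponding text characters;
a processing module 505, configured to process the text characters to generate a key code;
a key code sending module 506, configured to send the key code to the voice input device;
a key code receiving module 507, configured to receive the key code sent by the host;
a decoding module 508, configured to convert the key code into an identification code that the host can recognize;
an identification code sending module 509, configured to send the identification code to the host; and
an identification code receiving module 510, configured to receive the identification code sent by the voice input device to perform text character input.
The voice acquisition module 501 acquires a voice signal, and the voice signal sending module 502 sends it to the host. The host receives the voice signal through the voice receiving module 503, converts it into corresponding text characters through the voice recognition module 504, and processes the text characters through the processing module 505 to generate a key code. The key code sending module 506 sends the key code to the voice input device. The voice input device receives the key code through the key code receiving module 507, converts it into an identification code that the host can recognize through the decoding module 508, and sends the identification code to the host through the identification code sending module 509. The host receives the identification code through the identification code receiving module 510 and performs text character input.
The voice acquisition module 501, the voice signal sending module 502, the key code receiving module 507, the decoding module 508, and the identification code sending module 509 can be arranged in the voice input device; the voice receiving module 503, the voice recognition module 504, the processing module 505, the key code sending module 506, and the identification code receiving module 510 can be arranged in the host.
Compared with the prior art, which scans key positions and generates key codes from the key positions, this embodiment of the present invention converts the voice signal into corresponding text characters, converts the text characters into key codes, and then performs text input according to the key codes. No manual typing on a keyboard is needed, which is convenient for the user.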
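Optionally, as shown in FIG. 6, in some embodiments of the apparatus, the processing module 605 includes:
a first processing submodule 6051, configured to convert the text characters into key codes corresponding to the text characters.
That is, after the host converts the voice signal into corresponding text characters through the voice recognition module, it converts the text characters into the key codes corresponding to them and then sends those key codes to the voice input device through the key code sending module.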
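A character-to-key-code conversion of this kind might look like the sketch below. The patent does not specify a key-code format; as an assumption, the example uses USB HID keyboard usage IDs paired with a modifier byte, and covers only letters, digits, and the space character.

```python
from typing import List, Tuple

LEFT_SHIFT = 0x02  # left-shift bit of the HID keyboard modifier byte

KeyCode = Tuple[int, int]  # (modifier, usage ID); representation is an assumption


def char_to_key_code(ch: str) -> KeyCode:
    """Map one character to a (modifier, HID usage ID) pair."""
    if "a" <= ch <= "z":
        return (0, 0x04 + ord(ch) - ord("a"))           # 'a' is usage 0x04
    if "A" <= ch <= "Z":
        return (LEFT_SHIFT, 0x04 + ord(ch) - ord("A"))  # same key, with shift
    if "1" <= ch <= "9":
        return (0, 0x1E + ord(ch) - ord("1"))           # '1' is usage 0x1E
    if ch == "0":
        return (0, 0x27)
    if ch == " ":
        return (0, 0x2C)
    raise ValueError(f"no key code for character: {ch!r}")


def text_to_key_codes(text: str) -> List[KeyCode]:
    """Convert recognized text into a list of key codes, one per character."""
    return [char_to_key_code(ch) for ch in text]
```

A table like this only covers characters that have a key on the keyboard. Text such as Chinese characters raises `ValueError` here, which is presumably one situation where the clipboard-based alternative is the better fit.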
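Optionally, as shown in FIG. 7, in some embodiments of the apparatus, the processing module 705 includes:
a second processing submodule 7051, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.
That is, after the host converts the voice signal into corresponding text characters through the voice recognition module, the host places the text characters directly on the clipboard, generates the key code corresponding to a paste command, and sends that key code to the voice input device through the key code sending module, so that the voice input device carries out the paste operation.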
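The clipboard route can be sketched as follows. The `Clipboard` class is an in-memory stand-in, not a real system clipboard API, and representing the paste command as Ctrl+V in HID terms (left-control modifier plus the usage ID of the V key) is an assumption; the patent only says that a key code corresponding to a paste command is generated.

```python
from typing import List, Tuple

LEFT_CTRL = 0x01  # left-control bit of the HID keyboard modifier byte
KEY_V = 0x19      # HID usage ID of the V key
PASTE_KEY_CODE: Tuple[int, int] = (LEFT_CTRL, KEY_V)


class Clipboard:
    """In-memory stand-in for the host clipboard (hypothetical)."""

    def __init__(self) -> None:
        self.content = ""


def prepare_paste(text: str, clipboard: Clipboard) -> Tuple[int, int]:
    """Host side: store the recognized text and return the paste key code."""
    clipboard.content = text
    return PASTE_KEY_CODE


def apply_key_code(key_code: Tuple[int, int], clipboard: Clipboard,
                   document: List[str]) -> None:
    """Host side: when the paste key code comes back, insert the clipboard text."""
    if key_code == PASTE_KEY_CODE:
        document.append(clipboard.content)


def paste_demo() -> str:
    clipboard = Clipboard()
    document: List[str] = []
    key_code = prepare_paste("你好", clipboard)  # text with no per-character key
    apply_key_code(key_code, clipboard, document)  # after the device echoes it back
    return "".join(document)
```

Only one key code crosses the link regardless of the length of the text, and the text itself never has to be expressible as individual keys.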
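Optionally, in some embodiments of the apparatus, the voice recognition module includes:
a first voice recognition submodule, configured to recognize the voice signal according to the language category of the host system and convert it into corresponding text characters; or
a second voice recognition submodule, configured to recognize the voice signal according to a preset language category and convert it into corresponding text characters.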
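Correspondingly, as shown in FIG. 8, an embodiment of the present invention further provides an apparatus for converting voice input into text input, the apparatus including:
a voice receiving module 801, configured to receive the voice signal sent by the voice input device;
a voice recognition module 802, configured to convert the voice signal into corresponding text characters;
a processing module 803, configured to process the text characters to generate a key code;
a key code sending module 804, configured to send the key code to the voice input device; and
an identification code receiving module 805, configured to receive the identification code sent by the voice input device to perform text character input.
Compared with the prior art, which scans key positions and generates key codes from the key positions, this embodiment of the present invention receives the voice signal sent by the voice input device, converts the voice signal into corresponding text characters, converts the text characters into key codes, and then performs text input according to the key codes. No manual typing on a keyboard is needed, which is convenient for the user.
Optionally, in some embodiments of the apparatus, the processing module includes:
a first processing submodule, configured to convert the text characters into key codes corresponding to the text characters.
Optionally, in some embodiments of the apparatus, the processing module includes:
a second processing submodule, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.
It should be noted that the apparatus embodiments and the method embodiments of the present invention are based on the same inventive concept, and the technical content of the method embodiments also applies to the apparatus embodiments. Technical content in the apparatus embodiments that is the same as in the method embodiments is therefore not repeated here.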
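Correspondingly, as shown in FIG. 9, an embodiment of the present invention further provides a voice input device, the voice input device including:
a sound signal receiving unit 901, configured to receive a sound signal;
a voice signal acquiring unit 902, communicatively connected to the sound signal receiving unit 901 and configured to process the sound signal to acquire a voice signal;
a sending unit 903, communicatively connected to the voice signal acquiring unit 902 and configured to send the voice signal to the host;
a receiving unit 904, configured to receive key code information from the host;
a decoding unit 905, connected to the receiving unit 904 and configured to convert the key code into an identification code that the host can recognize; and
a communication unit 906, connected to the decoding unit 905 and configured to send the identification code to the host.
The voice input device may be a keyboard with a voice function, a mouse, a microphone, an ipcamera (network camera), a camera microphone, or the like. The sound signal receiving unit may use a microphone to receive the sound signal. The function of the voice signal acquiring unit may be realized by adding a button to the voice input device or by using an existing key on the keyboard: recording starts when the key is pressed and stops when it is released, which yields a segment of voice signal. The receiving unit and the communication unit may be communication modules using wired, wireless, or Bluetooth technology. When the voice input device is a keyboard, the decoding unit and the communication unit may use the keyboard's existing decoding module and communication module. For other voice input devices, a decoding unit and a communication unit with the same functions as the keyboard's decoding module and communication module may be added to convert the key code into the identification code and send the identification code to the host.
The voice input device provided by this embodiment of the present invention acquires a voice signal and sends it to the host, the host processes the voice signal to generate a corresponding key code, and after the key code sent by the host is received, text input is performed according to the key code. No manual typing on a keyboard is needed, which is convenient for the user.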
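The press-to-record behavior of the voice signal acquiring unit can be sketched as a small state machine. The class and method names are hypothetical, and audio frames are represented as raw bytes; a real device would feed frames from its microphone driver.

```python
from typing import List


class PushToTalkRecorder:
    """Collects audio frames only while the record key is held down."""

    def __init__(self) -> None:
        self._recording = False
        self._frames: List[bytes] = []

    def press(self) -> None:
        """Key pressed: start a new recording."""
        self._recording = True
        self._frames = []

    def feed(self, frame: bytes) -> None:
        """Microphone delivered a frame; keep it only while recording."""
        if self._recording:
            self._frames.append(frame)

    def release(self) -> bytes:
        """Key released: stop recording and return the captured voice signal."""
        self._recording = False
        return b"".join(self._frames)
```

Frames that arrive before the key is pressed or after it is released are dropped, so each press-and-release pair yields exactly one segment of voice signal.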
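FIG. 10 is a schematic diagram of the hardware structure of an electronic device 10 for the method for converting voice input into text input provided by an embodiment of the present application. As shown in FIG. 10, the electronic device 10 includes:
one or more processors 11 and a memory 12; one processor 11 is taken as an example in FIG. 10.
The processor 11 and the memory 12 may be connected by a bus or by other means; a bus connection is taken as an example in FIG. 10.
The memory 12, as a non-volatile computer-readable storage medium, can store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for converting voice input into text input in the embodiments of the present application (for example, the voice acquisition module 401, the voice signal sending module 402, the key code receiving module 403, the decoding module 404, and the identification code sending module 405 shown in FIG. 4). The processor 11 runs the non-volatile software programs, instructions, and modules stored in the memory 12 to execute the various functional applications and data processing of the server, that is, to implement the method for converting voice input into text input of the above method embodiments.
The memory 12 may include a program storage area and a data storage area. The program storage area may store an operating system and the application required for at least one function; the data storage area may store data created through use of the apparatus for converting voice input into text input. In addition, the memory 12 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 12 may optionally include memory located remotely from the processor 11, and such remote memory may be connected over a network to the apparatus for converting voice input into text input. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 12 and, when executed by the one or more processors 11, perform the method for converting voice input into text input in any of the above method embodiments, for example, performing method steps 301 to 305 in FIG. 3 described above to implement the functions of modules 801-805 in FIG. 8.
The above product can perform the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.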
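The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: these devices are characterized by mobile communication functions, and their main goal is to provide voice and data communication. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access. Such terminals include PDAs, MIDs, and UMPC devices, such as the iPad.
(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (such as the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
(4) Servers: devices that provide computing services. A server consists of a processor, a hard disk, memory, a system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, the requirements on processing capability, stability, reliability, security, scalability, and manageability are higher.
(5) Other electronic apparatuses with data interaction functions, such as televisions.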
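An embodiment of the present application provides a non-volatile computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors, for example perform method steps 301 to 305 in FIG. 3 described above to implement the functions of modules 801-805 in FIG. 8.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
From the description of the above embodiments, a person of ordinary skill in the art can clearly understand that the embodiments can be implemented by software plus a general-purpose hardware platform, and of course also by hardware. A person of ordinary skill in the art can also understand that all or part of the processes of the above method embodiments can be completed by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Within the idea of the present application, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be implemented in any order, and many other variations of the different aspects of the present application as described above exist, which are not provided in detail for the sake of brevity. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.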

Claims (22)

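  1. A method for converting voice input into text input, comprising:
    acquiring a voice signal;
    sending the voice signal to a host;
    receiving a key code sent by the host;
    converting the key code into an identification code that the host can recognize; and
    sending the identification code to the host.
  2. The method according to claim 1, wherein the key code comprises a key code corresponding to a text character or a key code corresponding to a paste command.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    receiving the voice signal sent by a voice input device;
    converting the voice signal into corresponding text characters;
    processing the text characters to generate a key code;
    sending the key code to the voice input device; and
    receiving the identification code sent by the voice input device to perform text character input.
  4. The method according to claim 3, wherein processing the text characters to generate a key code comprises:
    converting the text characters into key codes corresponding to the text characters.
  5. The method according to claim 3, wherein processing the text characters to generate a key code comprises:
    placing the text characters on the clipboard and generating a key code corresponding to a paste command.
  6. The method according to claim 3, wherein converting the voice signal into corresponding text characters comprises:
    recognizing the voice signal according to the language category of the host system and converting it into corresponding text characters; or
    recognizing the voice signal according to a preset language category and converting it into corresponding text characters.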
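  7. A method for converting voice input into text input, comprising:
    receiving a voice signal sent by a voice input device;
    converting the voice signal into corresponding text characters;
    processing the text characters to generate a key code;
    sending the key code to the voice input device; and
    receiving an identification code sent by the voice input device to perform text character input.
  8. The method according to claim 7, wherein processing the text characters to generate a key code comprises:
    converting the text characters into key codes corresponding to the text characters.
  9. The method according to claim 7, wherein processing the text characters to generate a key code comprises:
    placing the text characters on the clipboard and generating a key code corresponding to a paste command.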
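  10. An apparatus for converting voice input into text input, comprising:
    a voice acquisition module, configured to acquire a voice signal;
    a voice signal sending module, configured to send the voice signal to a host;
    a key code receiving module, configured to receive a key code sent by the host;
    a decoding module, configured to convert the key code into an identification code that the host can recognize; and
    an identification code sending module, configured to send the identification code to the host.
  11. The apparatus according to claim 10, wherein the key code comprises a key code corresponding to a text character or a key code corresponding to a paste command.
  12. The apparatus according to claim 10 or 11, wherein the apparatus further comprises:
    a voice receiving module, configured to receive the voice signal sent by a voice input device;
    a voice recognition module, configured to convert the voice signal into corresponding text characters;
    a processing module, configured to process the text characters to generate a key code;
    a key code sending module, configured to send the key code to the voice input device; and
    an identification code receiving module, configured to receive the identification code sent by the voice input device to perform text character input.
  13. The apparatus according to claim 12, wherein the processing module comprises:
    a first processing submodule, configured to convert the text characters into key codes corresponding to the text characters.
  14. The apparatus according to claim 12, wherein the processing module comprises:
    a second processing submodule, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.
  15. The apparatus according to claim 12, wherein the voice recognition module comprises:
    a first voice recognition submodule, configured to recognize the voice signal according to the language category of the host system and convert it into corresponding text characters; or
    a second voice recognition submodule, configured to recognize the voice signal according to a preset language category and convert it into corresponding text characters.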
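  16. An apparatus for converting voice input into text input, comprising:
    a voice receiving module, configured to receive a voice signal sent by a voice input device;
    a voice recognition module, configured to convert the voice signal into corresponding text characters;
    a processing module, configured to process the text characters to generate a key code;
    a key code sending module, configured to send the key code to the voice input device; and
    an identification code receiving module, configured to receive an identification code sent by the voice input device to perform text character input.
  17. The apparatus according to claim 16, wherein the processing module comprises:
    a first processing submodule, configured to convert the text characters into key codes corresponding to the text characters.
  18. The apparatus according to claim 16, wherein the processing module comprises:
    a second processing submodule, configured to place the text characters on the clipboard and generate a key code corresponding to a paste command.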
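  19. A voice input device, wherein the voice input device comprises:
    a sound signal receiving unit, configured to receive a sound signal;
    a voice signal acquiring unit, communicatively connected to the sound signal receiving unit and configured to process the sound signal to acquire a voice signal;
    a sending unit, communicatively connected to the voice signal acquiring unit and configured to send the voice signal to a host;
    a receiving unit, configured to receive key code information from the host;
    a decoding unit, connected to the receiving unit and configured to convert the key code into an identification code that the host can recognize; and
    a communication unit, connected to the decoding unit and configured to send the identification code to the host.
  20. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method according to any one of claims 7 to 9.
  21. A non-volatile computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions which, when executed by an electronic device, cause the electronic device to perform the method according to any one of claims 7 to 9.
  22. A computer program product, wherein the computer program product comprises a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by an electronic device, cause the electronic device to perform the method according to any one of claims 7 to 9.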
PCT/CN2017/120424 2017-01-19 2017-12-31 Method and apparatus for converting voice input into text input, and voice input device WO2018133656A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/507,051 US11087758B2 (en) 2017-01-19 2019-07-10 Method and voice input apparatus for converting voice input to text input

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710038976.1 2017-01-19
CN201710038976.1A CN106896933B (zh) 2017-01-19 2017-01-19 Method and apparatus for converting voice input into text input, and voice input device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/507,051 Continuation US11087758B2 (en) 2017-01-19 2019-07-10 Method and voice input apparatus for converting voice input to text input

Publications (1)

Publication Number Publication Date
WO2018133656A1 true WO2018133656A1 (zh) 2018-07-26

Family

ID=59197998

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/120424 WO2018133656A1 (zh) 2017-01-19 2017-12-31 Method and apparatus for converting voice input into text input, and voice input device

Country Status (3)

Country Link
US (1) US11087758B2 (zh)
CN (1) CN106896933B (zh)
WO (1) WO2018133656A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
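CN106896933B (zh) * 2017-01-19 2019-12-06 深圳情景智能有限公司 Method and apparatus for converting voice input into text input, and voice input device
CN109256116A (zh) * 2018-09-27 2019-01-22 深圳市语芯维电子有限公司 Method, system, device, and storage medium for keyboard functions via voice recognition
CN112992134A (zh) * 2019-12-16 2021-06-18 中国科学院沈阳计算技术研究所有限公司 Measurement system input method based on offline speech recognition
JP7365582B2 (ja) * 2020-02-28 2023-10-20 ブラザー工業株式会社 Printing apparatus
CN111899732A (zh) * 2020-06-17 2020-11-06 北京百度网讯科技有限公司 Voice input method and apparatus, and electronic device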

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
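CN102262524A (zh) * 2010-05-27 2011-11-30 鼎亿数码科技(上海)有限公司 Voice recognition input method based on a wireless input device, and implementing apparatus
CN102566762A (zh) * 2010-12-30 2012-07-11 鸿富锦精密工业(深圳)有限公司 Keyboard and terminal device connected thereto
CN104810015A (zh) * 2015-03-24 2015-07-29 深圳市创世达实业有限公司 Voice conversion apparatus and method, and speaker supporting text storage that uses the apparatus
CN106024014A (zh) * 2016-05-24 2016-10-12 努比亚技术有限公司 Voice conversion method and apparatus, and mobile terminal
CN106341532A (zh) * 2016-08-30 2017-01-18 李达航 Method for enabling voice input in various mobile phone applications through a voice input apparatus
CN106896933A (zh) * 2017-01-19 2017-06-27 黄玉玲 Method and apparatus for converting voice input into text input, and voice input device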

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011149558A2 (en) * 2010-05-28 2011-12-01 Abelow Daniel H Reality alternate
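CN102541504A (zh) * 2011-01-04 2012-07-04 鸿富锦精密工业(深圳)有限公司 Speech-to-text conversion apparatus and method
CN103631802B (zh) * 2012-08-24 2015-05-20 腾讯科技(深圳)有限公司 Song information retrieval method and apparatus, and corresponding server
CN103902056B (zh) * 2012-12-28 2018-02-23 华为技术有限公司 Virtual keyboard input method, device, and system
CN104346127B (zh) * 2013-08-02 2018-05-22 腾讯科技(深圳)有限公司 Method, apparatus, and terminal for implementing voice input
CN105138139A (zh) * 2015-08-24 2015-12-09 苏州尚德智产通信技术有限公司 Keyboard, terminal device, and character input method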

Also Published As

Publication number Publication date
CN106896933A (zh) 2017-06-27
US20190333511A1 (en) 2019-10-31
CN106896933B (zh) 2019-12-06
US11087758B2 (en) 2021-08-10

Similar Documents

Publication Publication Date Title
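WO2018133656A1 (zh) Method and apparatus for converting voice input into text input, and voice input device
US11676605B2 (en) Method, interaction device, server, and system for speech recognition
WO2017166649A1 (zh) Voice signal processing method and apparatus
CN109147784B (zh) Voice interaction method, device, and storage medium
KR100819928B1 (ko) Voice recognition apparatus of a portable terminal and method thereof
CN108920128B (zh) Method and system for operating a presentation
WO2017211020A1 (zh) Television control method and apparatus
WO2017161829A1 (zh) Voice signal processing method and apparatus
CN109448709A (zh) Control method for terminal screen projection, and terminal
CN110992955A (zh) Voice operation method, apparatus, and device for a smart device, and storage medium
US20110119639A1 (en) System and method of haptic communication at a portable computing device
CN103973542B (zh) Voice information processing method and apparatus
US11196868B2 (en) Audio data processing method, server, client and server, and storage medium
WO2018000623A1 (zh) Web page control method and apparatus
US9521467B2 (en) Method and apparatus for program information exchange and communications system using a program comment instruction
CN202979200U (zh) Input apparatus and television system
CN106776747A (zh) Method and apparatus for file acquisition and transmission, and electronic device
CN106790985A (zh) Presentation operation control method and apparatus
US20150351144A1 (en) Wireless transmission apparatus and implementation method thereof
KR102574294B1 (ko) Apparatus for providing an artificial intelligence platform and content service method using the same
TWM515143U (zh) Voice translation system and translation processing apparatus
US10412572B2 (en) Device discovering method, apparatus and computer storage medium thereof
KR20140050539A (ko) Input signal conversion apparatus and method
KR20160115566A (ko) Cloud streaming service system, image cloud streaming service method through separation of image and text, and apparatus therefor
CN111754996A (zh) Control method and apparatus based on a voice-simulated remote control, and electronic device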

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17893182

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17893182

Country of ref document: EP

Kind code of ref document: A1