CN116243804A - Voice control keyboard - Google Patents
- Publication number
- CN116243804A
- Authority
- CN
- China
- Prior art keywords
- voice
- keyboard
- module
- command
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/0227—Cooperation and interconnection of the input arrangement with other functional units of a computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Input From Keyboards Or The Like (AREA)
Abstract
A voice control keyboard supports voice input and voice control, lets the user customize voice commands, and can record several people on site at the same time, using voiceprint recognition to distinguish the speakers automatically. Customizing the keyboard through voice commands simplifies human-computer interaction for the user, and automatic recognition of the speaker further improves the automatic voice interaction capability between human and machine.
Description
Technical Field
The disclosure relates to the technical field of voice interaction, and in particular to a voice control keyboard.
Background
At present, the ordinary keyboard serves the human-computer interaction needs of different groups of users and different settings, and with the development of voice recognition technology, voice-input control keyboards have begun to appear. However, the function of these keyboards is currently limited mainly to replacing ordinary keyboard input with voice. In fact, as independent hardware, a keyboard can provide more convenient human-computer interaction functions.
Disclosure of Invention
The present disclosure provides a voice control keyboard that recognizes both voice commands and voice input, can also identify the speaker, and supports customized voice commands, thereby making office work more convenient.
The voice control keyboard provided by the present disclosure includes a mainboard, a housing, and a driver module, wherein:
the housing is provided with character input keys and control keys, these keys including a key for entering a voice input mode and/or a key for entering a voice control mode;
the functional modules running on the mainboard include:
a keyboard control identification module for identifying the input of the various keys;
an integrated voice processing module for receiving voice and preprocessing the voice signal;
a voice recognition module for performing voice recognition and, in combination with the control key input, converting the voice signal into a voice command or text;
the driver module handles communication between the keyboard and the host computer, provides a tool for selecting among voice-to-text results, and executes voice commands.
Further, the driver module includes a voice command customization and launch submodule for: recording the user's mouse and keyboard operations, or obtaining a script command file edited by the user, or setting an executable file to be launched, and defining it as a simple voice command; and, according to the voice command spoken by the user, launching the corresponding executable file, recorded mouse and keyboard operations, or script command.
Further, the housing is provided with an operation key for entering a recording mode;
the keyboard is provided with a hardware storage component for temporarily storing recorded voice and recognition results;
the driver module transfers the temporarily stored recorded voice and recognition results to the host for storage.
Further, the integrated voice processing module is also used to perform voiceprint discrimination on the received voice so as to distinguish different speakers.
Furthermore, the hardware storage component is also used to store the operating system data of the keyboard mainboard, to guard against abnormal power failure.
A method for multi-person on-site recording using the keyboard comprises the following steps:
entering the recording mode by means of the control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signal, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers and generates the corresponding text documents;
during recording, the audio and the recognition results are stored synchronously in the keyboard's hardware storage component; after recording is complete, background noise is removed from the audio, and the audio and the recognition results are stored on the host through the driver module.
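The buffering behavior of the last step can be illustrated with a minimal sketch. It assumes that each recorded segment arrives as a speaker label, raw audio bytes, and recognized text. The names `RecordingBuffer`, `add_segment`, and `flush_to_host`, and the `denoise` callback, are hypothetical stand-ins for the hardware storage component and the driver module, whose interfaces the disclosure does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RecordingBuffer:
    """Hypothetical stand-in for the keyboard's hardware storage component."""
    segments: List[Tuple[str, bytes, str]] = field(default_factory=list)

    def add_segment(self, speaker: str, audio: bytes, text: str) -> None:
        # During recording, audio and its recognition result are stored together.
        self.segments.append((speaker, audio, text))

    def flush_to_host(self, denoise: Callable[[bytes], bytes]) -> dict:
        # After recording: clean the audio, then hand everything to the host.
        audio = b"".join(denoise(a) for _, a, _ in self.segments)
        transcript = [f"{s}: {t}" for s, _, t in self.segments]
        # Assumption: the temporary storage is emptied once data is transferred.
        self.segments.clear()
        return {"audio": audio, "transcript": transcript}


buf = RecordingBuffer()
buf.add_segment("speaker_1", b"\x01\x02", "hello")
buf.add_segment("speaker_2", b"\x03", "hi there")
# The identity function is a placeholder, not real background-noise removal.
result = buf.flush_to_host(denoise=lambda a: a)
```

Passing the noise-removal step in as a callback keeps the sketch independent of any particular signal-processing method, which the disclosure leaves open.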
A method for customizing and launching voice commands using the keyboard comprises the following steps:
using a tool provided by the keyboard driver to: record a series of keyboard and mouse operations and label them with a voice command word; or select a script command file edited by the user and label it with a voice command word; or set the executable file corresponding to a voice command word;
when the user's voice command is recognized, the corresponding keyboard and mouse operations, script command, or executable file is triggered.
Compared with the prior art, the beneficial effects of the present disclosure are: (1) several people at the same site can be recorded, and the speakers are distinguished automatically; (2) customized voice commands further simplify human-computer interaction for the user; (3) the hardware storage component improves the buffering capacity for data and voice.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the disclosure.
FIG. 1 shows a schematic diagram of an exemplary keyboard structure according to the present disclosure;
FIG. 2 shows the distribution of keys on an exemplary keyboard housing.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are illustrated in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The present disclosure provides a keyboard that supports both voice input and voice control. An exemplary embodiment according to the present disclosure, as shown in FIG. 1, includes a mainboard, a keyboard housing, and a keyboard driver.
(1) In addition to the conventional keys, the keyboard housing carries two keys related to voice input and control: a "control" key for entering the voice control mode, and a microphone key for entering the voice input mode. The keyboard housing is also fitted with a keyboard light that supports dimming of the whole keyboard.
(2) An operating system runs on the mainboard, and a voice recognition system program runs within the operating system. The program includes a keyboard key control module, an integrated voice processing module, a voice recognition module, a communication protocol module, an online upgrade module, and the like, wherein:
1) The keyboard key control module identifies key press information and sends it to the computer through the communication protocol module;
2) The integrated voice processing module preprocesses the voice, for example by noise reduction, background noise removal, and echo removal;
3) The voice recognition module recognizes the processed voice signal as the corresponding words, and determines whether they serve as a control command or as output text according to whether the user pressed the control key or the voice (microphone) key.
When the computer is switched to the input method supplied with the keyboard driver and the microphone key is pressed, the microphone key light turns on and recording starts. The keyboard converts the voice into text and outputs it to the input method's candidate box; after the Enter key is pressed, the converted text is output automatically at the cursor. Recording stops automatically after 10 seconds (adjustable in the driver), and the microphone key light turns off. During voice-to-text conversion, the keyboard takes the context into account, judges whether long-text recognition is required, and automatically switches to the corresponding recognition mode to output the recognition result.
The voice recognition of this embodiment also supports an offline recognition mode and switches automatically to a networked recognition mode when a network connection is available. In recognition scenarios that involve context and long text, the networked mode has a slightly higher recognition rate than the offline mode.
4) The communication protocol module communicates with the computer; the online upgrade module performs automatic online upgrades of the mainboard firmware.
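The decision described in item 3), where the pressed key determines whether recognized words become a command or text, can be sketched as follows. This is an illustration under stated assumptions only: the `Mode` enum, the `route` function, and the lowercasing of command words are hypothetical and do not appear in the disclosure.

```python
from enum import Enum


class Mode(Enum):
    VOICE_INPUT = "microphone key"   # recognized words become text
    VOICE_CONTROL = "control key"    # recognized words become a command


def route(recognized: str, mode: Mode) -> dict:
    """Decide how recognized words are used, based on the key that was pressed."""
    words = recognized.strip()
    if mode is Mode.VOICE_CONTROL:
        # Assumption: command words are matched case-insensitively.
        return {"type": "command", "value": words.lower()}
    return {"type": "text", "value": words}


print(route("Open Browser", Mode.VOICE_CONTROL))
print(route("Dear team,", Mode.VOICE_INPUT))
```

The same recognized words therefore lead to different outcomes depending only on the active mode, which matches the role the disclosure assigns to the two dedicated keys.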
The mainboard communicates with the computer through a USB interface. In a preferred embodiment, the mainboard uses an RK3568 chip, a quad-core processor running at up to 2.0 GHz that supports a high-speed USB 3.0 interface. The operating system is Linux, which offers a stable kernel, good real-time performance, strong communication capabilities, and good extensibility, and which facilitates secondary development and the addition of upgraded functions.
Preferably, the integrated voice processing module in the keyboard of this embodiment is also used to perform voiceprint discrimination on the received voice so as to distinguish different speakers. The keyboard is also provided with a hardware memory that temporarily stores the audio file in recording mode. The hardware memory is also used to store the operating system data of the keyboard mainboard, to guard against abnormal power failure.
The keyboard supports multi-person on-site recording through the following steps:
entering the recording mode by means of the control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signal, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers, generates the corresponding text documents, and stores them in a folder at a configured path;
during recording, the audio is stored synchronously in the keyboard's hardware memory; after recording is complete, background noise is removed from the audio, and the audio is saved as an audio file in a folder of the installed keyboard driver.
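The speaker-labeling steps above can be sketched in a few lines. The disclosure does not name a voiceprint algorithm, so this example assumes, purely for illustration, that each voice segment has already been reduced to a numeric feature vector and that segments are grouped by cosine similarity against the first vector seen for each speaker. The function names and the 0.8 threshold are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_speakers(features: List[Sequence[float]], threshold: float = 0.8) -> List[str]:
    """Assign each segment to the first known speaker whose reference
    feature is similar enough; otherwise register a new speaker."""
    references: Dict[str, Sequence[float]] = {}
    labels = []
    for feat in features:
        match = next((name for name, ref in references.items()
                      if cosine(feat, ref) >= threshold), None)
        if match is None:
            match = f"speaker_{len(references) + 1}"
            references[match] = feat
        labels.append(match)
    return labels


def build_transcript(labels: List[str], texts: List[str]) -> str:
    # One line per segment, prefixed with the speaker label.
    return "\n".join(f"{who}: {what}" for who, what in zip(labels, texts))


# Toy 2-D "voiceprint" vectors; real features would come from an acoustic model.
feats = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1)]
labels = label_speakers(feats)
print(build_transcript(labels, ["Good morning.", "Morning.", "Let us begin."]))
```

In this toy run the first and third segments are attributed to the same speaker, so the resulting text document interleaves two labeled voices in the order they spoke.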
The exemplary keyboard driver may include built-in voice-command script commands that open a particular application or trigger a particular operation, such as simulating mouse and keyboard operations, searching in a browser, or fast-forwarding and pausing an audio or video player.
Customized keyboard drivers are also supported. Such a driver is tightly integrated with the software used on the computer to implement specific voice commands for fast office work, equipment operation, and the like. After a voice control command is recognized, the keyboard driver script performs the corresponding control action.
The keyboard also lets users customize voice commands themselves, by recording a series of the user's mouse and keyboard operations, or by using a script command file written by the user, and defining it as a simple voice control command:
(a) The first mode comprises the following steps:
first, the keyboard and mouse operations are recorded with the driver tool (a software tool supplied with the keyboard driver);
then, the recorded operations are labeled with a voice command word;
when the voice command is triggered, the keyboard and mouse operations are triggered.
For example, the user can use this method to set up a function that opens an application directly: first, select the "open application directly" function in the driver tool and click the executable file of the program to be opened; then label it with a voice command word. When the voice command is triggered, the corresponding program opens.
(b) The second mode comprises the following steps: the user edits the desired script command, selects the autonomous script trigger function in the driver tool, and selects the edited script; the script is then labeled with a voice command word. When the voice command is triggered, the script command is triggered.
In addition, the executable file corresponding to a given voice command, such as QQ.exe or Photoshop.exe, can be registered directly through a setting in the driver; the executable file is launched when the keyboard recognizes that the user has spoken the voice command.
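The three ways of binding a voice command word (a recorded operation sequence, a user-edited script, or an executable file) can be pictured as a lookup table from command word to action. The sketch below is illustrative only: the `VoiceCommandRegistry` class and its method names are hypothetical, and each action merely returns a description of what would happen instead of replaying input or starting a process.

```python
from typing import Callable, Dict, List


class VoiceCommandRegistry:
    """Hypothetical registry mapping a voice command word to an action."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}

    def add_macro(self, word: str, events: List[str]) -> None:
        # Mode (a): a recorded series of keyboard and mouse operations.
        self._actions[word] = lambda: "replay " + ", ".join(events)

    def add_script(self, word: str, script_path: str) -> None:
        # Mode (b): a script command file edited by the user.
        self._actions[word] = lambda: f"run script {script_path}"

    def add_executable(self, word: str, exe_path: str) -> None:
        # Direct mapping from a command word to an executable file.
        self._actions[word] = lambda: f"launch {exe_path}"

    def trigger(self, word: str) -> str:
        # Called once the recognizer reports that a command word was spoken.
        action = self._actions.get(word)
        return action() if action else "unrecognized command"


registry = VoiceCommandRegistry()
registry.add_executable("open chat", "QQ.exe")
registry.add_macro("save file", ["Ctrl+S"])
registry.add_script("backup", "backup.bat")  # hypothetical script name
print(registry.trigger("open chat"))
```

Storing every binding behind the same callable interface means the recognition side only needs to know the command word, not which of the three kinds of action it maps to.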
The foregoing technical solutions are merely exemplary embodiments of the present invention. Based on the application methods and principles disclosed herein, those skilled in the art can readily make various modifications and variations that are not limited to the methods described in the specific embodiments above; the foregoing description is therefore only preferred and is not limiting.
Claims (7)
1. A voice control keyboard, comprising: a mainboard, a housing, and a driver module, wherein:
the housing is provided with character input keys and control keys, these keys including a key for entering a voice input mode and/or a key for entering a voice control mode;
the functional modules running on the mainboard include:
a keyboard control identification module for identifying the input of the various keys;
an integrated voice processing module for receiving voice and preprocessing the voice signal;
a voice recognition module for performing voice recognition and, in combination with the control key input, converting the voice signal into a voice command or text;
the driver module handles communication between the keyboard and the host computer, provides a tool for selecting among voice-to-text results, and executes voice commands.
2. The keyboard of claim 1, wherein the driver module comprises a voice command customization and launch submodule for: recording the user's mouse and keyboard operations, or obtaining a script command file edited by the user, or setting an executable file to be launched, and defining it as a simple voice command; and, according to the voice command spoken by the user, launching the corresponding executable file, recorded mouse and keyboard operations, or script command.
3. The keyboard of claim 1 or 2, wherein:
the housing is provided with an operation key for entering a recording mode;
the keyboard is provided with a hardware storage component for temporarily storing recorded voice and recognition results;
the driver module transfers the temporarily stored recorded voice and recognition results to the host for storage.
4. The keyboard of claim 3, wherein the integrated voice processing module is further configured to perform voiceprint discrimination on received voice so as to distinguish different speakers.
5. The keyboard of claim 3, wherein the hardware storage component is also used to store the operating system data of the keyboard mainboard, to guard against abnormal power failure.
6. A method of multi-person on-site recording using the keyboard of claim 4, comprising the steps of:
entering the recording mode by means of the control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signal, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers and generates the corresponding text documents;
during recording, the audio and the recognition results are stored synchronously in the keyboard's hardware storage component; after recording is complete, background noise is removed from the audio, and the audio and the recognition results are stored on the host through the driver module.
7. A method for customizing and launching voice commands using the keyboard of claim 2, comprising:
using a tool provided by the keyboard driver to: record a series of keyboard and mouse operations and label them with a voice command word; or select a script command file edited by the user and label it with a voice command word; or set the executable file corresponding to a voice command word;
when the user's voice command is recognized, the corresponding keyboard and mouse operations, script command, or executable file is triggered.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211558374.6A CN116243804A (en) | 2022-12-06 | 2022-12-06 | Voice control keyboard |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211558374.6A CN116243804A (en) | 2022-12-06 | 2022-12-06 | Voice control keyboard |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116243804A true CN116243804A (en) | 2023-06-09 |
Family
ID=86628447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211558374.6A Pending CN116243804A (en) | 2022-12-06 | 2022-12-06 | Voice control keyboard |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116243804A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||