CN116243804A - Voice control keyboard - Google Patents

Info

Publication number
CN116243804A
Authority
CN
China
Prior art keywords
voice
keyboard
module
command
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211558374.6A
Other languages
Chinese (zh)
Inventor
杨坤
李焕梅
吴艳玲
周翠华
许朋
董小格
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Hanguang Heavy Industry Ltd
Original Assignee
Hebei Hanguang Heavy Industry Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Hanguang Heavy Industry Ltd filed Critical Hebei Hanguang Heavy Industry Ltd
Priority to CN202211558374.6A priority Critical patent/CN116243804A/en
Publication of CN116243804A publication Critical patent/CN116243804A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/0227Cooperation and interconnection of the input arrangement with other functional units of a computer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/02Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Input From Keyboards Or The Like (AREA)

Abstract

A voice control keyboard supports both voice input and voice control, lets the user customize voice commands, and can record several people on site at the same time, automatically distinguishing the speakers by voiceprint recognition. Customizable voice commands simplify the user's human-computer interaction, and automatic speaker recognition further improves automatic voice interaction between human and machine.

Description

Voice control keyboard
Technical Field
The disclosure relates to the technical field of voice interaction, in particular to a voice control keyboard.
Background
At present, ordinary keyboards serve the human-computer interaction needs of different user groups and usage scenarios. With the development of voice recognition technology, voice-input keyboards have begun to appear, but their function is currently limited mainly to replacing ordinary keyboard input with voice. In fact, as an independent piece of hardware, a keyboard can provide more convenient human-computer interaction functions.
Disclosure of Invention
The present disclosure provides a voice control keyboard that recognizes both voice commands and voice input, identifies the speaker, and supports customized voice commands, making office work more convenient.
The voice control keyboard provided by the present disclosure includes a mainboard, a housing, and a driver module, wherein:
the housing is provided with character input keys and control keys, which include a key for entering a voice input mode and/or a key for entering a voice control mode;
the functional modules running on the mainboard comprise:
a keyboard control recognition module for recognizing the input of the various keys;
an integrated voice processing module for receiving voice and preprocessing the voice signals;
a voice recognition module for performing voice recognition and, in combination with the control key input, converting voice signals into voice commands or text;
the driver module handles communication between the keyboard and the host computer, provides a tool for selecting among voice-to-text results, and executes voice commands.
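The division of labor among these modules can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `KeyMode`, `RecognitionResult`, `preprocess`, `recognize`, and `route` are hypothetical, the noise gate merely stands in for real preprocessing, and the recognizer is a placeholder that returns a fixed phrase.

```python
from dataclasses import dataclass
from enum import Enum, auto


class KeyMode(Enum):
    """Which voice key the user pressed (hypothetical names)."""
    VOICE_INPUT = auto()    # microphone key: dictate text
    VOICE_CONTROL = auto()  # control key: speak a command


@dataclass
class RecognitionResult:
    kind: str   # "text" or "command"
    value: str


def preprocess(samples: list[float], noise_floor: float = 0.02) -> list[float]:
    """Stand-in for the integrated voice processing module: a crude noise gate."""
    return [s if abs(s) >= noise_floor else 0.0 for s in samples]


def recognize(samples: list[float]) -> str:
    """Placeholder recognizer; a real keyboard would run a speech model here."""
    return "open browser" if any(samples) else ""


def route(samples: list[float], mode: KeyMode) -> RecognitionResult:
    """Combine the recognized words with the key state to pick the output type."""
    words = recognize(preprocess(samples))
    kind = "command" if mode is KeyMode.VOICE_CONTROL else "text"
    return RecognitionResult(kind=kind, value=words)
```

In this sketch the same recognized words become a command when the control key was pressed and plain text otherwise, which is the only behavior the text itself specifies.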
Further, the driver module includes a voice command customization and launch submodule for: recording the user's mouse and keyboard operations, or obtaining a script command file edited by the user, or setting an executable file to be launched, and defining it as a simple voice command; and, according to the voice command spoken by the user, launching the corresponding executable file, recorded mouse and keyboard operations, or script command.
Further, the housing is provided with an operation key for entering a recording mode;
the keyboard is provided with a hardware storage component for temporarily storing recorded voice and recognition results;
the driver module transfers the temporarily stored recordings and recognition results to the host for storage.
Further, the integrated voice processing module also performs voiceprint discrimination on the received voice to distinguish different speakers.
Further, the hardware storage component also stores the operating system data used by the keyboard mainboard, to guard against abnormal power loss.
A method for multi-person on-site recording using the keyboard comprises the following steps:
entering the recording mode by a control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signals, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers and generates the corresponding text documents;
during recording, the audio and the recognition results are stored synchronously in the keyboard's hardware storage component; after recording is complete, background noise is removed from the audio, and the audio and the recognition results are stored on the host through the driver module.
A method for customizing and launching voice commands using the keyboard comprises the following steps:
using a tool provided by the keyboard driver to: record a series of keyboard and mouse operations and label them with a voice command word; or select a script command file edited by the user and label it with a voice command word; or set the executable file corresponding to a voice command word;
when the voice command is recognized as spoken by the user, the corresponding keyboard and mouse operations, script command, or executable file is triggered.
Compared with the prior art, the beneficial effects of the present disclosure are: (1) multiple people at the same site can be recorded and the speakers are distinguished automatically; (2) customizable voice commands further simplify the user's human-computer interaction; (3) the hardware storage component improves the buffering capacity for data and voice.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the disclosure.
FIG. 1 shows a schematic diagram of an exemplary keyboard structure according to the present disclosure;
FIG. 2 shows the distribution of keys on an exemplary keyboard housing.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are illustrated in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The present disclosure provides a keyboard that supports both voice input and voice control. An exemplary embodiment according to the present disclosure, as shown in FIG. 1, includes a mainboard, a keyboard housing, and a keyboard driver.
(1) In addition to the conventional keys, the keyboard housing carries two keys for voice input and control: a "control" key for entering the voice control mode, and a "microphone" key for entering the voice input mode. The housing also has a keyboard backlight that supports dimming of the whole keyboard.
(2) An operating system runs on the mainboard, and a voice recognition system program runs within it. The program comprises a keyboard key control module, an integrated voice processing module, a voice recognition module, a communication protocol module, an online upgrade module, and so on, wherein:
1) The keyboard key control module recognizes keyboard key information and sends it to the computer through the communication protocol module;
2) The integrated voice processing module preprocesses the voice, for example by noise reduction, background noise removal, and echo removal;
3) The voice recognition module recognizes the processed voice signals as the corresponding words, and decides whether to treat them as a control command or to output them as text according to whether the user pressed the control key or the microphone key.
When the input method bundled with the keyboard driver is selected and the microphone key is pressed, the microphone key light turns on and recording starts. The keyboard converts the voice into text and outputs it to the input method's candidate box; after the Enter key is pressed, the converted text is automatically output at the cursor. Recording stops automatically after 10 seconds (adjustable in the driver), and the microphone key light turns off. During voice-to-text conversion the keyboard takes the context into account, judges whether long-text recognition is needed, and automatically switches to the corresponding recognition mode to output the result.
The voice recognition of this embodiment also supports an offline recognition mode and switches automatically to the online recognition mode when a network connection is available. In recognition tasks that involve context and long text, the online mode has a slightly higher recognition rate than the offline mode.
4) The communication protocol module handles communication with the computer; the online upgrade module performs automatic online upgrades of the mainboard firmware.
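Two behaviors described for the voice recognition module, the automatic 10-second recording cutoff and the switch between offline and online recognition, can be sketched as follows. This is an illustration under assumptions: `RecordingWindow` and `choose_recognition_mode` are hypothetical names, and timestamps are passed in explicitly rather than read from a clock so that the logic is easy to check.

```python
from dataclasses import dataclass


@dataclass
class RecordingWindow:
    """Tracks the auto-stop window opened when the microphone key is pressed."""
    started_at: float          # timestamp in seconds when recording began
    timeout_s: float = 10.0    # default from the text; adjustable in the driver

    def is_active(self, now: float) -> bool:
        """True while recording should continue (microphone key light on)."""
        return (now - self.started_at) < self.timeout_s


def choose_recognition_mode(network_available: bool) -> str:
    """Offline recognition by default; online recognition when networked."""
    return "online" if network_available else "offline"
```

A driver setting could be modeled simply by constructing the window with a different `timeout_s`, for example `RecordingWindow(started_at=0.0, timeout_s=30.0)`.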
The mainboard communicates with the computer through a USB interface. In a preferred embodiment, the mainboard uses an RK3568 chip, a quad-core processor running at up to 2.0 GHz that supports a high-speed USB 3.0 interface. The operating system is Linux, which offers a stable kernel, good real-time performance, strong communication capabilities, and good extensibility, and which facilitates secondary development and later functional upgrades.
Preferably, the integrated voice processing module in the keyboard of this embodiment also performs voiceprint discrimination on the received voice to distinguish different speakers. The keyboard is also equipped with a hardware memory that temporarily stores the audio files in recording mode. The hardware memory also stores the operating system data used by the keyboard mainboard, to guard against abnormal power loss.
The keyboard can record multiple people on site, through the following steps:
entering the recording mode by a control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signals, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers, generates the corresponding text documents, and stores them in a configured folder path;
during recording, the audio is stored synchronously in the keyboard's hardware memory; after recording is complete, background noise is removed from the audio, and the audio is saved as an audio file in a folder under the keyboard driver's installation directory.
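The speaker-labeling step can be sketched in Python as follows. The patent does not say how voiceprints are compared, so this example assumes that each voice segment has already been reduced to a numeric voiceprint vector and uses cosine similarity with a fixed threshold; `cosine`, `label_speakers`, `build_transcript`, and the threshold value are all hypothetical choices for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two voiceprint vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_speakers(voiceprints: list[list[float]], threshold: float = 0.8) -> list[str]:
    """Assign each segment to the most similar known speaker, or enroll a
    new speaker when no reference voiceprint reaches the threshold."""
    references: list[list[float]] = []
    labels: list[str] = []
    for vp in voiceprints:
        best, best_sim = -1, -1.0
        for i, ref in enumerate(references):
            sim = cosine(vp, ref)
            if sim > best_sim:
                best, best_sim = i, sim
        if best_sim < threshold:
            references.append(vp)          # first segment of a new speaker
            best = len(references) - 1
        labels.append(f"Speaker {best + 1}")
    return labels


def build_transcript(labels: list[str], texts: list[str]) -> str:
    """Join per-segment recognition results into a speaker-labeled document."""
    return "\n".join(f"{who}: {text}" for who, text in zip(labels, texts))
```

The first segment from each new voice becomes that speaker's reference, so the labels reflect order of first appearance rather than any enrolled identity.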
The exemplary keyboard driver may provide voice-command-driven script commands that open a specific application or perform a specific operation, such as simulating mouse and keyboard operations, searching in a browser, or fast-forwarding and pausing an audio or video player.
Customized keyboard drivers are also supported. These integrate deeply with the software used on the computer to implement specific voice commands for fast office work, equipment operation, and the like. After a voice control command is recognized, the keyboard driver script performs the control action corresponding to it.
The keyboard also allows users to customize voice commands themselves, by recording a series of the user's mouse and keyboard operations, or by using a script command file written by the user, and defining it as a simple voice control command:
(a) The first mode comprises the following steps:
first, the keyboard and mouse operations are recorded with the driver tool (a software tool provided with the keyboard driver);
then, the recorded operations are labeled with a voice command word;
when the voice command is triggered, the keyboard and mouse operations are replayed.
For example, the user can use this method to set up a function that opens an application directly: first, select the "open application directly" function in the driver tool and click the executable file of the program to be opened; then label it with a voice command word; when the voice command is triggered, the corresponding program opens.
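The record, label, and replay steps of the first mode can be sketched as a small data structure. This is a hedged illustration, not the driver tool itself: `InputEvent` and `MacroRecorder` are hypothetical names, and the `emit` callback stands in for whatever mechanism the driver uses to inject keyboard and mouse events into the host.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InputEvent:
    device: str   # "keyboard" or "mouse"
    action: str   # e.g. "press" or "click"
    detail: str   # key name or "x,y" position


@dataclass
class MacroRecorder:
    events: list[InputEvent] = field(default_factory=list)
    macros: dict[str, tuple[InputEvent, ...]] = field(default_factory=dict)

    def capture(self, event: InputEvent) -> None:
        """Record one keyboard or mouse operation."""
        self.events.append(event)

    def label(self, command_word: str) -> None:
        """Bind everything captured so far to a voice command word."""
        self.macros[command_word] = tuple(self.events)
        self.events = []

    def replay(self, command_word: str, emit) -> int:
        """Re-emit the stored events through `emit`; returns how many were sent."""
        sequence = self.macros.get(command_word, ())
        for event in sequence:
            emit(event)
        return len(sequence)
```

For instance, capturing a "ctrl+s" key press, calling `label("save file")`, and later calling `replay("save file", emit)` would pass that one event to `emit`; an unknown command word replays nothing.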
(b) The second mode comprises the following steps: the user edits the script command they need, selects the custom script trigger function in the driver tool, and selects the edited script; the script is then labeled with a voice command word; when the voice command is triggered, the script command runs.
In addition, the executable file corresponding to a given voice command, such as QQ.exe or photoshop.exe, can be registered directly through a setting in the driver; the executable is launched when the keyboard recognizes that the user has spoken the voice command.
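The three kinds of customized command (recorded operations, a script, an executable) can all be looked up through one table keyed by voice command word. The sketch below assumes such a table; `VoiceAction`, `CommandRegistry`, and the handler callbacks are hypothetical, and the handlers are injected rather than hard-coded so that nothing is actually launched in the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VoiceAction:
    kind: str     # "macro", "script", or "executable"
    target: str   # macro name, script path, or executable path


class CommandRegistry:
    def __init__(self, handlers: dict[str, Callable[[str], None]]):
        # One handler per action kind, supplied by the caller (an assumption).
        self._handlers = handlers
        self._actions: dict[str, VoiceAction] = {}

    def register(self, command_word: str, action: VoiceAction) -> None:
        """Label an action with a voice command word."""
        if action.kind not in self._handlers:
            raise ValueError(f"unsupported action kind: {action.kind}")
        self._actions[self._normalize(command_word)] = action

    def trigger(self, recognized: str) -> bool:
        """Run the action bound to the recognized words; False if none is bound."""
        action = self._actions.get(self._normalize(recognized))
        if action is None:
            return False
        self._handlers[action.kind](action.target)
        return True

    @staticmethod
    def _normalize(word: str) -> str:
        # Lowercase and collapse whitespace so minor recognition differences match.
        return " ".join(word.lower().split())
```

In a real driver the "executable" handler might wrap something like `subprocess.Popen`, but that choice is outside what the text specifies.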
The foregoing technical solutions are merely exemplary embodiments of the present invention. Those skilled in the art can readily make various modifications and variations based on the application methods and principles disclosed herein, which are not limited to the methods described in the specific embodiments above; the foregoing description is therefore only preferred and is not to be taken in a limiting sense.

Claims (7)

1. A voice control keyboard, comprising a mainboard, a housing, and a driver module, wherein:
the housing is provided with character input keys and control keys, which include a key for entering a voice input mode and/or a key for entering a voice control mode;
the functional modules running on the mainboard comprise:
a keyboard control recognition module for recognizing the input of the various keys;
an integrated voice processing module for receiving voice and preprocessing the voice signals;
a voice recognition module for performing voice recognition and, in combination with the control key input, converting voice signals into voice commands or text;
the driver module handles communication between the keyboard and the host computer, provides a tool for selecting among voice-to-text results, and executes voice commands.
2. The keyboard of claim 1, wherein the driver module comprises a voice command customization and launch submodule for: recording the user's mouse and keyboard operations, or obtaining a script command file edited by the user, or setting an executable file to be launched, and defining it as a simple voice command; and, according to the voice command spoken by the user, launching the corresponding executable file, recorded mouse and keyboard operations, or script command.
3. The keyboard of claim 1 or 2, wherein:
the housing is provided with an operation key for entering a recording mode;
the keyboard is provided with a hardware storage component for temporarily storing recorded voice and recognition results;
the driver module transfers the temporarily stored recordings and recognition results to the host for storage.
4. The keyboard of claim 3, wherein the integrated voice processing module also performs voiceprint discrimination on the received voice to distinguish different speakers.
5. The keyboard of claim 3, wherein the hardware storage component also stores the operating system data used by the keyboard mainboard, to guard against abnormal power loss.
6. A method of multi-person on-site recording using the keyboard of claim 4, comprising the steps of:
entering the recording mode by a control key or a voice command;
the integrated voice processing module extracts voiceprint features from the captured voice signals, distinguishes the different speakers, and labels the voice accordingly;
the voice labeled with voiceprint features is sent to the voice recognition module for recognition;
the voice recognition module outputs the recognition results for the different speakers and generates the corresponding text documents;
during recording, the audio and the recognition results are stored synchronously in the keyboard's hardware storage component; after recording is complete, background noise is removed from the audio, and the audio and the recognition results are stored on the host through the driver module.
7. A method for customizing and launching voice commands using the keyboard of claim 2, comprising:
using a tool provided by the keyboard driver to: record a series of keyboard and mouse operations and label them with a voice command word; or select a script command file edited by the user and label it with a voice command word; or set the executable file corresponding to a voice command word;
when the voice command is recognized as spoken by the user, the corresponding keyboard and mouse operations, script command, or executable file is triggered.
CN202211558374.6A 2022-12-06 2022-12-06 Voice control keyboard Pending CN116243804A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211558374.6A CN116243804A (en) 2022-12-06 2022-12-06 Voice control keyboard

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211558374.6A CN116243804A (en) 2022-12-06 2022-12-06 Voice control keyboard

Publications (1)

Publication Number Publication Date
CN116243804A true CN116243804A (en) 2023-06-09

Family

ID=86628447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211558374.6A Pending CN116243804A (en) 2022-12-06 2022-12-06 Voice control keyboard

Country Status (1)

Country Link
CN (1) CN116243804A (en)

Similar Documents

Publication Publication Date Title
KR101213835B1 (en) Verb error recovery in speech recognition
US7260529B1 (en) Command insertion system and method for voice recognition applications
US6415258B1 (en) Background audio recovery system
US5208897A (en) Method and apparatus for speech recognition based on subsyllable spellings
US6233559B1 (en) Speech control of multiple applications using applets
JP4987623B2 (en) Apparatus and method for interacting with user by voice
CN100403828C (en) Portable digital mobile communication apparatus and voice control method and system thereof
US20160328205A1 (en) Method and Apparatus for Voice Operation of Mobile Applications Having Unnamed View Elements
US6513009B1 (en) Scalable low resource dialog manager
JP7328265B2 (en) VOICE INTERACTION CONTROL METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND SYSTEM
WO2020024620A1 (en) Voice information processing method and device, apparatus, and storage medium
US6499015B2 (en) Voice interaction method for a computer graphical user interface
US8606560B2 (en) Automatic simultaneous interpertation system
JP2007264471A (en) Voice recognition device and method therefor
EP1346343A1 (en) Speech recognition using word-in-phrase command
JP2006515073A (en) Method, system, and programming for performing speech recognition
CN106710593A (en) Method for adding account, terminal, and server
KR20080083290A (en) A method and apparatus for accessing a digital file from a collection of digital files
CN101825953A (en) Chinese character input product with combined voice input and Chinese phonetic alphabet input functions
CN116243804A (en) Voice control keyboard
US7036130B2 (en) Method for expanding in friendly manner the functionality of a portable electronic device and corresponding portable electronic device
CN100375084C (en) Computer with language re-reading function and its realizing method
JPH04311222A (en) Portable computer apparatus for speech processing of electronic document
JP2001306090A (en) Device and method for interaction, device and method for voice control, and computer-readable recording medium with program for making computer function as interaction device and voice control device recorded thereon
JP7511623B2 (en) Information processing device, information processing system, information processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination