WO2019169722A1 - Shortcut key recognition method and apparatus, device, and computer-readable storage medium - Google Patents

Shortcut key recognition method and apparatus, device, and computer-readable storage medium

Info

Publication number
WO2019169722A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
semantic
text
voice
shortcut
Application number
PCT/CN2018/085255
Other languages
French (fr)
Chinese (zh)
Inventor
刘万晶
黄胜彪
徐钊
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2019169722A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0238 Programmable keyboards
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a shortcut key identification method, apparatus, device, and computer readable storage medium.
  • an Integrated Development Environment (IDE) is an application that provides a program development environment, generally including tools such as a code editor, compiler, debugger, and graphical user interface.
  • it is an integrated development software suite that combines code writing, analysis, compilation, and debugging functions.
  • the embodiment of the present application provides a shortcut key identification method, apparatus, device, and computer readable storage medium, which can simplify the identification of shortcut keys, assist rapid development and programming, and improve work efficiency.
  • the embodiment of the present application provides a shortcut key identification method, where the method includes:
  • the embodiment of the present application further provides a shortcut key identification device, and the device includes:
  • a reading unit, configured to read a configuration file of the system to determine related shortcut keys, wherein each shortcut key has a corresponding shortcut operation instruction; an analysis unit, configured to perform voice recognition on the acquired voice information to obtain text information, and to perform semantic analysis on the text information to determine corresponding semantic information; and a determining unit, configured to perform text recognition matching on the semantic information according to a preset rule to determine the shortcut operation instruction that matches the semantic information.
  • the embodiment of the present application further provides a computer device, including: a memory, configured to store a program for implementing shortcut key identification; and a processor, configured to execute the program for implementing shortcut key identification stored in the memory, to perform the method as described above.
  • an embodiment of the present application further provides a computer readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the method as described above.
  • the implementation of the embodiment of the present application not only simplifies the identification of the shortcut keys, but also assists in rapid system development and programming, and improves work efficiency.
  • FIG. 1 is a schematic flow chart of a shortcut key identification method provided by an embodiment of the present application.
  • FIG. 2 is another schematic flowchart of a shortcut key identification method provided by an embodiment of the present application.
  • FIG. 3 is another schematic flowchart of a shortcut key identification method according to an embodiment of the present application.
  • FIG. 5 is another schematic flowchart of a shortcut key identification method according to an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a shortcut key identification apparatus according to an embodiment of the present application.
  • FIG. 7 is another schematic block diagram of a shortcut key identification apparatus according to an embodiment of the present application.
  • FIG. 8 is another schematic block diagram of a shortcut key identification apparatus according to an embodiment of the present application.
  • FIG. 9 is another schematic block diagram of a shortcut key identification apparatus according to an embodiment of the present application.
  • FIG. 10 is another schematic block diagram of a shortcut key identification apparatus according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • FIG. 1 is a schematic flow chart of a method for identifying a shortcut key according to an embodiment of the present application.
  • the method can be run on terminals such as smart phones (such as Android phones, iOS phones, etc.), tablets, laptops, and smart devices.
  • the method can be applied to various development tools to assist programming, and can also be applied to office software such as OFFICE, thereby implementing shortcut operations such as creating a class and formatting code; the development tool can be Eclipse, IntelliJ IDEA, etc.
  • the steps of the method include S101 to S104.
  • S101 Read a configuration file of the system to determine related shortcut keys, wherein each shortcut key has a corresponding shortcut operation instruction.
  • the configuration file of the system may be a configuration file of the development tool IDE, or a configuration file of an office software such as OFFICE.
  • the shortcut operation instruction corresponding to the shortcut key can be obtained and executed, thereby realizing the recognition and use of the shortcut key.
  • for example, the shortcut key corresponding to “Copy” may be “ctrl+C”, and the “ctrl+C” shortcut key corresponds to a shortcut operation instruction for copying.
  • similarly, the shortcut key corresponding to “Paste” may be “ctrl+V”, and the “ctrl+V” shortcut key corresponds to a shortcut operation instruction for pasting.
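Step S101 can be illustrated with a minimal sketch. The JSON layout, the instruction identifiers (`edit.copy`, `edit.paste`), and the function name `load_shortcuts` are assumptions made for illustration only; real IDE or office-software configuration files each use their own formats.

```python
import json

# Hypothetical keymap content; real IDE or office-software configuration
# files use their own formats, so this layout is an assumption.
SAMPLE_CONFIG = """
{
  "shortcuts": [
    {"name": "Copy",  "keys": "ctrl+C", "instruction": "edit.copy"},
    {"name": "Paste", "keys": "ctrl+V", "instruction": "edit.paste"}
  ]
}
"""

def load_shortcuts(config_text):
    """Return {operation name: (shortcut key, shortcut operation instruction)}."""
    config = json.loads(config_text)
    return {
        entry["name"]: (entry["keys"], entry["instruction"])
        for entry in config["shortcuts"]
    }

shortcuts = load_shortcuts(SAMPLE_CONFIG)
print(shortcuts["Copy"])
```

Keeping the key and the instruction together in one mapping means that later steps only need to resolve an operation name to find both.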
  • S102 Perform speech recognition on the acquired voice information to obtain text information, and perform semantic analysis on the text information to determine corresponding semantic information.
  • the text information corresponding to the voice information can be obtained by performing voice recognition on the voice information sent by the user, and the text information can then be semantically analyzed to obtain the semantic information corresponding to the text information.
  • the step S102 specifically includes steps S201 to S202.
  • S201 Perform speech recognition on the acquired voice information to convert the voice information into text information.
  • the obtained voice information may be processed correspondingly through the smart voice API.
  • the smart voice API may be an API for voice recognition.
  • the step S201 specifically includes steps S301 to S303.
  • voice activity detection (VAD), a voice signal processing technique, is used to cut off the silence at the beginning and end of the acquired voice information. This yields pure sound wave information and reduces noise interference.
  • voice activity detection is mainly used in speech coding and speech recognition. It can simplify speech processing, and can also be used to remove non-speech segments during an audio session, for example to avoid encoding and transmitting silent data packets in IP telephony applications, which saves computation time and bandwidth. In this way, voice activity detection makes a range of applications practical.
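The trimming of leading and trailing silence can be sketched with a simple energy threshold. This is only one possible VAD approach, chosen here for brevity; the source does not say which detector is used, and the frame length and threshold below are assumed values.

```python
def trim_silence(samples, frame_len=160, threshold=0.01):
    """Drop leading and trailing frames whose mean energy is below threshold."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    voiced = [i for i, e in enumerate(energies) if e >= threshold]
    if not voiced:
        return []  # nothing but silence
    start, end = voiced[0], voiced[-1] + 1
    return [x for f in frames[start:end] for x in f]

# Synthetic signal: silence, a burst of "speech", then silence again.
signal = [0.0] * 320 + [0.5, -0.5] * 240 + [0.0] * 320
trimmed = trim_silence(signal)
print(len(signal), len(trimmed))
```

Only the first and last silent stretches are removed; pauses inside the utterance are kept, which matches the description of cutting off the first and last segments.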
  • the moving window function truncates the signal, thereby framing the pure sound wave information.
  • a common acoustic feature extraction method is to extract MFCC (Mel-frequency cepstral coefficient) features: based on the physiological characteristics of the human ear, each frame's waveform is turned into a multi-dimensional vector, which can be understood simply as containing the content information of that frame of speech.
  • after acoustic feature extraction, the pure sound wave information becomes a matrix with 12 rows (assuming the acoustic features are 12-dimensional) and N columns, called the observation sequence, where N is the total number of frames.
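The framing step can be sketched as follows. The Hamming window, the 400-sample frame length, and the 160-sample hop (25 ms and 10 ms at 16 kHz) are common choices assumed here, not values taken from the source, and the MFCC computation itself is omitted. The number of frames returned is the N of the observation sequence.

```python
import math

def frame_signal(samples, frame_len=400, hop=160):
    """Split samples into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# One second of a 440 Hz tone sampled at 16 kHz (synthetic test input).
sr = 16000
samples = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
frames = frame_signal(samples)
print(len(frames), len(frames[0]))
```

A feature extractor would then map each windowed frame to one column of the observation matrix.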
  • S303 Construct a state network by using a hidden Markov model, and find a path that best matches the pure sound wave information from the state network to obtain corresponding text information.
  • the constructed state network is obtained by expanding a word-level network into a phoneme network and then into a state network.
  • the speech recognition process is actually searching for an optimal path in the state network.
  • the best path is the one for which the probability of the voice information is the largest; finding it is called “decoding”.
  • the path search uses the Viterbi algorithm, a dynamic programming algorithm with pruning, to find the globally optimal path.
  • several speech frames correspond to one state, every three states are combined into one phoneme, and several phonemes are combined into one word; the sound wave information is thus finally converted into text information through matching against the state network.
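The path search can be illustrated with a toy Viterbi decoder. The two states, the three quantized observation symbols, and all probabilities below are made-up values for illustration; a real recognizer scores continuous feature vectors, works over a much larger state network, and prunes unlikely paths, which this sketch does not do.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence."""
    # best[s] holds (probability, path) of the best path ending in state s.
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            prob, path = max(
                (best[prev][0] * trans_p[prev][s] * emit_p[s][obs], best[prev][1])
                for prev in states
            )
            new_best[s] = (prob, path + [s])
        best = new_best
    return max(best.values())

# Toy model: all names and numbers are hypothetical.
states = ["s1", "s2"]
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit_p = {
    "s1": {"lo": 0.5, "mid": 0.4, "hi": 0.1},
    "s2": {"lo": 0.1, "mid": 0.3, "hi": 0.6},
}

prob, path = viterbi(["lo", "mid", "hi"], states, start_p, trans_p, emit_p)
print(path, prob)
```

Because each step keeps only the best path into every state, the cost grows linearly with the number of frames rather than exponentially.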
  • S202 Perform semantic analysis on the text information by using a natural language processing algorithm to obtain corresponding semantic information.
  • Natural Language Processing is a sub-area of artificial intelligence (AI). Here it uses the dependency relations between the words in a sentence to represent the syntactic structure information of the words (such as subject-predicate, verb-object, and attributive-head relations), and uses a tree structure to represent the structure of the whole sentence (such as subject-predicate-object, attributive-adverbial-complement, etc.).
  • the semantic backbone and related semantic components are extracted to help the intelligent product realize the accurate understanding of the user's intention.
  • the step S202 specifically includes steps S401 to S403.
  • S401 Perform word segmentation and part-of-speech tagging on the text information to obtain a plurality of words marked with part of speech.
  • the natural language processing is to enable the computer to understand the human language, that is, to understand the meaning behind the text.
  • word segmentation is the basis of natural language processing; for example, it helps to extract keywords and classify them. Word segmentation is typically based on a conditional random field (CRF), which segments the text using features such as position labels and part of speech, yielding several words, also referred to as terms.
  • a weight can be calculated for each term after the word segmentation, and the important term should be given a higher weight.
  • for example, the term weighting result of "what exercise is more helpful for weight loss" may be: "What 0.1, exercise 0.5, 0.1, weight loss 0.8, help 0.3, larger 0.2".
  • the weighting formula of Term Weighting generally consists of three parts: local, global and normalization.
  • term weighting methods include TF-IDF, Okapi, MI, LTU, ATC, TF-ICF, etc. Combining different formulas for the local, global, and normalization parts yields different term weighting calculation methods. That is, the weight value corresponding to each word can be obtained by a term weighting calculation method.
  • a threshold is preset, the weight values of all words are obtained, and each word whose weight value is greater than or equal to the preset threshold is determined to be a keyword; the keywords are the corresponding semantic information.
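The weighting and thresholding steps can be sketched with one TF-IDF variant. The smoothed IDF formula, the tiny corpus, the English terms, and the 0.5 threshold are all assumptions chosen for illustration; the source does not fix a particular formula or threshold.

```python
import math
from collections import Counter

def tf_idf_weights(terms, corpus):
    """Weight each term of one segmented sentence against a small corpus."""
    counts = Counter(terms)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        tf = count / len(terms)                      # local part
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # global part (smoothed)
        weights[term] = tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values()))  # normalization
    return {t: w / norm for t, w in weights.items()}

def select_keywords(weights, threshold):
    """Keep the terms whose weight reaches the preset threshold."""
    return [t for t, w in weights.items() if w >= threshold]

# Hypothetical segmented sentences standing in for a real corpus.
corpus = [
    ["i", "want", "to", "copy", "a", "file"],
    ["i", "want", "to", "paste", "a", "line"],
    ["i", "want", "to", "create", "a", "class"],
]
weights = tf_idf_weights(corpus[2], corpus)
keywords = select_keywords(weights, threshold=0.5)
print(keywords)
```

Terms that occur in every sentence receive a low weight and fall below the threshold, so only the distinguishing words survive as keywords.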
  • the preset rule may be that the semantic information is subjected to fuzzy search and parsing according to a preset Chinese vocabulary, thereby implementing text recognition of the semantic information.
  • the shortcut key corresponding to the voice information can be determined, and according to the shortcut key, the shortcut operation instruction that the user needs to call can be confirmed, that is, the corresponding voice feedback is realized.
  • the step S103 specifically includes steps S501 to S502.
  • a related professional terminology vocabulary may be created from existing data, which may include words from a professional field and popular everyday words. The preset Chinese vocabulary can also be created according to the needs of the user. For example, according to the shortcut keys and related shortcut operations configured by the system, a terminology vocabulary corresponding to the shortcut keys can be created, and this terminology vocabulary is the preset Chinese vocabulary of the present application.
  • S502 Perform fuzzy matching on the semantic information with the preset Chinese vocabulary to determine a shortcut key that matches the semantic information.
  • a related terminology vocabulary can be created from existing data, and the parsed semantic information is then fuzzily matched against this professional terminology vocabulary to determine the shortcut operation instruction that matches the semantic information. The user is then given feedback on the next operation in professional terms, the corresponding operation is performed, and the result is reported by voice. The feedback language can be customized to improve the user's professionalism, which makes it convenient for users to operate and learn.
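Fuzzy matching of keywords against a terminology vocabulary can be sketched with Python's standard `difflib` module. The English vocabulary entries, the 0.6 cutoff, and the function name `match_shortcut` are assumptions for illustration; the source describes a Chinese vocabulary and does not name a matching algorithm.

```python
import difflib

# Hypothetical terminology vocabulary: term -> shortcut key.
VOCABULARY = {
    "copy": "ctrl+C",
    "paste": "ctrl+V",
    "create class": "alt+C",
}

def match_shortcut(keywords, vocabulary=VOCABULARY, cutoff=0.6):
    """Fuzzy-match the keyword phrase against the vocabulary terms."""
    phrase = " ".join(keywords).lower()
    hits = difflib.get_close_matches(phrase, list(vocabulary), n=1, cutoff=cutoff)
    return (hits[0], vocabulary[hits[0]]) if hits else None

print(match_shortcut(["create", "class"]))
print(match_shortcut(["copi"]))   # a slightly misrecognized word still matches
print(match_shortcut(["xyzzy"]))  # no sufficiently close term
```

The cutoff controls how tolerant the match is: a lower value accepts more recognition errors but risks triggering the wrong shortcut.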
  • the method further comprises the steps of:
  • a shortcut operation corresponding to the input voice information can be implemented.
  • a shortcut key corresponding to the shortcut operation instruction may be displayed, thereby facilitating the user to perform a direct operation.
  • the method provided by the embodiment of the present application can be applied to a common IDE development tool such as Eclipse.
  • when the user does not know the shortcut key for an operation, the user can express the operation in natural language. For example, when the user says “I want to create a Java class of Test”, the voice recognition API, the semantic recognition API, fuzzy search of the Chinese vocabulary, etc. are used to call the shortcut operation instruction corresponding to the shortcut key “alt+C” for creating a class set in the Eclipse IDE development tool, and a Test.java file is created.
  • the shortcut key “alt+C” is displayed on the display screen of the terminal at the same time, to inform the user that the shortcut key for creating a class is “alt+C”, so that the user can use the shortcut key directly next time.
  • a shortcut operation instruction may also be called on a selected target by voice recognition to operate on that target. For example, after selecting a xxx.txt file, the user says “Open File”; the shortcut operation instruction is called, the default file-opening tool opens the file, and the voice feedback “Opened xxx.txt” is returned. After selecting text content, the user says “Copy”; the corresponding “ctrl+C” shortcut key is determined, the shortcut operation instruction corresponding to that shortcut key is called to perform the copy, and the voice feedback “The content has been copied.” is returned. After specifying a text position, the user says “Paste”; the corresponding “ctrl+V” shortcut key is determined, the shortcut operation instruction corresponding to that shortcut key is called to paste the content at the specified position, and the voice feedback “Paste success” can be returned.
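The copy and paste flow in this example can be sketched as a small dispatcher that runs the instruction for a recognized command and returns the shortcut key to display together with the feedback text. The in-memory clipboard, the handler functions, and the "Nothing to paste" message are hypothetical stand-ins; in practice the instruction would be invoked through the host application.

```python
# In-memory stand-in for the system clipboard (an assumption for this sketch).
clipboard = {"content": None}

def do_copy(selection):
    clipboard["content"] = selection
    return "The content has been copied."

def do_paste(_selection):
    return "Paste success" if clipboard["content"] is not None else "Nothing to paste"

# command word -> (shortcut key, shortcut operation instruction)
COMMANDS = {
    "copy": ("ctrl+C", do_copy),
    "paste": ("ctrl+V", do_paste),
}

def handle_command(word, selection=None):
    """Run the instruction for a recognized command; return (shortcut, feedback)."""
    shortcut, instruction = COMMANDS[word.lower()]
    feedback = instruction(selection)
    return shortcut, feedback

print(handle_command("Copy", "hello"))
print(handle_command("Paste"))
```

Returning the shortcut key alongside the feedback lets the caller show the key on screen, so the user can press it directly next time.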
  • the embodiment of the present application can not only simplify the identification of shortcut keys, but also assist in rapid system development and programming, and improve work efficiency.
  • the shortcut key recognition method mainly utilizes intelligent speech recognition and integrates shortcut keys in different IDEs, so that the developer can complete the corresponding shortcut operation after inputting the natural language to the computer.
  • the embodiment of the present application further provides a shortcut key identification apparatus, and the apparatus 100 includes: a reading unit 101, an analysis unit 102, and a determination unit 103.
  • the reading unit 101 is configured to read a configuration file of the system to determine related shortcut keys, wherein each shortcut key has a corresponding shortcut operation instruction.
  • the configuration file of the system may be a configuration file of the development tool IDE, or a configuration file of an office software such as OFFICE.
  • for example, the shortcut key corresponding to “Copy” may be “ctrl+C”, and the “ctrl+C” shortcut key corresponds to a shortcut operation instruction for copying.
  • similarly, the shortcut key corresponding to “Paste” may be “ctrl+V”, and the “ctrl+V” shortcut key corresponds to a shortcut operation instruction for pasting.
  • the analyzing unit 102 is configured to perform voice recognition on the acquired voice information to obtain text information, and perform semantic analysis on the text information to determine corresponding semantic information.
  • the text information corresponding to the voice information can be obtained by performing voice recognition on the voice information sent by the user, and the text information can then be semantically analyzed to obtain the semantic information corresponding to the text information.
  • the analyzing unit 102 specifically includes: a voice recognition unit 201 and a semantic analysis unit 202.
  • the voice recognition unit 201 is configured to perform voice recognition on the acquired voice information to convert the voice information into text information.
  • the obtained voice information may be processed correspondingly through the smart voice API.
  • the smart voice API may be an API for voice recognition.
  • the voice recognition unit 201 specifically includes: a conversion unit 301 , a feature extraction unit 302 , and a construction unit 303 .
  • the converting unit 301 is configured to convert the acquired voice information into pure sound wave information through voice activity detection.
  • voice activity detection (VAD), a voice signal processing technique, is used to cut off the silence at the beginning and end of the acquired voice information. This yields pure sound wave information and reduces noise interference.
  • voice activity detection is mainly used in speech coding and speech recognition. It can simplify speech processing, and can also be used to remove non-speech segments during an audio session, for example to avoid encoding and transmitting silent data packets in IP telephony applications, which saves computation time and bandwidth. In this way, voice activity detection makes a range of applications practical.
  • the feature extraction unit 302 is configured to frame the pure sound wave information by using a moving window function, and perform acoustic feature extraction on the framed pure sound wave information.
  • the moving window function truncates the signal, thereby framing the pure sound wave information.
  • a common acoustic feature extraction method is to extract MFCC (Mel-frequency cepstral coefficient) features: based on the physiological characteristics of the human ear, each frame's waveform is turned into a multi-dimensional vector, which can be understood simply as containing the content information of that frame of speech.
  • after acoustic feature extraction, the pure sound wave information becomes a matrix with 12 rows (assuming the acoustic features are 12-dimensional) and N columns, called the observation sequence, where N is the total number of frames.
  • the constructing unit 303 is configured to construct a state network by using a hidden Markov model, and find a path that best matches the pure sound wave information from the state network to obtain corresponding text information.
  • the constructed state network is obtained by expanding a word-level network into a phoneme network and then into a state network.
  • the speech recognition process is actually searching for an optimal path in the state network.
  • the best path is the one for which the probability of the voice information is the largest; finding it is called “decoding”.
  • the path search uses the Viterbi algorithm, a dynamic programming algorithm with pruning, to find the globally optimal path.
  • several speech frames correspond to one state, every three states are combined into one phoneme, and several phonemes are combined into one word; the sound wave information is thus finally converted into text information through matching against the state network.
  • the semantic analysis unit 202 is configured to perform semantic analysis on the text information by using a natural language processing algorithm to obtain corresponding semantic information.
  • Natural Language Processing is a sub-area of artificial intelligence (AI). Here it uses the dependency relations between the words in a sentence to represent the syntactic structure information of the words (such as subject-predicate, verb-object, and attributive-head relations), and uses a tree structure to represent the structure of the whole sentence (such as subject-predicate-object, attributive-adverbial-complement, etc.).
  • the semantic analysis unit 202 specifically includes: a word segmentation unit 401 , a calculation unit 402 , and an adjustment unit 403 .
  • the word segmentation unit 401 is configured to perform segmentation and part-of-speech tagging on the text information to obtain a plurality of words marked with part of speech.
  • the natural language processing is to enable the computer to understand the human language, that is, to understand the meaning behind the text.
  • word segmentation is the basis of natural language processing; for example, it helps to extract keywords and classify them. Word segmentation is typically based on a conditional random field (CRF), which segments the text using features such as position labels and part of speech, yielding several words, also referred to as terms.
  • the calculating unit 402 is configured to calculate a weight value of each word.
  • a weight can be calculated for each term after the word segmentation, and the important term should be given a higher weight.
  • for example, the term weighting result of "what exercise is more helpful for weight loss" may be: "What 0.1, exercise 0.5, 0.1, weight loss 0.8, help 0.3, larger 0.2".
  • the weighting formula of Term Weighting generally consists of three parts: local, global and normalization.
  • term weighting methods include TF-IDF, Okapi, MI, LTU, ATC, TF-ICF, etc. Combining different formulas for the local, global, and normalization parts yields different term weighting calculation methods. That is, the weight value corresponding to each word can be obtained by a term weighting calculation method.
  • the adjusting unit 403 is configured to determine that a word whose weight value is greater than or equal to a preset threshold is a keyword; the keywords are the corresponding semantic information.
  • a threshold is preset, the weight values of all words are obtained, and each word whose weight value is greater than or equal to the preset threshold is determined to be a keyword; the keywords are the corresponding semantic information.
  • the determining unit 103 is configured to perform text recognition matching on the semantic information according to a preset rule to determine a shortcut operation instruction that matches the semantic information.
  • the determining unit 103 specifically includes: an obtaining unit 501 and a matching unit 502.
  • the obtaining unit 501 is configured to acquire a preset Chinese vocabulary.
  • a related professional terminology vocabulary may be created from existing data, which may include words from a professional field and popular everyday words. The preset Chinese vocabulary can also be created according to the needs of the user. For example, according to the shortcut keys and related shortcut operations configured by the system, a terminology vocabulary corresponding to the shortcut keys can be created, and this terminology vocabulary is the preset Chinese vocabulary of the present application.
  • the matching unit 502 is configured to perform fuzzy matching on the semantic information with the preset Chinese vocabulary to determine a shortcut key that matches the semantic information.
  • by performing fuzzy matching on the semantic information with the preset Chinese vocabulary, text recognition and matching of the semantic information is implemented, thereby determining the shortcut key corresponding to the semantic information. Because each shortcut key has a corresponding shortcut operation instruction, the shortcut operation instruction matching the semantic information can then be determined, so that the subsequent shortcut operation can be carried out.
  • a related terminology vocabulary can be created from existing data, and the parsed semantic information is then fuzzily matched against this professional terminology vocabulary to determine the shortcut operation instruction that matches the semantic information. The user is then given feedback on the next operation in professional terms, the corresponding operation is performed, and the result is reported by voice. The feedback language can be customized to improve the user's professionalism, which makes it convenient for users to operate and learn.
  • the apparatus may further comprise the following units:
  • the running unit 104 is configured to run a shortcut operation instruction corresponding to the shortcut key to implement a corresponding shortcut operation.
  • a shortcut operation corresponding to the input voice information can be implemented.
  • the apparatus may further include a display unit, configured to display the shortcut key corresponding to the shortcut operation instruction after the shortcut operation instruction is invoked, thereby making it easy for the user to perform the operation directly.
  • the method provided by the embodiment of the present application can be applied to a common IDE development tool such as Eclipse.
  • when the user does not know the shortcut key for an operation, the user can express the operation in natural language. For example, when the user says “I want to create a Java class of Test”, the voice recognition API, the semantic recognition API, fuzzy search of the Chinese vocabulary, etc. are used to call the shortcut operation instruction corresponding to the shortcut key “alt+C” for creating a class set in the Eclipse IDE development tool, and a Test.java file is created.
  • the shortcut key “alt+C” is displayed on the display screen of the terminal at the same time, to inform the user that the shortcut key for creating a class is “alt+C”, so that the user can use the shortcut key directly next time.
  • a shortcut operation instruction may also be called on a selected target by voice recognition to operate on that target. For example, after selecting a xxx.txt file, the user says “Open File”; the shortcut operation instruction is called, the default file-opening tool opens the file, and the voice feedback “Opened xxx.txt” is returned. After selecting text content, the user says “Copy”; the corresponding “ctrl+C” shortcut key is determined, the shortcut operation instruction corresponding to that shortcut key is called to perform the copy, and the voice feedback “The content has been copied.” is returned. After specifying a text position, the user says “Paste”; the corresponding “ctrl+V” shortcut key is determined, the shortcut operation instruction corresponding to that shortcut key is called to paste the content at the specified position, and the voice feedback “Paste success” can be returned.
  • FIG. 11 is a schematic structural diagram of a computer device according to the present application.
  • the device may be a terminal or a server, wherein the terminal may be a communication-enabled electronic device such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
  • the server can be a standalone server or a server cluster consisting of multiple servers.
  • the computer device 600 includes a processor 602, a non-volatile storage medium 603, an internal memory 604, and a network interface 605 connected by a system bus 601.
  • the non-volatile storage medium 603 of the computer device 600 can store an operating system 6031 and a computer program 6032.
  • when the computer program 6032 is executed, the processor 602 can be caused to execute a shortcut key identification method.
  • the processor 602 of the computer device 600 is used to provide computing and control capabilities to support the operation of the entire computer device 600.
  • the internal memory 604 provides an environment for running the computer program stored in the non-volatile storage medium; when executed by the processor 602, the computer program causes the processor 602 to perform the shortcut key identification method of the above-described embodiments.
  • The network interface 605 of the computer device 600 is used for network communication, such as sending assigned tasks. It will be understood by those skilled in the art that the embodiment of the computer device shown in FIG.
  • The computer device may include more or fewer components than illustrated, may combine some components, or may arrange the components differently.
  • The computer device may include only a memory and a processor. In such an embodiment, the structure and function of the memory and the processor are the same as in the embodiment shown in FIG. 11 and are not described again here.
  • The present application provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the shortcut key identification method of the above-described embodiments.
  • The storage medium of the present application includes media that can store program code, such as a magnetic disk, an optical disk, and a read-only memory (ROM).
  • The units in all the embodiments of the present application may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
  • The steps of the shortcut key identification method in the embodiments of the present application may be reordered, merged, or deleted according to actual needs.
  • The units in the shortcut key identification terminal may be combined, divided, or deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Disclosed in embodiments of the present application are a shortcut key recognition method and apparatus, a device, and a computer-readable storage medium. The method comprises: reading a configuration file of a system to determine related shortcut keys, wherein a shortcut operation instruction is correspondingly provided for each shortcut key; performing voice recognition on obtained voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information; and performing text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information.

Description

Shortcut key identification method, apparatus, device, and computer-readable storage medium
This application claims priority to Chinese Patent Application No. CN201810191036.0, filed with the Chinese Patent Office on March 8, 2018 and entitled “Shortcut key identification method, apparatus, device, and computer-readable storage medium”, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technologies, and in particular to a shortcut key identification method, apparatus, device, and computer-readable storage medium.
Background
An integrated development environment (IDE) is an application that provides a program development environment. It generally includes tools such as a code editor, a compiler, a debugger, and a graphical user interface, and is a development software suite that integrates code writing, analysis, compilation, and debugging functions. Many IDEs are currently available as development tools, and different IDEs use different shortcut key combinations. For programmers, remembering shortcut keys makes operation quick and convenient, but programming quickly requires memorizing a large number of them. This increases the programmer's workload and may make shortcut operations less fast and less accurate.
Summary of the Invention
Embodiments of the present application provide a shortcut key identification method, apparatus, device, and computer-readable storage medium that simplify the identification of shortcut keys, thereby assisting rapid development and programming and improving work efficiency.
In one aspect, an embodiment of the present application provides a shortcut key identification method, the method including:
reading a configuration file of a system to determine related shortcut keys, wherein each shortcut key is provided with a corresponding shortcut operation instruction; performing voice recognition on acquired voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information; and performing text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information.
In another aspect, an embodiment of the present application further provides a shortcut key identification apparatus, the apparatus including:
a reading unit, configured to read a configuration file of a system to determine related shortcut keys, wherein each shortcut key is provided with a corresponding shortcut operation instruction; an analysis unit, configured to perform voice recognition on acquired voice information to obtain text information, and to perform semantic analysis on the text information to determine corresponding semantic information; and a determining unit, configured to perform text recognition matching on the semantic information according to a preset rule to determine a shortcut operation instruction that matches the semantic information.
In yet another aspect, an embodiment of the present application further provides a computer device, including: a memory configured to store a program for implementing shortcut key identification; and a processor configured to run the program stored in the memory to perform the method described above.
In still another aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the method described above.
Implementing the embodiments of the present application not only simplifies the identification of shortcut keys but also assists rapid system development and programming, improving work efficiency.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a shortcut key identification method provided by an embodiment of the present application;
FIG. 2 is another schematic flowchart of a shortcut key identification method provided by an embodiment of the present application;
FIG. 3 is another schematic flowchart of a shortcut key identification method provided by an embodiment of the present application;
FIG. 4 is another schematic flowchart of a shortcut key identification method provided by an embodiment of the present application;
FIG. 5 is another schematic flowchart of a shortcut key identification method provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of a shortcut key identification apparatus provided by an embodiment of the present application;
FIG. 7 is another schematic block diagram of a shortcut key identification apparatus provided by an embodiment of the present application;
FIG. 8 is another schematic block diagram of a shortcut key identification apparatus provided by an embodiment of the present application;
FIG. 9 is another schematic block diagram of a shortcut key identification apparatus provided by an embodiment of the present application;
FIG. 10 is another schematic block diagram of a shortcut key identification apparatus provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a shortcut key identification method provided by an embodiment of the present application. The method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablet computers, notebook computers, and smart devices. Specifically, the method can be applied in various development tools to assist programming, and can also be applied in office software such as OFFICE, to implement shortcut operations such as creating a class or formatting code. The development tool may be Eclipse, IntelliJ IDEA, or the like. As shown in FIG. 1, the method includes steps S101 to S104.
S101: Read a configuration file of the system to determine related shortcut keys, wherein each shortcut key is provided with a corresponding shortcut operation instruction.
In this embodiment of the present application, the configuration file of the system may be a configuration file of a development tool (IDE) or of office software such as OFFICE. Reading the configuration file determines the shortcut keys recorded in it and the shortcut operation instruction that corresponds to each shortcut key. In general, once one of the shortcut keys is invoked, the corresponding shortcut operation instruction can be obtained and executed, which is how the shortcut key is identified and used. For example, the shortcut key for “copy” may be “ctrl+C”, which corresponds to a shortcut operation instruction for copying, and the shortcut key for “paste” may be “ctrl+V”, which corresponds to a shortcut operation instruction for pasting.
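Step S101 can be pictured with a minimal sketch. The JSON layout, the field names (`shortcuts`, `keys`, `instruction`), and the instruction identifiers such as `edit.copy` are all hypothetical; real IDE or office-software configuration files use their own formats, which the application does not specify.

```python
import json

# Hypothetical configuration content; the layout and names are assumptions.
CONFIG_TEXT = """
{
  "shortcuts": [
    {"name": "copy",  "keys": "ctrl+C", "instruction": "edit.copy"},
    {"name": "paste", "keys": "ctrl+V", "instruction": "edit.paste"}
  ]
}
"""

def load_shortcuts(config_text):
    """Return a mapping from shortcut key to its shortcut operation instruction."""
    config = json.loads(config_text)
    return {entry["keys"]: entry["instruction"] for entry in config["shortcuts"]}

shortcuts = load_shortcuts(CONFIG_TEXT)
print(shortcuts["ctrl+C"])  # edit.copy
```

Keeping the result as a plain key-to-instruction mapping means the later matching step only has to produce a shortcut key in order to find the instruction to run.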
S102: Perform voice recognition on acquired voice information to obtain text information, and perform semantic analysis on the text information to determine corresponding semantic information.
In this embodiment, voice recognition is performed on the voice information uttered by the user to obtain the corresponding text information, and semantic analysis is then performed on the text information to obtain the corresponding semantic information.
Further, as shown in FIG. 2, step S102 specifically includes steps S201 to S202.
S201: Perform voice recognition on the acquired voice information to convert the voice information into text information.
In this embodiment, the acquired voice information may be processed through an intelligent voice API. The intelligent voice API may be an API for voice recognition.
Further, as shown in FIG. 3, step S201 specifically includes steps S301 to S303.
S301: Convert the acquired voice information into pure sound wave information by means of voice activity detection.
In this embodiment, before speech is recognized, voice activity detection (VAD), a speech signal processing technique, is generally used to cut the silence from the beginning and end of the acquired voice information, yielding pure sound wave information and reducing noise interference. VAD is mainly used in speech coding and speech recognition. It simplifies speech processing and can also be used to remove non-speech segments during an audio session; in IP telephony applications, for example, it avoids encoding and transmitting silent packets, saving computation time and bandwidth. VAD has therefore made a range of speech-based applications practical.
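The silence trimming described for S301 can be illustrated with a short sketch. It uses a simple per-frame energy threshold as a stand-in for voice activity detection; the application does not say which VAD algorithm is used, and the frame length and threshold below are arbitrary illustrative values.

```python
def trim_silence(samples, frame_len=4, threshold=0.01):
    """Drop leading and trailing frames whose mean energy is below threshold."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    voiced = [i for i, e in enumerate(energies) if e >= threshold]
    if not voiced:
        return []  # nothing but silence
    first, last = voiced[0], voiced[-1]
    # Keep everything between the first and last voiced frame, inclusive.
    return [x for f in frames[first:last + 1] for x in f]

signal = [0.0] * 8 + [0.5, -0.4, 0.6, -0.5, 0.3, -0.2, 0.4, -0.3] + [0.0] * 8
trimmed = trim_silence(signal)
print(len(signal), len(trimmed))  # 24 8
```

Only the head and tail are trimmed here, matching the description above; pauses inside the utterance are left in place.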
S302: Divide the pure sound wave information into frames using a moving window function, and extract acoustic features from the framed pure sound wave information.
In this embodiment, the moving window function truncates the signal, which divides the pure sound wave information into frames. A common acoustic feature extraction method is to extract Mel-frequency cepstral coefficient (MFCC) features: based on the physiological characteristics of the human ear, the waveform of each frame is turned into a multi-dimensional vector, which can be understood simply as containing the content information of that frame of speech. For example, after acoustic feature extraction, the pure sound wave information becomes a matrix with 12 rows (assuming 12-dimensional acoustic features) and N columns, called the observation sequence, where N is the total number of frames.
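The framing part of S302 might look like the following sketch. The Hamming window and the frame and hop lengths are assumptions chosen for illustration, since the application only says that a moving window function is used. MFCC extraction itself is not shown; each windowed frame would subsequently be mapped to one feature vector, giving the N columns of the observation sequence.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames and apply a Hamming window to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Slide the window by `hop` samples; an incomplete final frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

samples = [1.0] * 10
frames = frame_signal(samples, frame_len=4, hop=2)
print(len(frames))  # 4
```

With a hop smaller than the frame length, adjacent frames overlap, so no part of the signal is represented only by the attenuated edges of a window.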
S303: Construct a state network using a hidden Markov model, and find the path in the state network that best matches the pure sound wave information to obtain the corresponding text information.
In this embodiment, building the state network means expanding a word-level network into a phoneme network and then into a state network. In general, the speech recognition process is a search for the best path through the state network, that is, the path with the highest probability given the voice information; this search is called “decoding”. The path search uses the Viterbi algorithm, a dynamic programming algorithm with pruning, to find the globally optimal path. In short, several frames of speech correspond to one state, every three states combine into one phoneme, and several phonemes combine into one word; by matching against the state network, the sound wave information is finally converted into text information.
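The decoding step can be sketched with a textbook Viterbi search over a toy hidden Markov model. The two states, three observation symbols, and all probabilities below are invented for illustration; a real recognizer would use phoneme states and acoustic likelihoods computed from the feature vectors, and would add pruning (for example a beam), which this sketch omits.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path

states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

prob, path = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
print(path)  # ['A', 'A', 'B']
```

Because each column keeps only the best path into every state, the cost grows linearly with the number of frames rather than exponentially with the number of possible paths.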
S202: Perform semantic analysis on the text information using a natural language processing algorithm to obtain corresponding semantic information.
In this embodiment, natural language processing (NLP) is a subfield of artificial intelligence (AI). The algorithm uses the dependency relations between the words of a sentence to represent the syntactic structure of the words (such as subject-predicate, verb-object, and attributive-head relations) and uses a tree structure to represent the structure of the whole sentence (such as subject, predicate, and object, or attributive, adverbial, and complement). By analyzing the dependency syntactic structure of the user's query, the semantic backbone and related semantic components are extracted, helping an intelligent product understand the user's intent accurately.
Further, as shown in FIG. 4, step S202 specifically includes steps S401 to S403.
S401: Perform word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech.
In this embodiment, the purpose of natural language processing is to let the computer understand human language, that is, the meaning behind the text. Word segmentation is the foundation of natural language processing and helps with tasks such as keyword extraction and text classification. In general, word segmentation is based on a conditional random field: words are segmented using features such as position labels and parts of speech, yielding several words, also called terms.
S402: Calculate a weight value for each word.
In this embodiment, a weight can be calculated for each term obtained from word segmentation, and important terms should be given higher weights. For example, the term weighting result for “What exercise helps more with weight loss” might be: “what 0.1, exercise 0.5, for 0.1, weight loss 0.8, help 0.3, larger 0.2”. A term weighting scoring formula generally consists of three parts: local, global, and normalization. Term weighting methods include TF-IDF, Okapi, MI, LTU, ATC, TF-ICF, and others; different term weighting calculation methods can be generated by combining different local, global, and normalization formulas. The weight value of each word is thus obtained by a term weighting calculation method.
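One way to read the local, global, and normalization parts is the TF-IDF sketch below. The smoothed IDF formula, the tiny corpus, and the English tokens are illustrative assumptions; the application does not say which term weighting variant is used.

```python
import math

def tf_idf(query_terms, corpus):
    """Weight each query term by term frequency times inverse document frequency."""
    weights = {}
    for term in set(query_terms):
        tf = query_terms.count(term) / len(query_terms)   # local part
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1  # global part (smoothed)
        weights[term] = tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values()))  # normalization part
    return {t: w / norm for t, w in weights.items()}

# Hypothetical background corpus of already segmented documents.
corpus = [["what", "is", "this"], ["what", "helps"], ["exercise", "plan"], ["what", "time"]]
query = ["what", "exercise", "helps", "weight", "loss"]
weights = tf_idf(query, corpus)
print(weights["loss"] > weights["what"])  # True
```

Terms that appear in many background documents, such as "what", receive a low global factor, while rarer terms such as "loss" end up with higher weights.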
S403: Determine the words whose weight values are greater than or equal to a preset threshold as keywords; the keywords are the corresponding semantic information.
In this embodiment, a threshold is preset, the weight values of all the words are obtained, and the words whose weight values are greater than or equal to the preset threshold are determined as keywords. The keywords are the corresponding semantic information.
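Applied to the example weights given for S402, the threshold step reduces to a simple filter. The threshold value 0.5 is an assumption chosen for illustration.

```python
def select_keywords(weights, threshold):
    """Keep the terms whose weight is greater than or equal to the threshold."""
    return [term for term, w in weights.items() if w >= threshold]

# Example weights from the term weighting step, rendered in English.
weights = {"what": 0.1, "exercise": 0.5, "for": 0.1,
           "weight loss": 0.8, "help": 0.3, "larger": 0.2}
keywords = select_keywords(weights, threshold=0.5)
print(keywords)  # ['exercise', 'weight loss']
```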
S103: Perform text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information.
In this embodiment, the preset rule may be to perform fuzzy search and parsing on the semantic information against a preset Chinese lexicon, thereby achieving text recognition of the semantic information. In general, after the text recognition is performed, the shortcut key corresponding to the voice information can be determined, and from that shortcut key the shortcut operation instruction the user needs to invoke can be confirmed, so that the corresponding voice feedback can be given.
Further, as shown in FIG. 5, step S103 specifically includes steps S501 to S502.
S501: Obtain a preset Chinese lexicon.
In this embodiment, a lexicon of relevant technical terms can be created from existing data. The existing data may include words from a professional field and common everyday words, and the preset Chinese lexicon can be created according to the user's needs. For example, a lexicon of technical terms corresponding to the shortcut keys can be created from the shortcut keys configured in the system and their associated shortcut operations; this lexicon is the preset Chinese lexicon of the present application.
S502: Perform fuzzy matching between the semantic information and the preset Chinese lexicon to determine a shortcut key that matches the semantic information.
In this embodiment, fuzzy matching between the semantic information and the preset Chinese lexicon achieves the text recognition matching of the semantic information and determines the shortcut key corresponding to it. Because each shortcut key has a corresponding shortcut operation instruction, the shortcut operation instruction that matches the semantic information is determined at the same time, enabling the subsequent shortcut operation.
For example, suppose that during conventional system development a developer wants to copy a piece of code. After the developer selects the content to be copied, the voice information containing “copy” is recognized and “copy” is determined to be the keyword; “copy” is then the semantic information. Fuzzy matching of “copy” against the preset Chinese lexicon determines that “ctrl+C” is the corresponding shortcut key, that is, the shortcut operation instruction corresponding to “ctrl+C” is the one that matches “copy”, and the selected code is copied. More generally, any operation that can be simplified by a shortcut key can be performed without pressing keys, simply by issuing an instruction in natural language. To create a test class, for example, the user only needs to say, in natural language, to create a test class.
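The fuzzy matching of S502 could be approximated as follows. This sketch uses `difflib.get_close_matches` from the Python standard library as the similarity measure, which is an assumption; the application does not name a matching algorithm. The lexicon uses English terms for readability, whereas the application describes a Chinese lexicon.

```python
import difflib

# Hypothetical lexicon: spoken terms mapped to shortcut keys.
LEXICON = {
    "copy": "ctrl+C",
    "paste": "ctrl+V",
}

def match_shortcut(keyword, lexicon, cutoff=0.6):
    """Return the shortcut whose lexicon term is closest to keyword, or None."""
    candidates = difflib.get_close_matches(keyword, list(lexicon), n=1, cutoff=cutoff)
    return lexicon[candidates[0]] if candidates else None

print(match_shortcut("copi", LEXICON))  # ctrl+C
print(match_shortcut("zzzz", LEXICON))  # None
```

The cutoff controls how tolerant the match is: a slightly misrecognized keyword such as "copi" still resolves to "copy", while unrelated input returns no shortcut instead of a wrong one.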
In general, therefore, a lexicon of relevant technical terms can be created from existing data, and the semantics of the parsed language information can be fuzzily matched against that lexicon to determine the shortcut operation instruction that matches the semantic information. The operation about to be executed is then fed back to the user by voice (feedback in technical terms), the operation is executed, and the execution result is fed back. The language used for execution and feedback can be customized, which strengthens the user's professional skills and makes the operations easier to learn.
As a further embodiment, the method further includes the following step:
S104: Run the shortcut operation instruction corresponding to the shortcut key to implement the corresponding shortcut operation.
In this embodiment, running the shortcut operation instruction corresponding to the shortcut key implements the shortcut operation that corresponds to the input voice information. As another embodiment, after the shortcut operation instruction is invoked and run, the shortcut key corresponding to that instruction may also be displayed, making it convenient for the user to perform the operation directly.
For example, take the application of the method to Eclipse, a commonly used IDE. A user who does not know the shortcut key for an operation expresses it in natural language, for example by saying “I want to create a Java class named Test”. Through the voice recognition API, the semantic recognition API, and a fuzzy search of the Chinese lexicon, the shortcut operation instruction corresponding to the class-creation shortcut key “alt+C” configured in Eclipse is invoked, creating a Test.java file. Preferably, the shortcut key “alt+C” is also shown on the terminal's display screen to inform the user that the shortcut key for creating a class is “alt+C”, so that the user can use it directly next time. The user can of course also be prompted by voice playback.
As another example, a shortcut operation instruction can be invoked on a selected target through voice recognition to operate on that target. For instance, with an xxx.txt file selected, the user utters “Open file”; the shortcut operation instruction is invoked, the default file-opening tool opens the file, and the voice response “xxx.txt file opened” can be returned. With text content selected, the user utters “Copy”; the “ctrl+C” shortcut key corresponding to copying is determined, the shortcut operation instruction corresponding to that shortcut key is invoked to perform the copy, and the voice feedback “Content copied” can be given. With a text position specified, the user utters “Paste”; the “ctrl+V” shortcut key corresponding to pasting is determined, the corresponding shortcut operation instruction is invoked to paste the content at the specified position, and the voice feedback “Paste succeeded” can be given.
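Step S104 and the feedback in the examples above can be sketched as a small dispatch table. The instruction table, the `copy_selection` function, and the feedback wording are hypothetical stand-ins for real editor commands and for the voice or on-screen feedback.

```python
clipboard = []

def copy_selection():
    """Hypothetical copy command; a real one would act on the editor selection."""
    clipboard.append("selected text")
    return "Content copied"

def run_shortcut(shortcut, instructions):
    """Run the instruction bound to shortcut and return a feedback message."""
    action = instructions.get(shortcut)
    if action is None:
        return f"No instruction is bound to {shortcut}"
    result = action()
    # Including the shortcut key in the feedback lets the user learn it.
    return f"{result} (shortcut: {shortcut})"

instructions = {"ctrl+C": copy_selection}
feedback = run_shortcut("ctrl+C", instructions)
print(feedback)  # Content copied (shortcut: ctrl+C)
```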
In summary, the embodiments of the present application not only simplify the identification of shortcut keys but also assist rapid system development and programming, improving work efficiency. The shortcut key identification method mainly uses intelligent voice recognition and integrates the shortcut keys of different IDEs, so that a developer can complete the corresponding shortcut operation by speaking natural language to the computer.
Referring to FIG. 6, corresponding to the shortcut key identification method described above, an embodiment of the present application further provides a shortcut key identification apparatus. The apparatus 100 includes a reading unit 101, an analysis unit 102, and a determining unit 103.
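The division into three units can be pictured with the following structural sketch. The class names mirror the units 101 to 103, but their interfaces are assumptions; voice recognition is stubbed out by passing in already recognized text, and exact lookup stands in for the fuzzy matching.

```python
class ReadingUnit:
    """Stands in for reading unit 101: yields shortcut-to-instruction pairs."""
    def __init__(self, config):
        self.config = config

    def read(self):
        return dict(self.config)

class AnalysisUnit:
    """Stands in for analysis unit 102; speech recognition is stubbed out."""
    def __init__(self, known_terms):
        self.known_terms = known_terms

    def analyze(self, recognized_text):
        # Keep only the words that the lexicon knows about.
        return [w for w in recognized_text.lower().split() if w in self.known_terms]

class DeterminingUnit:
    """Stands in for determining unit 103: maps keywords to a shortcut key."""
    def __init__(self, lexicon):
        self.lexicon = lexicon

    def determine(self, keywords):
        for keyword in keywords:
            if keyword in self.lexicon:
                return self.lexicon[keyword]
        return None

lexicon = {"copy": "ctrl+C", "paste": "ctrl+V"}
table = ReadingUnit({"ctrl+C": "edit.copy", "ctrl+V": "edit.paste"}).read()
keywords = AnalysisUnit(lexicon).analyze("Please copy this")
shortcut = DeterminingUnit(lexicon).determine(keywords)
print(shortcut, table[shortcut])  # ctrl+C edit.copy
```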
所述读取单元101,用于读取系统的配置文件,以确定相关的快捷键,其中,每个快捷键均对应设置有相应的快捷操作指令。在本申请实施例中,系统的配置文件可以是开发工具IDE的配置文件,也可以是OFFICE等办公软件的配置文件。通过读取配置文件,即可以确定配置文件中记录有的快捷键以及与该快捷键相匹配的快捷操作指令。一般情况下,在调用了其中一个快捷键后,就可 以获取并执行与该快捷键对应的快捷操作指令,从而实现对快捷键的识别和使用。例如,“复制”所对应的快捷键可以是“ctrl+C”,该“ctrl+C”快捷键对应有关于复制的快捷操作指令,“粘贴”对应的快捷键可以是“ctrl+V”,该“ctrl+V”快捷键对应有关于粘贴的快捷操作指令等。The reading unit 101 is configured to read a configuration file of the system to determine related shortcut keys, wherein each shortcut key is correspondingly provided with a corresponding shortcut operation instruction. In the embodiment of the present application, the configuration file of the system may be a configuration file of the development tool IDE, or a configuration file of an office software such as OFFICE. By reading the configuration file, you can determine the shortcut keys recorded in the configuration file and the shortcut operation instructions that match the shortcut keys. In general, after one of the shortcut keys is called, the shortcut operation instruction corresponding to the shortcut key can be obtained and executed, thereby realizing the recognition and use of the shortcut key. For example, the shortcut key corresponding to “Copy” may be “ctrl+C”, and the “ctrl+C” shortcut key corresponds to a shortcut operation instruction for copying, and the shortcut key corresponding to “Paste” may be “ctrl+V”. The "ctrl+V" shortcut key corresponds to a shortcut operation instruction for pasting.
The analysis unit 102 is configured to perform speech recognition on the acquired voice information to obtain text information, and to perform semantic analysis on the text information to determine the corresponding semantic information. In this embodiment, speech recognition is performed on the voice information uttered by the user to obtain the text information corresponding to that voice information, and semantic analysis is then performed on the text information to obtain the semantic information corresponding to it.
Further, as shown in FIG. 7, the analysis unit 102 specifically includes a speech recognition unit 201 and a semantic analysis unit 202.
The speech recognition unit 201 is configured to perform speech recognition on the acquired voice information so as to convert it into text information. In this embodiment, the acquired voice information may be processed through an intelligent speech API, that is, an API used for speech recognition.
Further, as shown in FIG. 8, the speech recognition unit 201 specifically includes a conversion unit 301, a feature extraction unit 302, and a construction unit 303.
The conversion unit 301 is configured to convert the acquired voice information into pure sound wave information through voice activity detection. In this embodiment, before speech is recognized, a speech signal processing technique called voice activity detection (VAD) is generally used to cut the silence from the beginning and end of the acquired voice information, yielding pure sound wave information and reducing noise interference. Voice activity detection is used mainly in speech coding and speech recognition. It simplifies speech processing and can also remove non-speech segments during an audio session; in IP telephony, for example, it avoids encoding and transmitting silent packets, which saves computation time and bandwidth. Voice activity detection has thus made a range of speech-based applications practical.
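As a rough illustration of trimming leading and trailing silence, the sketch below uses a simple short-time energy threshold. This is a stand-in technique chosen for brevity, not the detector used in the application; the function name, frame length, and threshold are all assumptions.

```python
def trim_silence(samples, frame_len=4, threshold=0.01):
    """Drop leading and trailing frames whose mean energy is below the threshold."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    voiced = [sum(x * x for x in f) / len(f) >= threshold for f in frames]
    if not any(voiced):
        return []  # the whole signal is treated as silence
    first = voiced.index(True)
    last = len(voiced) - 1 - voiced[::-1].index(True)
    # Keep everything between the first and last voiced frame, inclusive.
    return [x for f in frames[first:last + 1] for x in f]


signal = [0.0] * 8 + [0.5, -0.4, 0.3, -0.2] + [0.0] * 8
trimmed = trim_silence(signal)
print(trimmed)
```

Production detectors usually combine energy with spectral features and smoothing, so this only conveys the idea of cutting the silent head and tail.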
The feature extraction unit 302 is configured to divide the pure sound wave information into frames using a moving window function and to extract acoustic features from the framed information. In this embodiment, the moving window function truncates the signal, which is how the pure sound wave information is divided into frames. A common acoustic feature extraction method is to extract MFCC (Mel-frequency cepstral coefficient) features: based on the physiological characteristics of the human ear, the waveform of each frame is turned into a multi-dimensional vector, which can be understood simply as a vector containing the content information of that frame of speech. For example, after acoustic feature extraction the pure sound wave information becomes a matrix of 12 rows (assuming 12-dimensional acoustic features) and N columns, called the observation sequence, where N is the total number of frames.
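The framing step can be sketched as follows. The Hamming window, frame length, and hop size are assumptions, since the application only refers to a moving window function; the sketch covers framing only and does not compute MFCC features.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames and apply a Hamming window to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Slide the window along the signal; incomplete tail frames are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


samples = [1.0] * 10
frames = frame_signal(samples, frame_len=4, hop=2)
print(len(frames), "frames of length", len(frames[0]))
```

Each resulting frame would then be passed to a feature extractor, giving one feature vector per frame and therefore one column of the observation sequence.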
The construction unit 303 is configured to build a state network using a hidden Markov model and to search the state network for the path that best matches the pure sound wave information, so as to obtain the corresponding text information. In this embodiment, building the state network means expanding a word-level network into a phoneme network and then expanding that into a state network. Speech recognition is essentially a search for the best path through the state network, namely the path with the highest probability for the given voice information; this search is called “decoding”. The path search algorithm is the Viterbi algorithm, a dynamic programming algorithm with pruning that is used to find the globally optimal path. In short, several frames of speech correspond to one state, every three states make up one phoneme, and several phonemes make up one word; matching against the state network finally converts the sound wave information into text information.
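The best-path search can be illustrated with a small Viterbi sketch over a toy hidden Markov model. The two states, three observation symbols, and all probabilities are invented for illustration; a real recognizer works on acoustic feature vectors, uses log probabilities, and prunes unlikely paths, none of which is shown here.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy model: hypothetical states "A" and "B" emitting symbols "x", "y", "z".
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

path = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
print(path)
```

Because every state keeps only its single best predecessor at each time step, the search cost grows linearly with the number of frames rather than exponentially.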
The semantic analysis unit 202 is configured to perform semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information. In this embodiment, natural language processing (NLP) is a subfield of artificial intelligence (AI). The analysis uses the dependency relations between the words of a sentence to represent syntactic structure information (such as subject-predicate, verb-object, and attributive-head relations) and uses a tree structure to represent the structure of the whole sentence (such as subject, predicate, and object, or attributive, adverbial, and complement). By analyzing the dependency syntax of the user's query, the semantic backbone and related semantic components are extracted, which helps an intelligent product understand the user's intent accurately.
Further, as shown in FIG. 9, the semantic analysis unit 202 specifically includes a word segmentation unit 401, a calculation unit 402, and an adjusting unit 403.
The word segmentation unit 401 is configured to perform word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech. In this embodiment, the purpose of natural language processing is to let the computer understand human language, that is, the meaning behind the text. Word segmentation is the foundation of natural language processing and supports tasks such as keyword extraction and text classification. In general, segmentation is based on conditional random fields: the text is segmented using features such as the position labels and parts of speech of words, yielding a number of words, also called terms.
The calculation unit 402 is configured to calculate a weight value for each word. In this embodiment, a weight is calculated for each term produced by word segmentation, and important terms should receive higher weights. For example, the term weighting result for the query “what exercise helps more with weight loss” might be: “what 0.1, exercise 0.5, for 0.1, weight loss 0.8, help 0.3, larger 0.2”. A term weighting formula generally consists of three parts: local, global, and normalization. Term weighting methods include TF-IDF, Okapi, MI, LTU, ATC, TF-ICF, and others; combining different local, global, and normalization formulas produces different term weighting methods. The weight value of each word can thus be obtained with a term weighting method.
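One way to combine a local, a global, and a normalization part is a TF-IDF style weight, sketched below. The smoothed IDF formula, the sum-to-one normalization, and the tiny corpus are assumptions chosen for illustration, not the formula used in the application.

```python
import math
from collections import Counter


def tf_idf(query_terms, corpus):
    """Weight each query term by term frequency times smoothed inverse document frequency."""
    counts = Counter(query_terms)
    weights = {}
    for term, count in counts.items():
        tf = count / len(query_terms)                     # local part
        df = sum(1 for doc in corpus if term in doc)      # documents containing the term
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1  # global part (smoothed)
        weights[term] = tf * idf
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}     # normalization part


# Hypothetical corpus of already segmented commands.
corpus = [["open", "the", "file"], ["copy", "the", "text"], ["paste", "the", "text"]]
weights = tf_idf(["copy", "the", "text"], corpus)
print(weights)
```

In this toy corpus the rare term "copy" receives the largest weight and the ubiquitous "the" the smallest, which is the behavior the paragraph above asks of a term weighting method.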
The adjusting unit 403 is configured to determine the words whose weight values are greater than or equal to a preset threshold as keywords; the keywords are the corresponding semantic information. In this embodiment, a threshold is preset, the weight values of all words are obtained, and each word whose weight value is greater than or equal to the preset threshold is determined to be a keyword.
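The thresholding step is small enough to show directly. The weights below are the example values given for the weight-loss query; the threshold of 0.5 and the function name are assumptions.

```python
def select_keywords(weights, threshold):
    """Return the terms whose weight is greater than or equal to the threshold."""
    return [term for term, weight in weights.items() if weight >= threshold]


term_weights = {
    "what": 0.1, "exercise": 0.5, "for": 0.1,
    "weight loss": 0.8, "help": 0.3, "larger": 0.2,
}
keywords = select_keywords(term_weights, threshold=0.5)
print(keywords)
```

With this threshold the terms "exercise" and "weight loss" are kept as the semantic information, and the comparison is inclusive, matching "greater than or equal to".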
The determining unit 103 is configured to perform text recognition matching on the semantic information according to a preset rule to determine the shortcut operation instruction that matches the semantic information.
Further, as shown in FIG. 10, the determining unit 103 specifically includes an obtaining unit 501 and a matching unit 502.
The obtaining unit 501 is configured to obtain a preset Chinese lexicon. In this embodiment, a lexicon of relevant technical terms can be created from existing data, which may include both terms from a professional field and everyday colloquial words; the preset Chinese lexicon can also be created according to the user's needs. For example, a lexicon of technical terms corresponding to the shortcut keys can be created from the shortcut keys configured in the system and their shortcut operations; this lexicon is the preset Chinese lexicon of the present application.
The matching unit 502 is configured to fuzzily match the semantic information against the preset Chinese lexicon to determine the shortcut key that matches the semantic information. In this embodiment, the fuzzy matching implements the text recognition matching of the semantic information and thereby determines the shortcut key corresponding to it. Because each shortcut key has a corresponding shortcut operation instruction, the shortcut operation instruction that matches the semantic information is determined at the same point, enabling the subsequent shortcut operation.
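Fuzzy matching against a lexicon can be sketched with Python's standard `difflib` module. The application does not name a matching algorithm, so the similarity measure, the cutoff of 0.6, and the English lexicon entries are assumptions; the same call works on Chinese strings.

```python
import difflib

# Hypothetical lexicon mapping terms to shortcut keys.
LEXICON = {"copy": "ctrl+C", "paste": "ctrl+V", "create class": "alt+C"}


def match_shortcut(keyword, lexicon, cutoff=0.6):
    """Return the shortcut key of the lexicon term closest to keyword, or None."""
    candidates = difflib.get_close_matches(keyword, list(lexicon), n=1, cutoff=cutoff)
    return lexicon[candidates[0]] if candidates else None


print(match_shortcut("copying", LEXICON))
print(match_shortcut("zzz", LEXICON))
```

Returning `None` when nothing clears the cutoff lets the caller ask the user to repeat the instruction instead of running an unrelated shortcut.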
For example, in a conventional system development process, a developer who wants to copy a piece of code selects the content to be copied and says something containing “copy”. After this voice information is recognized, “copy” is determined to be the keyword and therefore the semantic information. Fuzzy matching “copy” against the preset Chinese lexicon determines that “ctrl+C” is the corresponding shortcut key, so the shortcut operation instruction corresponding to “ctrl+C” is the one that matches “copy”, and the selected code is copied. More generally, any operation that a shortcut key can simplify can be performed without pressing keys, simply by issuing an instruction in natural language. For example, to create a test class, the user only needs to say in natural language that a test class should be created.
In general, therefore, a lexicon of relevant technical terms can be created from existing data, and the semantics of the parsed voice information can be fuzzily matched against that lexicon to determine the shortcut operation instruction that matches the semantic information. The user is then told by voice which operation is about to be performed (feedback in technical terms), the operation is performed, and the result is reported by voice. The wording used for execution and feedback can be customized, which builds the user's professional vocabulary and makes the operations easier to learn.
As a further embodiment, the apparatus may also include the following unit:
A running unit 104, configured to run the shortcut operation instruction corresponding to the shortcut key so as to perform the corresponding shortcut operation. In this embodiment, running the shortcut operation instruction corresponding to the shortcut key performs the shortcut operation corresponding to the input voice information. As another further embodiment, the apparatus may also include a display unit configured to display, after the shortcut operation instruction has been invoked and run, the shortcut key corresponding to that instruction, so that the user can use it directly.
For example, the method provided by the embodiments of the present application can be applied to a common IDE such as Eclipse. A user who does not know the shortcut key for an operation expresses the operation in natural language, for example by saying “I want to create a Java class called Test”. Through the speech recognition API, the semantic recognition API, and a fuzzy search of the Chinese lexicon, the shortcut operation instruction corresponding to “alt+C”, the shortcut key set in Eclipse for creating a class, is invoked, and a Test.java file is created. Preferably, the shortcut key “alt+C” is also shown on the terminal's display to tell the user that this is the shortcut key for creating a class, so that the user can use it directly next time. The prompt can of course also be given by voice playback.
As another example, shortcut operation instructions can be invoked by speech recognition to operate on a selected target. If the user selects a file xxx.txt and says “open file”, the shortcut operation instruction is invoked, the default file-opening tool opens the file, and the voice response “xxx.txt has been opened” can be returned. If the user selects some text and says “copy”, the “ctrl+C” shortcut key corresponding to copying is determined, the shortcut operation instruction corresponding to it is invoked to perform the copy, and the voice feedback “content copied” can be given. If the user specifies a position in the text and says “paste”, the “ctrl+V” shortcut key corresponding to pasting is determined, the shortcut operation instruction corresponding to it is invoked to paste the content at that position, and the voice feedback “paste succeeded” can be given.
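The run-and-report behavior in these examples can be sketched as a dispatch table from shortcut keys to actions that return a feedback message. The in-memory clipboard, the function names, and the message wording are assumptions; a real implementation would call the host application and a text-to-speech service.

```python
def make_dispatcher():
    """Build a mapping from shortcut keys to actions that return feedback text."""
    clipboard = {"text": ""}

    def copy(selection):
        clipboard["text"] = selection
        return "Content copied"

    def paste(_selection):
        return "Pasted: " + clipboard["text"]

    return {"ctrl+C": copy, "ctrl+V": paste}


def run_shortcut(dispatcher, shortcut, selection=""):
    """Run the action bound to shortcut and return the message to speak or display."""
    action = dispatcher.get(shortcut)
    if action is None:
        return "No shortcut operation is bound to " + shortcut
    return action(selection)


dispatcher = make_dispatcher()
feedback = [
    run_shortcut(dispatcher, "ctrl+C", "print('hi')"),
    run_shortcut(dispatcher, "ctrl+V"),
]
print(feedback)
```

Returning the feedback as text keeps the action separate from how the result is presented, so the same message can be spoken, displayed, or both.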
The shortcut key recognition apparatus described above can be implemented as a computer program that runs on a computer device such as the one shown in FIG. 11. FIG. 11 is a schematic structural diagram of a computer device according to the present application. The device may be a terminal or a server. The terminal may be an electronic device with a communication function, such as a smartphone, tablet computer, notebook computer, desktop computer, personal digital assistant, or wearable device. The server may be a standalone server or a cluster of multiple servers. Referring to FIG. 11, the computer device 600 includes a processor 602, a non-volatile storage medium 603, an internal memory 604, and a network interface 605, connected by a system bus 601. The non-volatile storage medium 603 can store an operating system 6031 and a computer program 6032; when executed, the computer program 6032 causes the processor 602 to perform a shortcut key recognition method. The processor 602 provides computing and control capabilities and supports the operation of the entire computer device 600. The internal memory 604 provides the environment in which the computer program in the non-volatile storage medium runs; when executed by the processor, the computer program causes the processor 602 to perform the shortcut key recognition method of the above embodiments. The network interface 605 is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the embodiment shown in FIG. 11 does not limit the specific configuration of the computer device. In other embodiments, the computer device may include more or fewer components than illustrated, combine certain components, or arrange the components differently. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments the structure and function of the memory and the processor are the same as in the embodiment shown in FIG. 11 and are not described again here.
The present application provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the shortcut key recognition method of the above embodiments.
The storage medium mentioned above includes any medium that can store program code, such as a magnetic disk, an optical disc, or a read-only memory (ROM). The units in all embodiments of the present application may be implemented by a general-purpose integrated circuit such as a CPU (Central Processing Unit) or by an ASIC (Application Specific Integrated Circuit). The steps of the shortcut key recognition method in the embodiments may be reordered, combined, or deleted according to actual needs. The units of the shortcut key recognition terminal in the embodiments may be combined, divided, or deleted according to actual needs.
The above are only specific embodiments of the present application, but the scope of protection of the present application is not limited to them. Any person skilled in the art can readily conceive of equivalent modifications or substitutions within the technical scope disclosed in the present application, and such modifications or substitutions shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be determined by the scope of protection of the claims.

Claims (20)

  1. A shortcut key recognition method, characterized in that the method comprises:
    reading a configuration file of a system to determine the relevant shortcut keys, wherein each shortcut key is associated with a corresponding shortcut operation instruction;
    performing speech recognition on acquired voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information;
    performing text recognition matching on the semantic information according to a preset rule to determine the shortcut key that matches the semantic information.
  2. The method according to claim 1, characterized in that performing speech recognition on the acquired voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information, comprises:
    performing speech recognition on the acquired voice information to convert the voice information into text information;
    performing semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information.
  3. The method according to claim 2, characterized in that performing speech recognition on the acquired voice information to convert the voice information into text information comprises:
    converting the acquired voice information into pure sound wave information through voice activity detection;
    dividing the pure sound wave information into frames using a moving window function, and extracting acoustic features from the framed pure sound wave information;
    building a state network using a hidden Markov model, and searching the state network for the path that best matches the pure sound wave information to obtain the corresponding text information.
  4. The method according to claim 2, characterized in that performing semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information comprises:
    performing word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech;
    calculating a weight value for each word;
    determining the words whose weight values are greater than or equal to a preset threshold as keywords, the keywords being the corresponding semantic information.
  5. The method according to claim 1, characterized in that performing text recognition matching on the semantic information according to a preset rule to determine the shortcut key that matches the semantic information comprises:
    obtaining a preset Chinese lexicon;
    fuzzily matching the semantic information against the preset Chinese lexicon to determine the shortcut key that matches the semantic information.
  6. A shortcut key recognition apparatus, characterized in that the apparatus comprises:
    a reading unit, configured to read a configuration file of a system to determine the relevant shortcut keys, wherein each shortcut key is associated with a corresponding shortcut operation instruction;
    an analysis unit, configured to perform speech recognition on acquired voice information to obtain text information, and to perform semantic analysis on the text information to determine corresponding semantic information;
    a determining unit, configured to perform text recognition matching on the semantic information according to a preset rule to determine the shortcut operation instruction that matches the semantic information.
  7. The apparatus according to claim 6, characterized in that the analysis unit comprises:
    a speech recognition unit, configured to perform speech recognition on the acquired voice information to convert the voice information into text information;
    a semantic analysis unit, configured to perform semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information.
  8. The apparatus according to claim 7, characterized in that the speech recognition unit comprises:
    a conversion unit, configured to convert the acquired voice information into pure sound wave information through voice activity detection;
    a feature extraction unit, configured to divide the pure sound wave information into frames using a moving window function, and to extract acoustic features from the framed pure sound wave information;
    a construction unit, configured to build a state network using a hidden Markov model, and to search the state network for the path that best matches the pure sound wave information to obtain the corresponding text information.
  9. The apparatus according to claim 7, characterized in that the semantic analysis unit comprises:
    a word segmentation unit, configured to perform word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech;
    a calculation unit, configured to calculate a weight value for each word;
    an adjusting unit, configured to determine the words whose weight values are greater than or equal to a preset threshold as keywords, the keywords being the corresponding semantic information.
  10. The apparatus according to claim 6, characterized in that the determining unit comprises:
    an obtaining unit, configured to obtain a preset Chinese lexicon;
    a matching unit, configured to fuzzily match the semantic information against the preset Chinese lexicon to determine the shortcut key that matches the semantic information.
  11. A computer device, characterized in that it comprises:
    a memory, configured to store a program that implements shortcut key recognition; and
    a processor, configured to run the program stored in the memory that implements shortcut key recognition, so as to perform the following operations:
    reading a configuration file of a system to determine the relevant shortcut keys, wherein each shortcut key is associated with a corresponding shortcut operation instruction;
    performing speech recognition on acquired voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information;
    performing text recognition matching on the semantic information according to a preset rule to determine the shortcut key that matches the semantic information.
  12. The device according to claim 11, characterized in that performing speech recognition on the acquired voice information to obtain text information, and performing semantic analysis on the text information to determine corresponding semantic information, comprises:
    performing speech recognition on the acquired voice information to convert the voice information into text information;
    performing semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information.
  13. The device according to claim 12, characterized in that performing speech recognition on the acquired voice information to convert the voice information into text information comprises:
    converting the acquired voice information into pure sound wave information through voice activity detection;
    dividing the pure sound wave information into frames using a moving window function, and extracting acoustic features from the framed pure sound wave information;
    building a state network using a hidden Markov model, and searching the state network for the path that best matches the pure sound wave information to obtain the corresponding text information.
  14. The device according to claim 12, characterized in that performing semantic analysis on the text information through a natural language processing algorithm to obtain the corresponding semantic information comprises:
    performing word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech;
    calculating a weight value for each word;
    determining the words whose weight values are greater than or equal to a preset threshold as keywords, the keywords being the corresponding semantic information.
  15. The device according to claim 11, wherein performing text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information comprises:
    obtaining a preset Chinese lexicon;
    performing fuzzy matching between the semantic information and the preset Chinese lexicon to determine the shortcut key that matches the semantic information.
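The fuzzy matching in claim 15 can be sketched with Python's standard `difflib` module. The claim does not specify a similarity measure or the layout of the lexicon, so the `SequenceMatcher` ratio, the 0.6 cutoff, the `LEXICON` contents, and the function name `match_shortcut` are all assumptions for illustration.

```python
import difflib

# Hypothetical preset Chinese lexicon mapping command words to shortcut keys.
LEXICON = {
    "保存": "Ctrl+S",  # save
    "复制": "Ctrl+C",  # copy
    "粘贴": "Ctrl+V",  # paste
    "撤销": "Ctrl+Z",  # undo
}


def match_shortcut(keywords, lexicon, cutoff=0.6):
    """Return the shortcut whose lexicon entry is most similar to any keyword.

    Similarity is difflib's SequenceMatcher ratio (an assumed measure).
    Returns None when no entry reaches the cutoff.
    """
    best_shortcut, best_score = None, 0.0
    for keyword in keywords:
        for entry, shortcut in lexicon.items():
            score = difflib.SequenceMatcher(None, keyword, entry).ratio()
            if score >= cutoff and score > best_score:
                best_shortcut, best_score = shortcut, score
    return best_shortcut
```

With this lexicon, `match_shortcut(["文件", "保存"], LEXICON)` returns `"Ctrl+S"` through an exact match on the second keyword, and an inexact keyword such as `"保存吧"` still resolves to the same shortcut because its similarity to `"保存"` is 0.8.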
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, the one or more programs being executable by one or more processors to implement the following steps:
    reading a configuration file of a system to determine the related shortcut keys, wherein each shortcut key is associated with a corresponding shortcut operation instruction;
    obtaining text information by performing speech recognition on the acquired voice information, and performing semantic analysis on the text information to determine the corresponding semantic information;
    performing text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information.
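The first step of claim 16, reading a configuration file to find the shortcut keys and their operation instructions, can be sketched as below. The claim does not describe the file format, so the INI layout, the `[shortcuts]` section name, the instruction names, and the function name `load_shortcuts` are hypothetical.

```python
import configparser

# Hypothetical configuration content: one "shortcut = instruction" pair per line.
SAMPLE_CONFIG = """
[shortcuts]
Ctrl+S = save_document
Ctrl+C = copy_selection
Ctrl+V = paste_clipboard
"""


def load_shortcuts(config_text):
    """Parse INI-style text and return {shortcut key: operation instruction}."""
    parser = configparser.ConfigParser()
    # By default configparser lowercases option names; keep "Ctrl+S" as written.
    parser.optionxform = str
    parser.read_string(config_text)
    return dict(parser["shortcuts"])


shortcuts = load_shortcuts(SAMPLE_CONFIG)
```

The resulting dictionary gives the later matching step a fixed set of candidate shortcut keys, each already tied to the instruction it should trigger.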
  17. The computer-readable storage medium according to claim 16, wherein obtaining text information by performing speech recognition on the acquired voice information, and performing semantic analysis on the text information to determine the corresponding semantic information, comprises:
    performing speech recognition on the acquired voice information to convert the voice information into text information;
    performing semantic analysis on the text information by a natural language processing algorithm to obtain the corresponding semantic information.
  18. The computer-readable storage medium according to claim 17, wherein performing speech recognition on the acquired voice information to convert the voice information into text information comprises:
    converting the acquired voice information into pure sound wave information through voice activity detection;
    framing the pure sound wave information using a moving window function, and extracting acoustic features from the framed pure sound wave information;
    constructing a state network using a hidden Markov model, and searching the state network for the path that best matches the pure sound wave information, to obtain the corresponding text information.
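The last step of claim 18, finding the best-matching path through a state network built from a hidden Markov model, is commonly done with the Viterbi algorithm. The claim does not name the search algorithm, so Viterbi is an assumption here, and the two-state model, its probabilities, and the observation labels `f1`, `f2`, `f3` are toy values invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations.

    best[t][s] holds (probability of the best path ending in state s
    at time t, the predecessor state on that path).
    """
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model: two phone-like states and three quantized acoustic labels.
STATES = ("b", "ao")
START_P = {"b": 0.6, "ao": 0.4}
TRANS_P = {"b": {"b": 0.7, "ao": 0.3}, "ao": {"b": 0.4, "ao": 0.6}}
EMIT_P = {
    "b": {"f1": 0.5, "f2": 0.4, "f3": 0.1},
    "ao": {"f1": 0.1, "f2": 0.3, "f3": 0.6},
}

path = viterbi(["f1", "f2", "f3"], STATES, START_P, TRANS_P, EMIT_P)  # ['b', 'b', 'ao']
```

The sketch multiplies raw probabilities for readability. On utterances of realistic length this underflows, so practical decoders usually sum log probabilities instead.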
  19. The computer-readable storage medium according to claim 17, wherein performing semantic analysis on the text information by the natural language processing algorithm to obtain the corresponding semantic information comprises:
    performing word segmentation and part-of-speech tagging on the text information to obtain a plurality of words tagged with parts of speech;
    calculating a weight value for each word;
    determining the words whose weight value is greater than or equal to a preset threshold as keywords, the keywords being the corresponding semantic information.
  20. The computer-readable storage medium according to claim 16, wherein performing text recognition matching on the semantic information according to a preset rule to determine a shortcut key that matches the semantic information comprises:
    obtaining a preset Chinese lexicon;
    performing fuzzy matching between the semantic information and the preset Chinese lexicon to determine the shortcut key that matches the semantic information.
PCT/CN2018/085255 2018-03-08 2018-05-02 Shortcut key recognition method and apparatus, device, and computer-readable storage medium WO2019169722A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810191036.0A CN108491379A (en) 2018-03-08 2018-03-08 Shortcut key recognition methods, device, equipment and computer readable storage medium
CN201810191036.0 2018-03-08

Publications (1)

Publication Number Publication Date
WO2019169722A1 (en) 2019-09-12

Family

ID=63338223

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/085255 WO2019169722A1 (en) 2018-03-08 2018-05-02 Shortcut key recognition method and apparatus, device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN108491379A (en)
WO (1) WO2019169722A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109491264B (en) * 2018-12-24 2022-01-21 青岛海信智慧家居系统股份有限公司 Household equipment control method and device
CN112819061B (en) * 2021-01-27 2024-05-10 北京小米移动软件有限公司 Password information identification method, device, equipment and storage medium
CN112908327A (en) * 2021-02-02 2021-06-04 上海市胸科医院 Voice control method, device, equipment and storage medium of application program
CN114138227A (en) * 2021-12-08 2022-03-04 江西台德智慧科技有限公司 Intelligent voice shortcut key input method and intelligent voice shortcut key input system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1641563A (en) * 2004-01-09 2005-07-20 顺德市顺达电脑厂有限公司 Computer device voice rapid control device and method thereof
US20100229116A1 (en) * 2009-03-05 2010-09-09 Denso Corporation Control aparatus
CN103294370A (en) * 2012-03-05 2013-09-11 北京千橡网景科技发展有限公司 Method and equipment for triggering keystroke operation
CN103329196A (en) * 2011-05-20 2013-09-25 三菱电机株式会社 Information apparatus
CN104750257A (en) * 2013-12-30 2015-07-01 鸿富锦精密工业(武汉)有限公司 Keyboard combination and voice recognition method
CN105183778A (en) * 2015-08-11 2015-12-23 百度在线网络技术(北京)有限公司 Service providing method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622085A (en) * 2012-04-11 2012-08-01 北京航空航天大学 Multidimensional sense man-machine interaction system and method
KR20140000003A (en) * 2012-06-22 2014-01-02 송병식 A combination of keyboard shortcuts run of voice
CN106710593B (en) * 2015-11-17 2020-07-14 腾讯科技(深圳)有限公司 Method, terminal and server for adding account
CN107679033B (en) * 2017-09-11 2021-12-14 百度在线网络技术(北京)有限公司 Text sentence break position identification method and device
CN107665705B (en) * 2017-09-20 2020-04-21 平安科技(深圳)有限公司 Voice keyword recognition method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN108491379A (en) 2018-09-04

Similar Documents

Publication Publication Date Title
WO2021093449A1 (en) Wakeup word detection method and apparatus employing artificial intelligence, device, and medium
JP5099953B2 (en) Generation of unified task-dependent language model by information retrieval method
US10614803B2 (en) Wake-on-voice method, terminal and storage medium
KR101066741B1 (en) Semantic object synchronous understanding for highly interactive interface
Reddy et al. Speech to text conversion using android platform
KR101042119B1 (en) Semantic object synchronous understanding implemented with speech application language tags
US9805718B2 (en) Clarifying natural language input using targeted questions
WO2019169722A1 (en) Shortcut key recognition method and apparatus, device, and computer-readable storage medium
WO2019001194A1 (en) Voice recognition method, device, apparatus, and storage medium
KR20030078388A (en) Apparatus for providing information using voice dialogue interface and method thereof
JP2002116796A (en) Voice processor and method for voice processing and storage medium
WO2020238045A1 (en) Intelligent speech recognition method and apparatus, and computer-readable storage medium
JP2006053906A (en) Efficient multi-modal method for providing input to computing device
US10565982B2 (en) Training data optimization in a service computing system for voice enablement of applications
US11532301B1 (en) Natural language processing
WO2021051564A1 (en) Speech recognition method, apparatus, computing device and storage medium
Kumar et al. Enabling the rapid development and adoption of speech-user interfaces
KR20190115405A (en) Search method and electronic device using the method
US11626107B1 (en) Natural language processing
JP4653598B2 (en) Syntax / semantic analysis device, speech recognition device, and syntax / semantic analysis program
Stefanovic et al. Voice control system with advanced recognition
Kepuska et al. Speech corpus generation from DVDs of movies and tv series
JP3691773B2 (en) Sentence analysis method and sentence analysis apparatus capable of using the method
JPWO2009041220A1 (en) Abbreviation generation apparatus and program, and abbreviation generation method
CN111104118A (en) AIML-based natural language instruction execution method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18908646

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11.12.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18908646

Country of ref document: EP

Kind code of ref document: A1