CN111599350B - Command word customization identification method and system - Google Patents

Info

Publication number
CN111599350B
CN111599350B CN202010266075.XA CN202010266075A CN111599350B CN 111599350 B CN111599350 B CN 111599350B CN 202010266075 A CN202010266075 A CN 202010266075A CN 111599350 B CN111599350 B CN 111599350B
Authority
CN
China
Prior art keywords
training
command word
model
project
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010266075.XA
Other languages
Chinese (zh)
Other versions
CN111599350A (en)
Inventor
许东星
曹昊
周雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd, Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority to CN202010266075.XA
Publication of CN111599350A
Application granted
Publication of CN111599350B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0635 Training updating or merging of old and new templates; Mean values; Weighting

Abstract

The invention provides a command word customized identification method and system. The method comprises the following steps. Step 1: receiving an input project requirement, analyzing a project command word list based on the project requirement, and generating a project data acquisition task. Step 2: issuing a training data acquisition task through an online task platform. Step 3: acquiring test data in a preset scene through a recording device based on the test data acquisition task. Step 4: generating a command word voice recognition model according to the project command word list, the training data and the test data based on an automatic model training platform. Step 5: adding the command word voice recognition model into a version management tool, and constructing an engine through Jenkins. By collecting training data (voice) through the online task platform and performing data simulation, the method greatly reduces the cost and period of data collection while ensuring the performance of command word recognition.

Description

Command word customization identification method and system
Technical Field
The invention relates to the technical field of voice recognition, in particular to a command word customization recognition method and system.
Background
At present, offline command word detection systems are generally used for voice recognition of a fixed, limited set of command words. The generic model of such a system often has difficulty achieving good performance because of differences in user age, accent and the like. To address these problems, a model customization method is needed for different projects.
Traditional model customization not only requires collecting and labeling a large amount of real-scene speech, but also requires extensive manual involvement in adjusting and optimizing parameters during model customization and release. The project cycle is therefore long, and labor and material costs increase greatly.
Disclosure of Invention
The invention provides a command word customization recognition method that collects training data (voice) through an online task platform and performs data simulation, thereby greatly reducing the cost and period of data collection while ensuring the performance of command word recognition.
The embodiment of the invention provides a command word customization identification method, which comprises the following steps:
step 1: receiving an input project requirement, analyzing a project command word list based on the project requirement, and generating a project data acquisition task; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
step 2: issuing a training data acquisition task through an online task platform, and receiving training data uploaded based on the training data acquisition task through the online task platform;
step 3: acquiring test data in a preset scene through a recording device based on the test data acquisition task;
step 4: generating a command word voice recognition model according to the project command word list, the training data and the test data based on an automatic model training platform;
step 5: adding the command word voice recognition model into a version management tool, and constructing an engine through Jenkins.
Preferably, the generating of the command word speech recognition model according to the project command vocabulary, the training data and the test data specifically includes:
configuring training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of a plurality of training groups after data enhancement, training the deep neural network model by adjusting parameter configuration, and obtaining a plurality of initial models; the training groups correspond to the initial models one by one;
performing model evaluation on each initial model by adopting a test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from the plurality of initial models as a command word voice recognition model;
and outputting the command word voice recognition model and issuing the evaluation report.
Preferably, the data enhancement method includes one or more of loading noise, increasing reverberation, and increasing or decreasing speech rate.
Preferably, the preset scenario includes: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
Preferably, the training data comprises close-talking quiet speech without background noise.
The invention also provides a command word customization recognition system, which comprises:
the task generation module is used for receiving the input project requirement, analyzing a project command word list based on the project requirement and generating a project data acquisition task; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
the training data acquisition module is used for issuing a training data acquisition task through the online task platform and receiving training data uploaded based on the training data acquisition task through the online task platform;
the test data acquisition module is used for acquiring test data in a preset scene through the recording equipment based on the test data acquisition task;
the model generation module is used for generating a command word voice recognition model according to the project command word list, the training data and the test data based on the automatic model training platform;
and the engine generation module is used for adding the command word voice recognition model into the version management tool and constructing an engine through Jenkins.
Preferably, the model generation module specifically operates to:
configuring training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of a plurality of training groups after data enhancement, training the deep neural network model by adjusting parameter configuration, and obtaining a plurality of initial models; the training groups correspond to the initial models one by one;
sequentially carrying out model evaluation on each initial model by adopting the test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from the plurality of initial models as a command word voice recognition model;
and outputting a command word voice recognition model and issuing an evaluation report.
Preferably, the data enhancement method includes one or more of loading noise, increasing reverberation, and increasing or decreasing speech rate.
Preferably, the preset scenario includes: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
Preferably, the training data comprises close-talking quiet speech without background noise.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
Fig. 1 is a schematic diagram of a command word customization identification method according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
The embodiment of the invention provides a command word customization identification method, as shown in fig. 1, comprising the following steps:
step 1: receiving an input project requirement, analyzing a project command word list based on the project requirement, and generating a project data acquisition task; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
step 2: issuing a training data acquisition task through an online task platform, and receiving training data uploaded based on the training data acquisition task through the online task platform;
step 3: acquiring test data in a preset scene through recording equipment based on the test data acquisition task;
step 4: generating a command word voice recognition model according to the project command word list, the training data and the test data based on an automatic model training platform;
step 5: adding the command word voice recognition model into a version management tool, and constructing an engine through Jenkins.
The working principle and the beneficial effects of the technical scheme are as follows:
An input project requirement is received, and a project command word list is parsed from it. The project requirement may be a set of command words to be customized, and the project command word list stores the command words that can be customized. The project command word list is then analyzed to generate a project data acquisition task, that is, to determine how much training data and how much test data must be collected for the command words in the project requirement. The training data acquisition task is issued through an online task platform, and users who accept the task record the speech required as training data; the test data is collected by a dedicated project team using a recording device in a preset scene. After data acquisition is finished, a model is trained with the training data and tested to convergence with the test data, yielding the command word voice recognition model. To make the model usable, the command word voice recognition model is added into a version management tool and an engine is built through Jenkins, which completes the command word customized recognition. The command word voice recognition model is a deep learning convolutional neural network model that recognizes speech in order to determine whether a command word is present. The version management tool is used for managing a plurality of voice recognition models. The engine is an application program for recognizing speech that is built around the deep learning convolutional neural network model and also includes processing programs such as voice recording and voice noise reduction.
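As an illustration of step 1, the following minimal Python sketch turns a project command word list into training and test data acquisition tasks. The class name `AcquisitionTask`, the function `generate_project_tasks`, the example command words and the per-word utterance counts are all hypothetical; the patent does not specify how task sizes are determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionTask:
    kind: str          # "training" or "test"
    command_word: str  # the command word to be recorded
    utterances: int    # number of recordings requested (assumed value)
    channel: str       # where the recordings are expected to come from


def generate_project_tasks(command_words, train_per_word=200, test_per_word=50):
    """Build one training task and one test task per command word.

    The default counts are illustrative assumptions, not values from the patent.
    """
    tasks = []
    for word in command_words:
        tasks.append(AcquisitionTask("training", word, train_per_word,
                                     "online task platform"))
        tasks.append(AcquisitionTask("test", word, test_per_word,
                                     "recording device in preset scene"))
    return tasks


tasks = generate_project_tasks(["turn on the light", "turn off the light"])
training_tasks = [t for t in tasks if t.kind == "training"]
test_tasks = [t for t in tasks if t.kind == "test"]
print(len(training_tasks), len(test_tasks))
```

Keeping the two task kinds in one list mirrors the text, where a single project data acquisition task comprises both a training part and a test part that are later routed to different collection channels.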
The command word customization recognition method of the invention collects training data (voice) through the on-line task platform and carries out data simulation, thereby greatly reducing the cost and the period of data collection and ensuring the performance of command word recognition.
In addition, in the model training and publishing process, a standardized, process-based and tool-based training workflow (the automatic model training platform) replaces manual participation, which can greatly improve project efficiency.
In one embodiment, generating the command word voice recognition model according to the project command word list, the training data and the test data specifically comprises:
configuring training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of a plurality of training groups after data enhancement, training the deep neural network model by adjusting parameter configuration, and obtaining a plurality of initial models; the training groups correspond to the initial models one by one;
performing model evaluation on each initial model by adopting a test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from the plurality of initial models as a command word voice recognition model;
outputting a command word voice recognition model and issuing an evaluation report;
the data enhancement method comprises one or more of loading noise, increasing reverberation, and increasing or decreasing speech speed.
The working principle and the beneficial effects of the technical scheme are as follows:
The preset first rule does not simply distribute the training data evenly into a plurality of groups. The training groups used for model training are data-enhanced: the underlying data is clean speech, to which different data enhancement methods are applied to generate different enhanced training data, and the different enhanced training data are then combined in different ways to form the training groups.
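One possible reading of this rule can be sketched in Python as follows. The three toy enhancement functions (white noise at a target signal-to-noise ratio, a synthetic decaying-echo impulse response, and linear-interpolation resampling) and the grouping by every non-empty combination of methods are assumptions made for illustration; the patent does not disclose the actual signal processing or the exact form of the first rule.

```python
import itertools
import math
import random


def add_noise(samples, snr_db=10.0, seed=0):
    """Mix in white Gaussian noise at roughly the requested SNR (in dB)."""
    rng = random.Random(seed)
    power = sum(s * s for s in samples) / len(samples)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def add_reverb(samples, decay=0.5, delay=3, taps=4):
    """Convolve with a toy impulse response: direct path plus decaying echoes."""
    out = list(samples) + [0.0] * (delay * taps)
    for k in range(1, taps + 1):
        gain = decay ** k
        for i, s in enumerate(samples):
            out[i + k * delay] += gain * s
    return out


def change_speed(samples, factor=1.1):
    """Resample by linear interpolation; factor > 1 shortens the utterance.

    Note: plain resampling also shifts pitch, unlike a true tempo change.
    """
    last = len(samples) - 1
    n_out = max(1, int(len(samples) / factor))
    out = []
    for j in range(n_out):
        pos = j * factor
        i = min(int(pos), last)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[min(i + 1, last)] * frac)
    return out


AUGMENTATIONS = {"noise": add_noise, "reverb": add_reverb, "speed": change_speed}


def build_training_groups(clean_utterances):
    """Form one group per non-empty combination of enhancement methods.

    Each group keeps the clean utterances and adds one enhanced copy of
    every utterance for each method in the combination.
    """
    groups = {}
    names = sorted(AUGMENTATIONS)
    for r in range(1, len(names) + 1):
        for combo in itertools.combinations(names, r):
            group = list(clean_utterances)
            for name in combo:
                group.extend(AUGMENTATIONS[name](u) for u in clean_utterances)
            groups["+".join(combo)] = group
    return groups


# Two synthetic "clean" utterances stand in for close-talking quiet speech.
clean = [[math.sin(0.1 * n) for n in range(100)] for _ in range(2)]
groups = build_training_groups(clean)
print(sorted(groups))
```

With three enhancement methods this produces seven training groups of different sizes and compositions, which is consistent with the statement that the data is not simply split evenly.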
Model training is thus automated, replacing manual participation, which can greatly improve project efficiency. A plurality of initial models are generated from the plurality of training groups, and the model with the highest recognition rate is selected from them, ensuring that the final engine has a higher recognition rate. An initial model, which is essentially a deep learning convolutional neural network model, is a model in its initial state as produced by training on the training data, before it has been tested and verified with the test data.
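The train, evaluate and select loop can be sketched as follows. A trivial nearest-neighbour classifier on a single number stands in for the deep neural network, and the names `EvaluationReport`, `recognition_rate` and `select_best_model` are hypothetical; only the selection logic (one initial model per training group, pick the highest recognition rate on the test group) comes from the text.

```python
from dataclasses import dataclass


@dataclass
class EvaluationReport:
    group_name: str
    recognition_rate: float


def recognition_rate(predict, test_group):
    """Fraction of test items whose predicted command word matches the label."""
    correct = sum(1 for features, label in test_group if predict(features) == label)
    return correct / len(test_group)


def select_best_model(training_groups, test_group, train_fn):
    """Train one initial model per group, evaluate each, keep the best one."""
    models, reports = {}, []
    for name, group in training_groups.items():
        model = train_fn(group)
        models[name] = model
        reports.append(EvaluationReport(name, recognition_rate(model, test_group)))
    best = max(reports, key=lambda r: r.recognition_rate)
    return models[best.group_name], best, reports


def train_nearest_neighbour(group):
    """Stand-in trainer (an assumption): memorise the examples, predict the nearest label."""
    def predict(x):
        return min(group, key=lambda example: abs(example[0] - x))[1]
    return predict


# Toy data: each example is (one-dimensional feature, command word label).
training_groups = {
    "A": [(1.0, "on"), (5.0, "off")],
    "B": [(1.2, "on"), (4.8, "off"), (8.5, "stop")],
}
test_group = [(1.0, "on"), (5.0, "off"), (9.0, "stop")]

model, best, reports = select_best_model(training_groups, test_group,
                                         train_nearest_neighbour)
print(best.group_name, best.recognition_rate)
```

Returning the full list of reports alongside the winner matches the requirement that an evaluation report be issued together with the selected model.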
In one embodiment, the preset scenario includes: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
The preset scenes are the scenes in which the engine is actually applied. Malls, movie theaters, parking lots, schools, vegetable farms and the like each have their own specific interference, so testing with data acquired in these scenes remarkably improves the recognition rate of the engine.
In one embodiment, the training data includes close-talking quiet speech without background noise.
Close-talking quiet speech is speech recorded in a quiet environment within a preset distance. The training data must be clean speech, i.e. speech free of interference, so close-talking quiet speech is comparatively the best choice of training data.
The invention also provides a command word customization recognition system, which comprises:
the task generation module is used for receiving the input project requirements, analyzing a project command word list based on the project requirements and generating project data acquisition tasks; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
the training data acquisition module is used for issuing a training data acquisition task through the online task platform and receiving training data uploaded based on the training data acquisition task through the online task platform;
the test data acquisition module is used for acquiring test data in a preset scene through the recording equipment based on the test data acquisition task;
the model generation module is used for generating a command word voice recognition model according to the project command word list, the training data and the test data based on the automatic model training platform;
and the engine generation module is used for adding the command word voice recognition model into the version management tool and constructing an engine through Jenkins.
The working principle and the beneficial effects of the technical scheme are as follows:
The task generation module receives an input project requirement and parses a project command word list from it. The project requirement may be a set of command words to be customized, and the project command word list stores the command words that can be customized. The module analyzes the project command word list to generate a project data acquisition task, that is, it determines how much training data and how much test data must be collected for the command words in the project requirement. The training data acquisition module issues the training data acquisition task through the online task platform, and users who accept the task record the speech required as training data; the test data acquisition module obtains test data collected by a dedicated project team using a recording device in a preset scene. After data acquisition is completed, the model generation module trains a model with the training data and tests it to convergence with the test data, yielding the command word voice recognition model. To make the model usable, the engine generation module adds the command word voice recognition model into a version management tool and builds an engine through Jenkins, which completes the command word customized recognition. Jenkins is an open-source continuous integration tool developed in Java; it is used to monitor continuous, repetitive work and aims to provide an open, easy-to-use platform that makes continuous integration of software possible.
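As a rough illustration of how an engine build might be triggered once a model version has been committed, the sketch below prepares (but does not send) a request to the Jenkins remote build endpoint `buildWithParameters`. The server URL, job name, `MODEL_VERSION` parameter and token are hypothetical, and it is assumed that the job is parameterized and configured for remote triggering; the patent does not describe the Jenkins job itself, and a real server would normally also require authentication.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_engine_request(jenkins_url, job_name, model_version, token):
    """Prepare a POST that would trigger a parameterized Jenkins build.

    MODEL_VERSION is an assumed job parameter naming the model revision
    stored in the version management tool.
    """
    query = urlencode({"token": token, "MODEL_VERSION": model_version})
    url = f"{jenkins_url.rstrip('/')}/job/{quote(job_name)}/buildWithParameters?{query}"
    return Request(url, method="POST")


req = build_engine_request("https://jenkins.example.com/",
                           "command-word-engine", "v1.0.3", "TOKEN")
print(req.get_method(), req.full_url)
# To actually start the build, the request would be passed to
# urllib.request.urlopen(req) against a reachable Jenkins server.
```

Separating request construction from sending keeps the example runnable without a Jenkins server and makes the assumed job parameters easy to inspect.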
The command word customization recognition system of the invention collects training data (voice) through the on-line task platform and carries out data simulation, thereby greatly reducing the cost and the period of data collection and ensuring the performance of command word recognition.
In addition, in the model training and publishing process, a standardized, process-based and tool-based training workflow (the automatic model training platform) replaces manual participation, which can greatly improve project efficiency.
In one embodiment, the model generation module is specifically operable to:
configuring training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of a plurality of training groups after data enhancement, and training the deep neural network model by adjusting parameter configuration to obtain a plurality of initial models; the training groups correspond to the initial models one by one;
performing model evaluation on each initial model by adopting a test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from the plurality of initial models as a command word voice recognition model;
outputting a command word voice recognition model and issuing an evaluation report;
the data enhancement method comprises one or more of loading noise, increasing reverberation, and increasing or decreasing speech speed.
The working principle and the beneficial effects of the technical scheme are as follows:
The preset first rule does not simply distribute the training data evenly into a plurality of groups. The training groups used for model training are data-enhanced: the underlying data is clean speech, to which different data enhancement methods are applied to generate different enhanced training data, and the different enhanced training data are then combined in different ways to form the training groups.
Model training is thus automated, replacing manual participation, which can greatly improve project efficiency. A plurality of initial models are generated from the plurality of training groups, and the model with the highest recognition rate is selected from them, ensuring that the final engine has a higher recognition rate.
In one embodiment, the preset scenario includes: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
The preset scenes are the scenes in which the engine is actually applied. Malls, movie theaters, parking lots, schools, vegetable farms and the like each have their own specific interference, so testing with data acquired in these scenes remarkably improves the recognition rate of the engine.
In one embodiment, the training data includes close-talking quiet speech without background noise.
The training data must be clean speech, i.e. speech free of interference, so close-talking quiet speech is comparatively the best choice of training data.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A command word customized recognition method is characterized by comprising the following steps:
step 1: receiving an input project requirement, analyzing a project command word list based on the project requirement, and generating a project data acquisition task; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
step 2: issuing the training data acquisition task through an online task platform, and receiving the training data uploaded based on the training data acquisition task through the online task platform;
step 3: collecting test data in a preset scene through recording equipment based on the test data acquisition task;
step 4: generating a command word voice recognition model according to the project command word list, the training data and the test data based on an automatic model training platform;
step 5: adding the command word voice recognition model into a version management tool, and constructing an engine through Jenkins;
wherein generating the command word voice recognition model according to the project command word list, the training data and the test data based on the automatic model training platform specifically comprises:
configuring the training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of the plurality of training groups after data enhancement, and training the deep neural network model by adjusting parameter configuration to obtain a plurality of initial models; the training groups correspond to the initial models one by one;
sequentially carrying out model evaluation on each initial model by adopting the test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from a plurality of initial models as the command word voice recognition model;
outputting the command word voice recognition model and issuing the evaluation report.
2. The method of claim 1, wherein the data enhancement method comprises one or more of loading noise, adding reverberation, increasing or decreasing speech rate.
3. The method for customized recognition of command words according to claim 1, wherein the preset scenario comprises: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
4. The method of claim 1, wherein the training data comprises quiet speech free of background noise.
5. A command word custom recognition system, comprising:
the task generation module is used for receiving an input project requirement, analyzing a project command word list based on the project requirement and generating a project data acquisition task; the project data acquisition task comprises a training data acquisition task and a test data acquisition task;
the training data acquisition module is used for releasing the training data acquisition task through an online task platform and receiving training data uploaded based on the training data acquisition task through the online task platform;
the test data acquisition module is used for acquiring test data in a preset scene through the recording equipment based on the test data acquisition task;
the model generation module is used for generating a command word voice recognition model according to the project command word list, the training data and the test data based on an automatic model training platform;
the engine generation module is used for adding the command word voice recognition model into a version management tool and constructing an engine through Jenkins;
the model generation module specifically operates as follows:
configuring the training data into a plurality of training groups according to a preset first rule;
configuring the test data into test groups;
performing data simulation and expansion on the training groups by adopting a data enhancement method;
sequentially adopting one of the plurality of training groups after data enhancement, and training the deep neural network model by adjusting parameter configuration to obtain a plurality of initial models; the training groups correspond to the initial models one to one;
sequentially carrying out model evaluation on each initial model by adopting the test group and generating an evaluation report, wherein the evaluation report comprises the reference recognition rate of the initial model;
selecting a model with the highest reference recognition rate from a plurality of initial models as the command word voice recognition model;
outputting the command word voice recognition model and issuing the evaluation report.
6. The system of claim 5, wherein the data enhancement method comprises one or more of loading noise, increasing reverberation, increasing speech rate, or decreasing speech rate.
7. The command word custom recognition system of claim 5, wherein the preset scenario comprises: one or more of a mall, a movie theater, a parking lot, a school and a vegetable farm.
8. The command word custom recognition system of claim 5, wherein the training data comprises close-talking quiet speech free of background noise.
CN202010266075.XA 2020-04-07 2020-04-07 Command word customization identification method and system Active CN111599350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010266075.XA CN111599350B (en) 2020-04-07 2020-04-07 Command word customization identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010266075.XA CN111599350B (en) 2020-04-07 2020-04-07 Command word customization identification method and system

Publications (2)

Publication Number Publication Date
CN111599350A (en) 2020-08-28
CN111599350B (en) 2023-02-28

Family

ID=72187411

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010266075.XA Active CN111599350B (en) 2020-04-07 2020-04-07 Command word customization identification method and system

Country Status (1)

Country Link
CN (1) CN111599350B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104064184A (en) * 2014-06-24 2014-09-24 科大讯飞股份有限公司 Construction method of heterogeneous decoding network, system thereof, voice recognition method and system thereof
CN106098059A (en) * 2016-06-23 2016-11-09 上海交通大学 customizable voice awakening method and system
CN106328124A (en) * 2016-08-24 2017-01-11 安徽咪鼠科技有限公司 Voice recognition method based on user behavior characteristics
CN106611599A (en) * 2015-10-21 2017-05-03 展讯通信(上海)有限公司 Voice recognition method and device based on artificial neural network and electronic equipment
CN107123417A (en) * 2017-05-16 2017-09-01 上海交通大学 Optimization method and system are waken up based on the customized voice that distinctive is trained
CN107221326A (en) * 2017-05-16 2017-09-29 百度在线网络技术(北京)有限公司 Voice awakening method, device and computer equipment based on artificial intelligence
CN107369439A (en) * 2017-07-31 2017-11-21 北京捷通华声科技股份有限公司 A kind of voice awakening method and device
CN107871506A (en) * 2017-11-15 2018-04-03 北京云知声信息技术有限公司 The awakening method and device of speech identifying function
CN108281137A (en) * 2017-01-03 2018-07-13 中国科学院声学研究所 A kind of universal phonetic under whole tone element frame wakes up recognition methods and system
CN108932943A (en) * 2018-07-12 2018-12-04 广州视源电子科技股份有限公司 Order word sound detection method, device, equipment and storage medium
CN109144518A (en) * 2018-08-21 2019-01-04 郑州云海信息技术有限公司 A kind of image file construction method and device based on jenkins
CN109408033A (en) * 2017-09-04 2019-03-01 郑州云海信息技术有限公司 A kind of image file construction method and device based on jenkins
CN109814879A (en) * 2019-01-16 2019-05-28 福建省天奕网络科技有限公司 Automate CI/CD project dispositions method, storage medium
CN110797019A (en) * 2014-05-30 2020-02-14 苹果公司 Multi-command single-speech input method
CN110808036A (en) * 2019-11-07 2020-02-18 南京大学 Incremental voice command word recognition method
CN110832578A (en) * 2017-07-24 2020-02-21 美的集团股份有限公司 Customizable wake-up voice commands

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9152211B2 (en) * 2012-10-30 2015-10-06 Google Technology Holdings LLC Electronic device with enhanced notifications
US20190362709A1 (en) * 2018-05-25 2019-11-28 Motorola Mobility Llc Offline Voice Enrollment
CA3067776A1 (en) * 2018-09-28 2020-03-28 Sonos, Inc. Systems and methods for selective wake word detection using neural network models
CN110265040B (en) * 2019-06-20 2022-05-17 Oppo广东移动通信有限公司 Voiceprint model training method and device, storage medium and electronic equipment
CN110310628B (en) * 2019-06-27 2022-05-20 百度在线网络技术(北京)有限公司 Method, device and equipment for optimizing wake-up model and storage medium

Also Published As

Publication number Publication date
CN111599350A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
US10861480B2 (en) Method and device for generating far-field speech data, computer device and computer readable storage medium
CN108511000B (en) Method and system for testing identification rate of awakening words of intelligent sound box
US8538755B2 (en) Customizable method and system for emotional recognition
CN103745731A (en) Automatic voice recognition effect testing system and automatic voice recognition effect testing method
CN110277089B (en) Updating method of offline voice recognition model, household appliance and server
CN112383451B (en) Intelligent household appliance intelligent level testing system and method based on voice interaction
CN113129927B (en) Voice emotion recognition method, device, equipment and storage medium
CN109671430B (en) Voice processing method and device
CN111179908A (en) Testing method and system of intelligent voice equipment
CN111724769A (en) Production method of intelligent household voice recognition model
CN112185342A (en) Voice conversion and model training method, device and system and storage medium
CN111599350B (en) Command word customization identification method and system
CN113409798A (en) Method, device and equipment for generating noise-containing voice data in vehicle
CN101819797B (en) Electronic device with interactive audio recording function and recording method thereof
Engelbrecht et al. Analysis of a new simulation approach to dialog system evaluation
CN109101414B (en) Massive UI test generation method and device based on buried point data
CN116261091A (en) Bluetooth testing system and method capable of customizing testing flow
CN111341343A (en) Online updating system and method for abnormal sound detection
CN116244202A (en) Automatic performance test method and device
CN113595811B (en) Equipment performance testing method and device, storage medium and electronic device
CN112634879B (en) Voice conference management method, device, equipment and medium
CN110600006B (en) Speech recognition evaluation method and system
CN110727883A (en) Method and system for analyzing personalized growth map of child
CN117292711A (en) Voice equipment testing method, device and equipment
López et al. Voice control in smart homes using distant microphones: a voiceXML-based approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant