CN110610710A - Construction device and construction method of self-learning voice recognition system - Google Patents


Info

Publication number
CN110610710A
CN110610710A
Authority
CN
China
Prior art keywords
wave
output
speech recognition
recognition system
adjacent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910838612.0A
Other languages
Chinese (zh)
Other versions
CN110610710B (en)
Inventor
樊茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Amlogic Shanghai Co Ltd
Amlogic Inc
Original Assignee
Amlogic Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Amlogic Shanghai Co Ltd filed Critical Amlogic Shanghai Co Ltd
Priority to CN201910838612.0A priority Critical patent/CN110610710B/en
Publication of CN110610710A publication Critical patent/CN110610710A/en
Priority to PCT/CN2020/109393 priority patent/WO2021042969A1/en
Application granted granted Critical
Publication of CN110610710B publication Critical patent/CN110610710B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/04: Training, enrolment or model building
    • G10L17/18: Artificial neural networks; Connectionist approaches
    • G10L17/22: Interactive procedures; Man-machine interfaces
    • G10L17/24: Interactive procedures; Man-machine interfaces, the user being prompted to utter a password or a predefined phrase


Abstract

The invention provides a construction device and a construction method for a self-learning voice recognition system. The construction device is applied to a voice recognition system comprising a microphone and a voice recognition module in which the construction device is applied, the microphone being connected with the voice recognition module. The construction device comprises an analysis unit for analyzing the output signal of the microphone to obtain a plurality of signal parameters, and a recognition unit, connected with the analysis unit, that judges according to the signal parameters whether the output signal is a preset activated voice. Waking up with an activated voice allows the power module, the ADC, and the CPU to sleep during standby, reducing standby energy consumption.

Description

Construction device and construction method of self-learning voice recognition system
Technical Field
The invention relates to the technical field of voice recognition, in particular to a device and a method for constructing a self-learning voice recognition system.
Background
With the rapid development of computer application technology, voice recognition technology is applied more and more widely, and the demand for voice recognition keeps growing. In current ultra-high-definition smart televisions and smart speakers, the voice wake-up function must remain available during standby, so the voice recognition system must keep working; that is, the power module, the ADC (Analog-to-Digital Converter), and the CPU (Central Processing Unit) all remain in working mode, consuming a large amount of energy during standby.
Disclosure of Invention
In view of the above problems in the prior art, a device for constructing a self-learning speech recognition system is provided to reduce energy consumption during standby.
The specific technical scheme is as follows:
A construction device for a self-learning speech recognition system is applied to a speech recognition system; the speech recognition system comprises a microphone and a speech recognition module in which the construction device is applied, the microphone being connected with the speech recognition module, wherein the construction device comprises:
the analysis unit is used for analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and the recognition unit is connected with the analysis unit and judges whether the output signal is a preset activated voice or not according to the signal parameter.
Preferably, in the construction device of the self-learning speech recognition system, the output signal is a waveform signal.
Preferably, in the construction device of the self-learning speech recognition system, the analysis unit sequentially stores each type of signal parameter into a corresponding sequence and outputs the signal parameters of each sequence to the recognition unit.
Preferably, in the construction device of the self-learning speech recognition system,
the identification unit is a neural network, and the neural network comprises:
a first calculation unit for outputting a first output parameter according to the signal parameters of the plurality of sequences;
a second calculation unit for outputting a second output parameter according to the signal parameters of the plurality of sequences;
a third calculation unit for outputting a third output parameter according to the signal parameter of the corresponding sequence;
a fourth calculating unit, configured to output a fourth output parameter according to the signal parameter of the corresponding sequence;
the hidden layer comprises a plurality of first nodes, each first node is connected with the first computing unit, the second computing unit, the third computing unit and the fourth computing unit, each first node is provided with one piece of feature information for activating voice, and the first nodes receive and judge whether the first output parameters, the second output parameters, the third output parameters and the fourth output parameters accord with the corresponding feature information or not and output the judgment result;
and the output layer comprises a plurality of second nodes, each second node is connected with each first node, each second node is provided with a corresponding activated voice, and whether the output signal conforms to the activated voice is judged according to the judgment result.
Preferably, in the construction device of the self-learning speech recognition system, the types of the signal parameters include a trough, a peak, and the interval time between adjacent troughs and peaks.
Preferably, the construction device of the self-learning speech recognition system, wherein the first output parameter is an envelope value; and/or
The second output parameter is the number of wave edges formed by adjacent wave troughs and wave crests; and/or
The third output parameter is the difference between two adjacent wave troughs; and/or
The fourth output parameter is the difference between two adjacent peaks.
Preferably, in the construction device of the self-learning speech recognition system,
the first calculating unit calculates to obtain an envelope value through a trough, a crest and interval time; and/or
The second calculating unit calculates the number of wave edges formed by adjacent wave troughs and wave peaks through the wave troughs and the wave peaks; and/or
The third calculating unit calculates through the wave trough to obtain the difference between two adjacent wave troughs; and/or
And the fourth calculating unit calculates the difference between two adjacent wave crests through the wave crests.
The invention also provides a construction method of a self-learning voice recognition system, applied to the voice recognition system; the voice recognition system comprises a microphone and a voice recognition module with the construction device, the microphone being connected with the voice recognition module, wherein the construction method comprises the following steps:
step S1, analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and step S2, judging whether the output signal is a preset activated voice according to the signal parameter.
Preferably, in the construction method of the self-learning speech recognition system,
in step S2, a neural network is provided, and whether the output signal is a preset active voice is determined by the neural network.
Preferably, in the construction method of the self-learning speech recognition system, the neural network comprises:
the first calculating unit is used for outputting an envelope value according to the wave trough, the wave crest and the interval time;
the second calculating unit is used for outputting the number of wave edges formed by adjacent wave troughs and wave peaks according to the wave troughs and the wave peaks;
the third calculating unit is used for outputting the difference between two adjacent wave troughs according to the wave troughs;
the fourth calculating unit is used for outputting the difference between two adjacent wave crests according to the wave crests;
the hidden layer comprises a plurality of first nodes, each first node is connected with the first computing unit, the second computing unit, the third computing unit and the fourth computing unit, each first node is provided with one piece of feature information for activating voice, the first nodes receive and judge whether the envelope value, the number of wave edges, the difference between two adjacent wave troughs and the difference between two adjacent wave crests conform to the corresponding feature information or not, and output the judgment result;
the output layer comprises a plurality of second nodes, each second node is connected with each first node, each second node is provided with corresponding activated voice, and whether the output signal accords with the activated voice is judged according to the judgment result;
step S2 includes the following steps:
step S21, calculating through a trough, a crest and interval time to obtain an envelope value; and
calculating through the wave troughs and the wave peaks to obtain the number of wave edges formed by the adjacent wave troughs and the adjacent wave peaks; and
calculating through the wave troughs to obtain the difference between two adjacent wave troughs; and
calculating through the wave crests to obtain the difference between two adjacent wave crests;
step S22, each first node receives and judges whether the envelope value, the number of wave edges, the difference between two adjacent wave troughs and the difference between two adjacent wave crests accord with the characteristic information or not, and outputs the judgment result;
in step S23, each second node determines whether the output signal corresponds to the activated voice according to the determination result, and outputs the determination result.
The technical scheme has the following advantages or beneficial effects: waking up with an activated voice allows the power module, the ADC, and the CPU to sleep during standby, thereby reducing standby energy consumption.
Drawings
Embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings. The drawings are, however, to be regarded as illustrative and explanatory only and are not restrictive of the scope of the invention.
FIG. 1 is a schematic structural diagram of an embodiment of a device for constructing a self-learning speech recognition system according to the present invention;
FIG. 2 is a schematic structural diagram of a neural network according to an embodiment of the apparatus for constructing the self-learning speech recognition system of the present invention;
FIG. 3 is a flow chart of an embodiment of a method of constructing a self-learning speech recognition system of the present invention;
FIG. 4 is a flowchart illustrating step S2 of an embodiment of a method for constructing a self-learning speech recognition system according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
The invention includes a construction device of a self-learning speech recognition system, which is applied to the speech recognition system, the speech recognition system comprises a microphone and a speech recognition module with the construction device, the microphone is connected with the speech recognition module, as shown in figure 1, the construction device comprises:
the analysis unit is used for analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and the recognition unit is connected with the analysis unit and judges whether the output signal is a preset activated voice or not according to the signal parameter.
In the above embodiment, the recognition unit recognizes whether the signal parameters from the analysis unit correspond to a preset activated voice, and the wake-up operation is performed by the activated voice, so that the power module, the ADC, and the CPU sleep during standby and standby energy consumption is reduced.
A preset number of activated voices can be set, wherein the preset number can be 2, 3, or 4; the preset activated voices are obtained according to the first nodes in the hidden layer, so that the number of preset activated voices is kept small and energy consumption is reduced.
Further, in the above-described embodiment, the output signal is a waveform signal. Thus, a plurality of signal parameters can be obtained in the waveform signal, and for example, the types of the signal parameters may include a trough, a peak, and a time interval between adjacent troughs and peaks.
Further, in the above-described embodiment, the analysis unit may sequentially save each type of signal parameter into a corresponding sequence, and output the signal parameter of each sequence to the recognition unit.
For example, the sequence of troughs may be {drop1, drop2, ..., dropn}, where drop denotes a trough;
the sequence of peaks may be {rise1, rise2, ..., risen}, where rise denotes a peak;
the sequence of interval times may be {T1, T2, ..., Tn}, where T denotes the interval time.
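As an illustrative sketch (not part of the patent text), the analysis unit's extraction of these three sequences from a sampled waveform could look like the following, assuming simple local-extremum detection and interval times measured in samples:

```python
from typing import List, Tuple

def extract_signal_parameters(samples: List[float]) -> Tuple[List[float], List[float], List[int]]:
    """Scan a sampled waveform and collect troughs (local minima), peaks
    (local maxima), and the interval times between adjacent extrema,
    each stored in its own sequence as the analysis unit does."""
    extrema = []  # (index, value, kind) of each local extremum, in order
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        if cur < prev and cur < nxt:
            extrema.append((i, cur, "drop"))   # trough
        elif cur > prev and cur > nxt:
            extrema.append((i, cur, "rise"))   # peak

    drops = [v for _, v, kind in extrema if kind == "drop"]
    rises = [v for _, v, kind in extrema if kind == "rise"]
    # Interval time (in samples) between each pair of adjacent extrema.
    intervals = [extrema[k + 1][0] - extrema[k][0] for k in range(len(extrema) - 1)]
    return drops, rises, intervals
```

The three returned lists correspond to the {drop}, {rise}, and {T} sequences above.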
Further, in the above embodiment, the identification unit may be a neural network, as shown in fig. 2, the neural network includes:
a first calculation unit 10 for outputting a first output parameter based on the signal parameters of the plurality of sequences;
a second calculation unit 20 for outputting a second output parameter based on the signal parameters of the plurality of sequences;
a third calculating unit 30 for outputting a third output parameter according to the signal parameter of the corresponding sequence;
a fourth calculating unit 40 for outputting a fourth output parameter according to the signal parameter of the corresponding sequence;
the hidden layer comprises a plurality of first nodes 50, each first node 50 is connected with the first computing unit 10, the second computing unit 20, the third computing unit 30 and the fourth computing unit 40, each first node 50 is provided with one piece of feature information for activating voice, and the first nodes 50 receive and judge whether the first output parameters, the second output parameters, the third output parameters and the fourth output parameters accord with the corresponding feature information or not and output the judgment result;
and the output layer comprises a plurality of second nodes 60, each second node 60 is connected with each first node 50, each second node 60 is provided with a corresponding activated voice, and whether the output signal accords with the activated voice is judged according to the judgment result.
The number of the hidden layers can be set according to the requirements of users.
In the neural network described above, each node may be a filter.
Further, as a preferred embodiment, the first output parameter is an envelope value;
the second output parameter is the number of wave edges formed by adjacent wave troughs and wave crests;
the third output parameter is the difference between two adjacent wave troughs;
the fourth output parameter is the difference between two adjacent peaks. The first calculating unit 10 calculates the trough, the peak and the interval time to obtain an envelope value;
the second calculating unit 20 calculates the number of wave edges formed by adjacent wave troughs and wave peaks through the wave troughs and the wave peaks;
the third calculating unit 30 calculates the difference between two adjacent wave troughs through the wave trough;
the fourth calculating unit 40 calculates the difference between two adjacent peaks by using the peaks.
Wherein, it should be noted that, the difference between the two adjacent troughs is the difference obtained by subtracting the next trough from the previous trough in the trough sequence; the difference between the two adjacent peaks is the difference of the former peak minus the latter peak in the peak sequence.
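The four calculation units can be sketched as follows. The patent gives no concrete formulas for the envelope value or the wave-edge count, so the mean peak-to-trough swing per unit interval time and the alternating-extrema edge count used here are assumptions for illustration only; the trough and peak differences follow the previous-minus-next rule stated above.

```python
def compute_output_parameters(drops, rises, intervals):
    """Sketch of the four calculation units operating on the trough,
    peak, and interval-time sequences."""
    # First output parameter: envelope value. Hypothetical formula:
    # total peak-to-trough swing divided by the total interval time.
    swings = [r - d for r, d in zip(rises, drops)]
    envelope = sum(swings) / max(sum(intervals), 1)

    # Second: number of wave edges formed by adjacent troughs and
    # peaks, assuming the extrema strictly alternate.
    edge_count = max(len(drops) + len(rises) - 1, 0)

    # Third: difference between adjacent troughs, previous minus next.
    drop_diffs = [drops[i] - drops[i + 1] for i in range(len(drops) - 1)]
    # Fourth: difference between adjacent peaks, previous minus next.
    rise_diffs = [rises[i] - rises[i + 1] for i in range(len(rises) - 1)]
    return envelope, edge_count, drop_diffs, rise_diffs
```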
Further, the neural network may be trained with a plurality of preset activated voices. The signal parameters of the output signal corresponding to a preset activated voice are input into the neural network: the first calculating unit 10 calculates the envelope value from the troughs, peaks, and interval times; the second calculating unit 20 calculates the number of wave edges formed by adjacent troughs and peaks; the third calculating unit 30 calculates the difference between two adjacent troughs; and the fourth calculating unit 40 calculates the difference between two adjacent peaks. Each first node 50 in the hidden layer receives these values, judges whether the envelope value, the number of wave edges, the difference between two adjacent troughs, and the difference between two adjacent peaks conform to its feature information, and outputs the judgment result; each second node 60 in the output layer then judges from these results whether the output signal conforms to its activated voice. When the output signal is judged to be the corresponding activated voice, the judgment result is output, and the signal parameters of the output signal corresponding to the preset activated voice are input repeatedly for training. When the output signal is not judged to be the corresponding activated voice, the weights of the first nodes 50 corresponding to the judgment result are adjusted and the signal parameters of the output signal are input again for training; once the output layer judges the output signal to be the corresponding activated voice, the signal parameters of the output signals corresponding to the other preset activated voices are input for training, so that the activated voice corresponding to an output signal can be predicted.
The judgment result may be represented by a logical value. For example, suppose the logical value of a preset activated voice in the corresponding second node 60 is 1010101010. The signal parameters of the output signal corresponding to the preset activated voice are input into the neural network, and each first node 50 in the hidden layer receives the output parameters and judges whether they conform to its feature information: when they conform, the node outputs a logical value of 1; when they do not, it outputs 0. The second node 60 of the output layer then judges from the received judgment results whether the output signal conforms to the preset activated voice: when the combined result equals the logical value 1010101010, the second node 60 outputs 1, indicating that the output signal conforms to the preset activated voice; when it does not, the second node 60 outputs 0, indicating that the output signal does not conform to the preset activated voice.
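The logical-value matching described above can be sketched as follows. This is illustrative only; the predicates standing in for the first nodes' feature information are hypothetical placeholders:

```python
def hidden_layer_judgements(params, feature_checks):
    """Each first node holds one piece of feature information, modeled
    here as a predicate; it outputs 1 when the received output
    parameters conform to it and 0 otherwise."""
    return "".join("1" if check(params) else "0" for check in feature_checks)

def second_node_output(judgements, expected="1010101010"):
    """A second node compares the concatenated judgment results with
    the logical value stored for its activated voice and outputs 1 on
    an exact match, 0 otherwise."""
    return 1 if judgements == expected else 0
```

With ten nodes whose judgments spell 1010101010, the second node outputs 1; any other pattern yields 0.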
The method for constructing the self-learning voice recognition system is applied to the voice recognition system; the voice recognition system comprises a microphone and a voice recognition module with the construction device, the microphone being connected with the voice recognition module. As shown in fig. 3, the construction method comprises the following steps:
step S1, analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and step S2, judging whether the output signal is a preset activated voice according to the signal parameter.
In the above embodiment, whether the signal parameter is the preset activated voice or not is analyzed, and the wake-up operation is performed by the activated voice, so that the power module, the ADC and the CPU are put to sleep in the standby process, and the energy consumption in the standby process is reduced.
Further, in the above embodiment, in step S2, a neural network is provided, and whether the output signal is a preset active voice is determined by the neural network.
The neural network includes:
a first calculating unit 10 for outputting an envelope value according to a valley, a peak and an interval time;
the second calculating unit 20 is configured to output the number of wave edges formed by adjacent wave troughs and wave peaks according to the wave troughs and wave peaks;
a third calculating unit 30 for outputting the difference between two adjacent troughs according to the trough;
a fourth calculating unit 40, configured to output a difference between two adjacent peaks according to the peak;
the hidden layer comprises a plurality of first nodes 50, each first node 50 is connected with the first calculating unit 10, the second calculating unit 20, the third calculating unit 30 and the fourth calculating unit 40, each first node 50 is provided with one piece of feature information for activating voice, and the first nodes 50 receive and judge whether the envelope value, the number of wave edges, the difference between two adjacent wave troughs and the difference between two adjacent wave crests accord with the corresponding feature information or not and output the judgment result;
an output layer including a plurality of second nodes 60, each second node 60 being connected to each first node 50, each second node 60 setting a corresponding activated voice, and determining whether an output signal corresponds to the activated voice according to a determination result;
step S2 includes the following steps:
step S21, calculating through a trough, a crest and interval time to obtain an envelope value; and
calculating through the wave troughs and the wave peaks to obtain the number of wave edges formed by the adjacent wave troughs and the adjacent wave peaks; and
calculating through the wave troughs to obtain the difference between two adjacent wave troughs; and
calculating through the wave crests to obtain the difference between two adjacent wave crests;
step S22, each first node 50 receives and determines whether the envelope value, the number of wave edges, the difference between two adjacent wave troughs and the difference between two adjacent wave crests match the characteristic information, and outputs the determination result;
in step S23, each second node 60 determines whether the output signal corresponds to the activated voice according to the determination result, and outputs the determination result.
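The self-learning loop described in the embodiment, adjusting first-node weights until the output layer accepts the training signal, can be sketched as a perceptron-style update. The threshold units and the learning rule below are assumptions, since the patent does not fix either:

```python
def node_fires(weights, params, threshold=0.0):
    """A first node modeled as a threshold unit over the output parameters."""
    return 1 if sum(w * p for w, p in zip(weights, params)) > threshold else 0

def train_activated_voice(nodes, params, target_bits, lr=0.1, epochs=200):
    """nodes: one weight vector per first node; target_bits: the logical
    value the second node expects for this activated voice. On each pass,
    any node whose judgment disagrees with its target bit has its weights
    nudged (perceptron-style, an assumed rule); training stops once the
    output layer recognizes the signal."""
    for _ in range(epochs):
        outputs = [node_fires(w, params) for w in nodes]
        if "".join(map(str, outputs)) == target_bits:
            return True  # output layer judges: activated voice recognized
        for w, out, want in zip(nodes, outputs, target_bits):
            err = int(want) - out
            for i, p in enumerate(params):
                w[i] += lr * err * p
    return False
```

After training succeeds, feeding the same signal parameters through the nodes reproduces the target logical value, which is what lets the trained system later predict the activated voice for a new output signal.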
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (10)

1. A constructing apparatus of a self-learning speech recognition system, applied to the speech recognition system, the speech recognition system comprising a microphone and a speech recognition module applied with the constructing apparatus, the microphone and the speech recognition module being connected, wherein the constructing apparatus comprises:
the analysis unit is used for analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and the recognition unit is connected with the analysis unit and judges whether the output signal is a preset activated voice or not according to the signal parameter.
2. The apparatus for constructing a self-learning speech recognition system of claim 1, wherein the output signal is a waveform signal.
3. The apparatus for constructing a self-learning speech recognition system according to claim 1, wherein the analysis unit sequentially stores the signal parameters of each type into a corresponding sequence, and outputs the signal parameters of each of the sequences to the recognition unit.
4. The self-learning speech recognition system construction apparatus of claim 3,
the identification unit is a neural network, and the neural network comprises:
a first calculation unit for outputting a first output parameter according to the signal parameters of the plurality of sequences;
a second calculation unit for outputting a second output parameter according to the signal parameters of the plurality of sequences;
a third calculation unit for outputting a third output parameter according to the signal parameter of the corresponding sequence;
a fourth calculating unit, configured to output a fourth output parameter according to the signal parameter of the corresponding sequence;
a hidden layer, including a plurality of first nodes, where each first node is connected to the first computing unit, the second computing unit, the third computing unit, and the fourth computing unit, and each first node sets a piece of feature information of the activated voice, receives and determines whether the first output parameter, the second output parameter, the third output parameter, and the fourth output parameter conform to the corresponding feature information, and outputs a determination result;
and the output layer comprises a plurality of second nodes, each second node is connected with each first node, each second node is provided with a corresponding activated voice, and whether the output signal accords with the activated voice is judged according to the judgment result.
5. The apparatus for constructing a self-learning speech recognition system according to claim 3, wherein the types of the signal parameters include a trough, a peak, and a time interval between the adjacent trough and the peak.
6. The self-learning speech recognition system construction apparatus of claim 5,
the first output parameter is an envelope value; and/or
The second output parameter is the number of wave edges formed by the adjacent wave troughs and the adjacent wave crests; and/or
The third output parameter is the difference between two adjacent wave troughs; and/or
The fourth output parameter is the difference between two adjacent peaks.
7. The self-learning speech recognition system construction apparatus of claim 6,
the first calculating unit calculates the envelope value through the trough, the peak and the interval time; and/or
The second calculating unit calculates the number of the wave edges formed by the adjacent wave troughs and the adjacent wave peaks through the wave troughs and the wave peaks; and/or
The third calculating unit calculates the difference between two adjacent wave troughs through the wave troughs; and/or
The fourth calculating unit calculates the difference between two adjacent wave crests through the wave crests.
8. A construction method of a self-learning speech recognition system, which is applied to the speech recognition system, wherein the speech recognition system comprises a microphone and a speech recognition module applied with the construction device, and the microphone is connected with the speech recognition module, and the construction method comprises the following steps:
step S1, analyzing the output signal of the microphone to obtain a plurality of signal parameters;
and step S2, judging whether the output signal is a preset activated voice according to the signal parameter.
9. The method for constructing a self-learning speech recognition system of claim 8, wherein in step S2, a neural network is provided, and whether the output signal is a preset activated voice is determined by the neural network.
10. The method for constructing a self-learning speech recognition system of claim 9, wherein the neural network comprises:
a first calculating unit, configured to output an envelope value according to the troughs, the peaks and the interval time;
a second calculating unit, configured to output, according to the troughs and the peaks, the number of wave edges formed by adjacent troughs and peaks;
a third calculating unit, configured to output the difference between two adjacent troughs according to the troughs;
a fourth calculating unit, configured to output the difference between two adjacent peaks according to the peaks;
a hidden layer comprising a plurality of first nodes, wherein each first node is connected to the first, second, third and fourth calculating units and is provided with a piece of feature information of an activated voice; each first node receives the envelope value, the number of wave edges, the difference between two adjacent troughs and the difference between two adjacent peaks, determines whether they match the corresponding feature information, and outputs a determination result; and
an output layer comprising a plurality of second nodes, wherein each second node is connected to each first node, is provided with a corresponding activated voice, and determines, according to the determination results, whether the output signal matches that activated voice;
step S2 comprises the following steps:
step S21, calculating the envelope value from the troughs, the peaks and the interval time; and
calculating, from the troughs and the peaks, the number of wave edges formed by adjacent troughs and peaks; and
calculating, from the troughs, the difference between two adjacent troughs; and
calculating, from the peaks, the difference between two adjacent peaks;
step S22, each first node receiving the envelope value, the number of wave edges, the difference between two adjacent troughs and the difference between two adjacent peaks, determining whether they match its feature information, and outputting a determination result;
step S23, each second node determining, according to the determination results, whether the output signal matches its activated voice, and outputting the result.
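Steps S22 and S23 describe a two-layer match-and-vote structure. The sketch below assumes binary matching of the extracted parameter vector against each first node's stored feature information within a hypothetical tolerance, and assumes a second node fires only when all of its associated first nodes report a match; the claims do not specify either comparison rule:

```python
def hidden_layer(features, node_templates, tol=0.5):
    """Step S22 (sketch): each first node compares the extracted feature
    vector against its stored feature information and outputs 1 (match)
    or 0 (no match). `tol` is a hypothetical matching tolerance."""
    results = []
    for template in node_templates:
        matched = all(abs(f - t) <= tol for f, t in zip(features, template))
        results.append(1 if matched else 0)
    return results

def output_layer(results, required):
    """Step S23 (sketch): a second node declares its activated voice
    recognized when all first nodes it depends on report a match
    (assumed decision rule)."""
    return all(results[i] == 1 for i in required)
```

For example, a feature vector close to the first node's template but far from the second's yields the determination results `[1, 0]`, so only a second node that depends solely on the first node would report its activated voice as recognized.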
CN201910838612.0A 2019-09-05 2019-09-05 Construction device and construction method of self-learning voice recognition system Active CN110610710B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910838612.0A CN110610710B (en) 2019-09-05 2019-09-05 Construction device and construction method of self-learning voice recognition system
PCT/CN2020/109393 WO2021042969A1 (en) 2019-09-05 2020-08-14 Construction apparatus and construction method for self-learning speech recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910838612.0A CN110610710B (en) 2019-09-05 2019-09-05 Construction device and construction method of self-learning voice recognition system

Publications (2)

Publication Number Publication Date
CN110610710A true CN110610710A (en) 2019-12-24
CN110610710B CN110610710B (en) 2022-04-01

Family

ID=68892341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910838612.0A Active CN110610710B (en) 2019-09-05 2019-09-05 Construction device and construction method of self-learning voice recognition system

Country Status (2)

Country Link
CN (1) CN110610710B (en)
WO (1) WO2021042969A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021042969A1 (en) * 2019-09-05 2021-03-11 晶晨半导体(上海)股份有限公司 Construction apparatus and construction method for self-learning speech recognition system

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150301796A1 (en) * 2014-04-17 2015-10-22 Qualcomm Incorporated Speaker verification
CN105632486A (en) * 2015-12-23 2016-06-01 北京奇虎科技有限公司 Voice wake-up method and device of intelligent hardware
CN105741838A (en) * 2016-01-20 2016-07-06 百度在线网络技术(北京)有限公司 Voice wakeup method and voice wakeup device
CN107102713A (en) * 2016-02-19 2017-08-29 北京君正集成电路股份有限公司 It is a kind of to reduce the method and device of power consumption
US20180254041A1 (en) * 2016-04-11 2018-09-06 Sonde Health, Inc. System and method for activation of voice interactive services based on user state
CN108538305A (en) * 2018-04-20 2018-09-14 百度在线网络技术(北京)有限公司 Audio recognition method, device, equipment and computer readable storage medium
CN108922553A (en) * 2018-07-19 2018-11-30 苏州思必驰信息科技有限公司 Wave arrival direction estimating method and system for sound-box device
CN109166571A (en) * 2018-08-06 2019-01-08 广东美的厨房电器制造有限公司 Wake-up word training method, device and the household appliance of household appliance
CN109360585A (en) * 2018-12-19 2019-02-19 晶晨半导体(上海)股份有限公司 A kind of voice-activation detecting method
US20190214002A1 (en) * 2018-01-09 2019-07-11 Lg Electronics Inc. Electronic device and method of controlling the same

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11189273B2 (en) * 2017-06-29 2021-11-30 Amazon Technologies, Inc. Hands free always on near field wakeword solution
CN110610710B (en) * 2019-09-05 2022-04-01 晶晨半导体(上海)股份有限公司 Construction device and construction method of self-learning voice recognition system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JWU-SHENG HU, ET AL.: "Wake-up-word detection by estimating formants from spatial eigenspace information", 2012 IEEE International Conference on Mechatronics and Automation *
李燕诚 et al.: "Voice activity detection algorithm based on likelihood ratio test", Computer Engineering *


Also Published As

Publication number Publication date
CN110610710B (en) 2022-04-01
WO2021042969A1 (en) 2021-03-11

Similar Documents

Publication Publication Date Title
CN111223497B (en) Nearby wake-up method and device for terminal, computing equipment and storage medium
CN110473536B (en) Awakening method and device and intelligent device
CN111880856B (en) Voice wakeup method and device, electronic equipment and storage medium
CN106448663A (en) Voice wakeup method and voice interaction device
CN107767863A Voice wake-up method, system and intelligent terminal
CN105741838A (en) Voice wakeup method and voice wakeup device
CN110570873B (en) Voiceprint wake-up method and device, computer equipment and storage medium
CN110767231A (en) Voice control equipment awakening word identification method and device based on time delay neural network
US11295761B2 (en) Method for constructing voice detection model and voice endpoint detection system
WO2021098153A1 (en) Method, system, and electronic apparatus for detecting change of target user, and storage medium
CN111508493B (en) Voice wake-up method and device, electronic equipment and storage medium
CN104282307A (en) Method, device and terminal for awakening voice control system
CN110610710B (en) Construction device and construction method of self-learning voice recognition system
CN111312222A (en) Awakening and voice recognition model training method and device
CN104751227A (en) Method and system for constructing deep neural network
CN111429901A (en) IoT chip-oriented multi-stage voice intelligent awakening method and system
CN111179944B (en) Voice awakening and age detection method and device and computer readable storage medium
CN106612367A (en) Speech wake method based on microphone and mobile terminal
CN113782009A (en) Voice awakening system based on Savitzky-Golay filter smoothing method
CN111223489B (en) Specific keyword identification method and system based on Attention mechanism
WO2023010861A1 (en) Wake-up method, apparatus, device, and computer storage medium
CN111091819A (en) Voice recognition device and method, voice interaction system and method
CN109961804B (en) Intelligent equipment satisfaction evaluation method and device and storage medium
CN111192588A (en) System awakening method and device
CN204856459U Keyword voice wake-up system capable of distinguishing sound source position, and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant